fix: gate public skills in conversations by enyst · Pull Request #531 · OpenHands/agent-canvas

enyst · 2026-05-16T21:31:37Z

Summary

gate agent_context.load_public_skills behind shouldLoadPublicSkills() when starting conversations
keep public skills enabled by default so existing conversations retain current behavior
let deployers opt out with VITE_LOAD_PUBLIC_SKILLS=false, while the global Skills page still loads the public catalog
add regression coverage for the default-on and explicit-opt-out cases

Important: there is no eval risk because default behavior does not change, this only allows users to opt out of public skills if they want to.

Testing

npm run make-i18n
npx --yes vitest run tests/api/agent-server-config.test.ts tests/api/agent-server-adapter.test.ts tests/api/skills-service.test.ts
npm run typecheck

Refs #211

This PR was updated by an AI agent (OpenHands) on behalf of the user.

Co-authored-by: openhands <openhands@all-hands.dev>

vercel · 2026-05-16T21:31:42Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agent-canvas	Ready	Preview, Comment	May 16, 2026 9:45pm

all-hands-bot

Taste Rating: 🟡 Acceptable

[EVAL RISK]
This PR changes agent runtime behavior by disabling public skills loading by default. While the code implementation is clean and well-tested, this could affect benchmark/evaluation performance if evals assume public skills are available.

[RISK ASSESSMENT]
⚠️ Risk Level: 🟡 MEDIUM

Behavior change affecting agent capabilities (skill availability). Code quality is high, testing is adequate, but impact on evals and existing workflows is uncertain. While this fixes issue #211 (unexpected public-skill loading), the downstream impact needs validation.

Recommendation: Do not auto-merge. Run lightweight evals to verify this doesn't regress agent performance before merging.

VERDICT:
⏸️ Needs validation - Code quality is solid and implementation is correct, but behavior change should be validated against evals before merging.

KEY INSIGHT:
Simple config flip with potentially broad behavioral impact - validate against evals first.

Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:

Add a .agents/skills/custom-codereview-guide.md file to your branch (or edit it if one already exists) with the /codereview trigger and the context the reviewer is missing (e.g., "Eval risk concerns do not apply to this repository because we don't run benchmarks"). See the customization docs for the required frontmatter format.

Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.

When your PR is merged, the guideline file goes through normal code review by repository maintainers.

Resolve with AI? Install the iterate skill in your agent and run /iterate to automatically drive this PR through CI, review, and QA until it's merge-ready.

smolpaws · 2026-05-16T21:35:41Z

Nice cleanup — gating behind shouldLoadPublicSkills() instead of hardcoding is the right move.

One thought: should we default VITE_LOAD_PUBLIC_SKILLS to true rather than false? Right now this PR flips the default from "always on" to "off unless explicitly opted in", which means users who upgrade will silently lose public skills in conversations without knowing why. Defaulting to true keeps current behavior while still giving users the knob to turn it off.

smolpaws · 2026-05-16T21:42:22Z

After digging through the #proj-agent-canvas Slack discussion (May 11–15), I think the default here should be flipped to true:

Everyone wants skills loaded — multiple team members actively worked to fix skills not loading in conversations. PR fix(skills): populate slash menu and skills modal #362 was merged to make skills appear in the slash menu and conversation context.
The real problem was performance, not whether to load — public skills were fetching from the extensions repo on every conversation start, which was slow. Hiep fixed this with SDK-side caching (software-agent-sdk#3253, merged).
The open question is curation, not disabling — Rohit M asked (May 15): "do we want to cache all openhands extensions skills, or curate a few default skills?" The direction is toward a curated default set, not turning them off.
Nobody in the discussion said "don't load public skills by default" — the consensus was: load them, make it fast, eventually curate which ones.

Defaulting to false would silently remove skills from conversations for all users who upgrade — the opposite of what the team has been working toward. The gating behind shouldLoadPublicSkills() is a good change (making it configurable), but the default should stay true to match current behavior and team intent.

See also my comment on #211 — the issue as originally filed may be outdated.

Co-authored-by: openhands <openhands@all-hands.dev>

enyst · 2026-05-16T21:45:01Z

Addressed the feedback in 3be30b4: kept the new shouldLoadPublicSkills() gate, but flipped the default back to enabled so existing conversations retain current behavior and VITE_LOAD_PUBLIC_SKILLS=false remains the explicit opt-out.

This comment was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot

Taste Rating: 🟢 Good taste - Clean, simple change with comprehensive test coverage

[EVAL RISK CORRECTION]

The previous review mischaracterized the direction of this change. This PR enables public skills loading by default, not disables it:

Old behavior: When VITE_LOAD_PUBLIC_SKILLS is unset, shouldLoadPublicSkills() returns false (skills disabled)
New behavior: When VITE_LOAD_PUBLIC_SKILLS is unset, shouldLoadPublicSkills() returns true (skills enabled)

This makes public skills available by default, giving agents more capabilities rather than fewer. The opt-out mechanism (VITE_LOAD_PUBLIC_SKILLS=false) allows deployments to explicitly disable them if needed.

[CODE QUALITY]

✅ Implementation is correct and elegant
✅ Test coverage is comprehensive (default/enabled/disabled cases plus conversation payload)
✅ Documentation updated correctly in AGENTS.md
✅ Skills page correctly maintains load_public=true regardless of conversation setting (verified in tests)

No code quality issues identified.

[RISK ASSESSMENT]

⚠️ Risk Level: 🟡 MEDIUM

This PR changes agent runtime behavior by making public skills available by default. While the code quality is excellent, the change is well-tested, and the direction is additive (enabling features, not removing them), this affects agent capabilities and could impact benchmark/evaluation performance.

Recommendation: Run lightweight evals to verify that enabling public skills by default doesn't introduce unexpected behavior or performance regressions in benchmarks before merging.

VERDICT:

✅ Code is merge-ready from a quality perspective, but needs eval verification due to agent capability changes.

KEY INSIGHT:

This fixes a likely unintentional default (skills disabled) rather than introducing a breaking change - enabling public skills by default makes the behavior match what most users would expect.

all-hands-bot

Taste Rating: 🟡 Acceptable - Clean implementation with comprehensive tests, but behavior change requires human validation

[BEHAVIOR CHANGE CLARIFICATION]

The PR description states "default behavior does not change" but this is not accurate from a code perspective:

Old behavior: When VITE_LOAD_PUBLIC_SKILLS is unset → shouldLoadPublicSkills() returns false (disabled)
New behavior: When VITE_LOAD_PUBLIC_SKILLS is unset → shouldLoadPublicSkills() returns true (enabled)

This is a real default flip from opt-in to opt-out. If the claim is that "production deployments explicitly set this env var, so production behavior is unchanged," that should be verified.

[RISK ASSESSMENT]
⚠️ Risk Level: 🟡 MEDIUM

Behavior change affecting agent capabilities: Public skills availability changes for any environment that doesn't explicitly set the env var
Code quality is high: Implementation is clean, minimal, and well-tested
Eval impact unclear: If evaluations/benchmarks rely on the default (unset env var), they will see different agent behavior (more skills available)
Recommendation: Verify that production deployments and evals explicitly set VITE_LOAD_PUBLIC_SKILLS rather than relying on the default, OR run lightweight evals to confirm no performance regression from the new default

VERDICT:
⚠️ Needs validation: Code is merge-ready from a quality perspective, but the behavior change should be validated by a human maintainer. If evals/production explicitly set the env var, this is low risk. If they rely on the default, this could change agent performance.

KEY INSIGHT:
The implementation correctly flips the default from "disabled unless enabled" to "enabled unless disabled" - whether that's the right strategic choice depends on deployment context that I cannot verify from code alone.

github-actions · 2026-05-16T22:06:39Z

📸 Snapshot Test Report

Warning

Snapshot comparison step crashed (timeout, OOM, or runner error) — diff results below may be incomplete or absent.
Check the CI logs for the full error output (look for the "Run snapshot comparison" step).

Warning

One or more snapshot tests crashed during generation — some snapshots below may be incomplete.
Check the CI logs for the full error output (look for the "Generate current PR snapshots" step).

❌ 22 snapshots differ from the main branch baselines. Add the update-snapshots label to acknowledge intentional changes.

Category	Count
🔴 Changed	22
🆕 New	1
✅ Unchanged	50
Total	73

How to resolve:

Unintentional diffs — the baselines on main may have moved since this branch was created. Merge the latest main into this branch and re-run CI.

Intentional changes — add the update-snapshots label. CI will pass and the new screenshots become the baseline when this PR merges.

🔴 Changed snapshots (22)