Skip to content

fix: gate public skills in conversations#531

Closed
enyst wants to merge 2 commits into
mainfrom
openhands/fix-public-skills-211
Closed

fix: gate public skills in conversations#531
enyst wants to merge 2 commits into
mainfrom
openhands/fix-public-skills-211

Conversation

@enyst
Copy link
Copy Markdown
Member

@enyst enyst commented May 16, 2026

Summary

  • gate agent_context.load_public_skills behind shouldLoadPublicSkills() when starting conversations
  • keep public skills enabled by default so existing conversations retain current behavior
  • let deployers opt out with VITE_LOAD_PUBLIC_SKILLS=false, while the global Skills page still loads the public catalog
  • add regression coverage for the default-on and explicit-opt-out cases

Important: there is no eval risk because default behavior does not change, this only allows users to opt out of public skills if they want to.

Testing

  • npm run make-i18n
  • npx --yes vitest run tests/api/agent-server-config.test.ts tests/api/agent-server-adapter.test.ts tests/api/skills-service.test.ts
  • npm run typecheck

Refs #211

This PR was updated by an AI agent (OpenHands) on behalf of the user.

Co-authored-by: openhands <openhands@all-hands.dev>
@vercel
Copy link
Copy Markdown

vercel Bot commented May 16, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agent-canvas Ready Ready Preview, Comment May 16, 2026 9:45pm

Request Review

Copy link
Copy Markdown
Contributor

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taste Rating: 🟡 Acceptable

[EVAL RISK]
This PR changes agent runtime behavior by disabling public skills loading by default. While the code implementation is clean and well-tested, this could affect benchmark/evaluation performance if evals assume public skills are available.

[RISK ASSESSMENT]
⚠️ Risk Level: 🟡 MEDIUM

Behavior change affecting agent capabilities (skill availability). Code quality is high, testing is adequate, but impact on evals and existing workflows is uncertain. While this fixes issue #211 (unexpected public-skill loading), the downstream impact needs validation.

Recommendation: Do not auto-merge. Run lightweight evals to verify this doesn't regress agent performance before merging.

VERDICT:
⏸️ Needs validation - Code quality is solid and implementation is correct, but behavior change should be validated against evals before merging.

KEY INSIGHT:
Simple config flip with potentially broad behavioral impact - validate against evals first.


Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:

  1. Add a .agents/skills/custom-codereview-guide.md file to your branch (or edit it if one already exists) with the /codereview trigger and the context the reviewer is missing (e.g., "Eval risk concerns do not apply to this repository because we don't run benchmarks"). See the customization docs for the required frontmatter format.
  2. Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
  3. When your PR is merged, the guideline file goes through normal code review by repository maintainers.

Resolve with AI? Install the iterate skill in your agent and run /iterate to automatically drive this PR through CI, review, and QA until it's merge-ready.

@smolpaws
Copy link
Copy Markdown
Contributor

Nice cleanup — gating behind shouldLoadPublicSkills() instead of hardcoding is the right move.

One thought: should we default VITE_LOAD_PUBLIC_SKILLS to true rather than false? Right now this PR flips the default from "always on" to "off unless explicitly opted in", which means users who upgrade will silently lose public skills in conversations without knowing why. Defaulting to true keeps current behavior while still giving users the knob to turn it off.

@smolpaws
Copy link
Copy Markdown
Contributor

After digging through the #proj-agent-canvas Slack discussion (May 11–15), I think the default here should be flipped to true:

  1. Everyone wants skills loaded — multiple team members actively worked to fix skills not loading in conversations. PR fix(skills): populate slash menu and skills modal #362 was merged to make skills appear in the slash menu and conversation context.

  2. The real problem was performance, not whether to load — public skills were fetching from the extensions repo on every conversation start, which was slow. Hiep fixed this with SDK-side caching (software-agent-sdk#3253, merged).

  3. The open question is curation, not disabling — Rohit M asked (May 15): "do we want to cache all openhands extensions skills, or curate a few default skills?" The direction is toward a curated default set, not turning them off.

  4. Nobody in the discussion said "don't load public skills by default" — the consensus was: load them, make it fast, eventually curate which ones.

Defaulting to false would silently remove skills from conversations for all users who upgrade — the opposite of what the team has been working toward. The gating behind shouldLoadPublicSkills() is a good change (making it configurable), but the default should stay true to match current behavior and team intent.

See also my comment on #211 — the issue as originally filed may be outdated.

Co-authored-by: openhands <openhands@all-hands.dev>
Copy link
Copy Markdown
Member Author

enyst commented May 16, 2026

Addressed the feedback in 3be30b4: kept the new shouldLoadPublicSkills() gate, but flipped the default back to enabled so existing conversations retain current behavior and VITE_LOAD_PUBLIC_SKILLS=false remains the explicit opt-out.

This comment was created by an AI agent (OpenHands) on behalf of the user.

Copy link
Copy Markdown
Contributor

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taste Rating: 🟢 Good taste - Clean, simple change with comprehensive test coverage

[EVAL RISK CORRECTION]

The previous review mischaracterized the direction of this change. This PR enables public skills loading by default, not disables it:

  • Old behavior: When VITE_LOAD_PUBLIC_SKILLS is unset, shouldLoadPublicSkills() returns false (skills disabled)
  • New behavior: When VITE_LOAD_PUBLIC_SKILLS is unset, shouldLoadPublicSkills() returns true (skills enabled)

This makes public skills available by default, giving agents more capabilities rather than fewer. The opt-out mechanism (VITE_LOAD_PUBLIC_SKILLS=false) allows deployments to explicitly disable them if needed.

[CODE QUALITY]

✅ Implementation is correct and elegant
✅ Test coverage is comprehensive (default/enabled/disabled cases plus conversation payload)
✅ Documentation updated correctly in AGENTS.md
✅ Skills page correctly maintains load_public=true regardless of conversation setting (verified in tests)

No code quality issues identified.

[RISK ASSESSMENT]

⚠️ Risk Level: 🟡 MEDIUM

This PR changes agent runtime behavior by making public skills available by default. While the code quality is excellent, the change is well-tested, and the direction is additive (enabling features, not removing them), this affects agent capabilities and could impact benchmark/evaluation performance.

Recommendation: Run lightweight evals to verify that enabling public skills by default doesn't introduce unexpected behavior or performance regressions in benchmarks before merging.

VERDICT:

Code is merge-ready from a quality perspective, but needs eval verification due to agent capability changes.

KEY INSIGHT:

This fixes a likely unintentional default (skills disabled) rather than introducing a breaking change - enabling public skills by default makes the behavior match what most users would expect.

@enyst enyst requested a review from all-hands-bot May 16, 2026 21:49
Copy link
Copy Markdown
Contributor

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taste Rating: 🟡 Acceptable - Clean implementation with comprehensive tests, but behavior change requires human validation

[BEHAVIOR CHANGE CLARIFICATION]

The PR description states "default behavior does not change" but this is not accurate from a code perspective:

  • Old behavior: When VITE_LOAD_PUBLIC_SKILLS is unset → shouldLoadPublicSkills() returns false (disabled)
  • New behavior: When VITE_LOAD_PUBLIC_SKILLS is unset → shouldLoadPublicSkills() returns true (enabled)

This is a real default flip from opt-in to opt-out. If the claim is that "production deployments explicitly set this env var, so production behavior is unchanged," that should be verified.

[RISK ASSESSMENT]
⚠️ Risk Level: 🟡 MEDIUM

  • Behavior change affecting agent capabilities: Public skills availability changes for any environment that doesn't explicitly set the env var
  • Code quality is high: Implementation is clean, minimal, and well-tested
  • Eval impact unclear: If evaluations/benchmarks rely on the default (unset env var), they will see different agent behavior (more skills available)
  • Recommendation: Verify that production deployments and evals explicitly set VITE_LOAD_PUBLIC_SKILLS rather than relying on the default, OR run lightweight evals to confirm no performance regression from the new default

VERDICT:
⚠️ Needs validation: Code is merge-ready from a quality perspective, but the behavior change should be validated by a human maintainer. If evals/production explicitly set the env var, this is low risk. If they rely on the default, this could change agent performance.

KEY INSIGHT:
The implementation correctly flips the default from "disabled unless enabled" to "enabled unless disabled" - whether that's the right strategic choice depends on deployment context that I cannot verify from code alone.

@enyst enyst closed this May 16, 2026
@github-actions
Copy link
Copy Markdown
Contributor

📸 Snapshot Test Report

Warning

Snapshot comparison step crashed (timeout, OOM, or runner error) — diff results below may be incomplete or absent.
Check the CI logs for the full error output (look for the "Run snapshot comparison" step).

Warning

One or more snapshot tests crashed during generation — some snapshots below may be incomplete.
Check the CI logs for the full error output (look for the "Generate current PR snapshots" step).

❌ 22 snapshots differ from the main branch baselines. Add the update-snapshots label to acknowledge intentional changes.

Category Count
🔴 Changed 22
🆕 New 1
✅ Unchanged 50
Total 73

How to resolve:

  • Unintentional diffs — the baselines on main may have moved since this branch was created. Merge the latest main into this branch and re-run CI.
  • Intentional changes — add the update-snapshots label. CI will pass and the new screenshots become the baseline when this PR merges.
🔴 Changed snapshots (22)

automations — 4 snapshots

automations-delete-modal

Expected (main) Actual (PR) Diff
expected actual diff

automations-list-active-inactive

Expected (main) Actual (PR) Diff
expected actual diff

automations-no-automations

Expected (main) Actual (PR) Diff
expected actual diff

automations-search-no-results

Expected (main) Actual (PR) Diff
expected actual diff

backends-extended — 4 snapshots

backend-add-blank-disabled

Expected (main) Actual (PR) Diff
expected actual diff

backend-dropdown-two-backends

Expected (main) Actual (PR) Diff
expected actual diff

backend-edit-prefilled

Expected (main) Actual (PR) Diff
expected actual diff

backend-manage-two-listed

Expected (main) Actual (PR) Diff
expected actual diff

backends — 3 snapshots

backend-add-modal

Expected (main) Actual (PR) Diff
expected actual diff

backend-manage-modal

Expected (main) Actual (PR) Diff
expected actual diff

backend-selector-open

Expected (main) Actual (PR) Diff
expected actual diff

mcp-page — 3 snapshots

mcp-custom-server-1-editor-open

Expected (main) Actual (PR) Diff
expected actual diff

mcp-custom-server-editor

Expected (main) Actual (PR) Diff
expected actual diff

mcp-slack-install-2-modal

Expected (main) Actual (PR) Diff
expected actual diff

settings-page — 4 snapshots

analytics-consent-modal

Expected (main) Actual (PR) Diff
expected actual diff

home-screen

Expected (main) Actual (PR) Diff
expected actual diff

settings-app-page

Expected (main) Actual (PR) Diff
expected actual diff

settings-page

Expected (main) Actual (PR) Diff
expected actual diff

settings-secrets

secrets-list

Expected (main) Actual (PR) Diff
expected actual diff

settings-verification — 3 snapshots

condenser-settings

Expected (main) Actual (PR) Diff
expected actual diff

verification-settings-off

Expected (main) Actual (PR) Diff
expected actual diff

verification-settings-on

Expected (main) Actual (PR) Diff
expected actual diff
🆕 New snapshots (1)

These snapshots have no baseline on main and will become the new baseline once this PR merges.

settings-page

add-backend-modal

new snapshot

✅ Unchanged snapshots (50)

backends-extended

  • backend-add-cloud-no-key-disabled
  • backend-add-cloud-with-key-enabled
  • backend-add-form-partially-filled
  • backend-add-invalid-url-disabled
  • backend-add-kind-cloud-inferred
  • backend-add-kind-local-inferred
  • backend-add-local-ready
  • backend-add-manual-override-local
  • backend-add-name-only-disabled
  • backend-add-oauth-button-disabled
  • backend-add-oauth-button-enabled
  • backend-add-whitespace-host-disabled
  • backend-after-switch
  • backend-cancel-nothing-saved
  • backend-manage-after-removal
  • backend-remove-cancelled
  • backend-remove-confirmation
  • backend-switch-overlay

changes-tab

  • changes-deleted-file
  • changes-diff-viewer
  • changes-empty

collapsible-thinking

  • reasoning-content-collapsed
  • reasoning-content-expanded
  • think-action-collapsed
  • think-action-expanded

docker-workspace-browser

  • docker-projects-workspace-browser

mcp-page

  • mcp-custom-server-2-url-filled
  • mcp-custom-server-3-all-filled
  • mcp-custom-server-4-installed
  • mcp-empty-installed
  • mcp-search-filtered
  • mcp-slack-install-1-marketplace
  • mcp-slack-install-3-filled
  • mcp-slack-install-4-installed

onboarding

  • onboarding-step-0-choose-agent
  • onboarding-step-1-check-backend
  • onboarding-step-2-setup-llm
  • onboarding-step-3-say-hello

settings-secrets

  • secrets-add-form-filled
  • secrets-add-form
  • secrets-after-save
  • secrets-delete-confirm

sidebar

  • sidebar-collapsed
  • sidebar-conversation-panel
  • sidebar-filter-menu

skills-page

  • skills-empty
  • skills-loaded
  • skills-no-match
  • skills-search-filtered
  • skills-type-filter

Generated by the Snapshot Tests workflow. This comment was created by an AI agent (OpenHands) on behalf of the repo maintainers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants