Skill asset promotion: full 5-phase implementation (#370) #376

Open
rockfordlhotka wants to merge 7 commits into main from feature/skill-asset-promotion

@rockfordlhotka
Member

Closes #370.

Implements the full design from design/skill-asset-promotion.md (PR #369). All five phases land in this PR per the plan; commits are split atomically per phase.

Summary

The agent's self-improvement loop is no longer asymmetric. Subagents now capture the working artifacts they converge on (wisp definitions, scripts, schemas) as typed skill resources via a new promote_skill_asset tool. The dream cycle has a symmetric success-shaped pass that promotes repeating successful patterns autonomously. Provisional resources self-validate or self-evict based on observed use.

  • Phase 1: WispExecutionRecord.DefinitionBody + IWispExecutionLog.GetCanonicalBodyAsync make a successful wisp's JSON recoverable for promotion. Bodies are size-capped at 8 KiB; failures and oversize runs record null with a diagnostic flag.
  • Phase 2a: SkillResource gains Provisional, CreatedAt, VerifyHint, DefinitionHash. ISkillStore gains additive AttachResourceAsync / RemoveResourceAsync / UpdateResourceMetadataAsync so a single resource can be added or mutated without disturbing the rest of the manifest.
  • Phase 2b: SkillTools.PromoteSkillAsset is gated behind a new enablePromote ctor flag, set only by SubagentRunner. The main agent does not see the tool. subagent-directives.md describes when to use it. FormatResourceTag shows [Wisp*] when any entry of that type is provisional.
  • Phase 3: RunWispSuccessAnalysisPassAsync groups recent records by definition hash, keeps groups with frequency >= 3 && successRate == 1.0, resolves the invoking skill via ISkillUsageStore, fetches the canonical body via Phase 1, and asks the LLM to choose promotions. Promotions land non-provisional (observed-repetition validated). New wisp-success-dream.md directive plus a built-in fallback.
  • Phase 4a: Removes the dead-end promotionCandidates log loop on the failure pass. The success pass is now the single source of truth for promotions.
  • Phase 4b: RepairTarget.SkillResource + SkillResourceApplier extend self-repair to mutate skill manifests as a first-class target. Three ops (attach, delete, demote-provisional), each with a revert callback so verify failures roll back cleanly.
  • Phase 5: ISkillResourceUsageStore records resource-checkout events. RunProvisionalValidationPassAsync uses both wisp-record cross-reference (by DefinitionHash) and checkout events to flip provisional resources to validated after distinct-session success, remove them after consecutive failure (with a FailureClusterStore entry), and prefix [stale] to old unused ones. Decision logic is extracted as DreamService.DecideProvisionalAction for testability.

Test plan

  • dotnet build RockBot.slnx clean (no warnings on changed code)
  • dotnet test RockBot.slnx clean — 1808 tests pass, 0 fail (15 skipped: RabbitMQ + Scripts.Container integration)
  • Phase 1: WispExecutionLogTests — 16 tests, including new GetCanonicalBodyAsync round-trip and oversize-body handling
  • Phase 2a: FileSkillStoreTests — 57 tests, 8 new for AttachResourceAsync / RemoveResourceAsync / UpdateResourceMetadataAsync
  • Phase 2b: SkillToolsTests — 35 tests, 10 new for PromoteSkillAsset, enablePromote gating, and FormatResourceTag provisional marker
  • Phase 3: DreamServiceWispSuccessTests — 6 tests for ApplyWispSuccessPromotionsAsync decision boundaries
  • Phase 4b: SkillResourceApplierTests — 10 tests for attach / delete / demote-provisional + revert round-trips
  • Phase 5: ProvisionalValidationPassTests — 11 tests for DecideProvisionalAction thresholds + FileSkillResourceUsageStore round-trip
  • Live cluster smoke test (deploy + watch one dream cycle) — recommended post-merge per the build-deploy workflow in MEMORY.md; not blocking the PR

Notes

  • The BuiltInWispFailureDirective constant in DreamService.cs and the on-disk wisp-failure-dream.md both have promotionCandidates removed (Phase 4a), so they stay in sync.
  • The WispPromotionCandidateDto record is deleted (no consumers remain after Phase 4a).
  • FileSkillStore.SaveAsync(skill, resources) (the bulk-replace path) now preserves Provisional and VerifyHint and stamps CreatedAt + DefinitionHash on every entry it writes. This means the existing patrol/wisp-mcp-params skill, currently the only one with a populated manifest, will gain CreatedAt + DefinitionHash the next time it's resaved, with no migration required.

🤖 Generated with Claude Code

rockfordlhotka and others added 7 commits May 9, 2026 01:10
Add DefinitionBody and BodyOmittedTooLarge to WispExecutionRecord so the
success-shaped dream pass can recover the JSON of a repeating successful
wisp for promotion to a skill resource. Bodies are retained only on
success and only when they fit an 8 KiB cap; failures and oversize runs
record null with the diagnostic flag.

Add IWispExecutionLog.GetCanonicalBodyAsync(definitionHash) returning
the earliest non-null body for a given hash so dedup-by-hash is cheap
and subsequent retries do not need to re-store identical bodies.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
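
As a rough illustration of the retention rule and lookup described in the commit above, here is a minimal Python sketch. It is not the PR's C# code. `WispRecord`, `make_record`, and `get_canonical_body` are hypothetical names. The sketch also assumes that the 8 KiB cap is measured in UTF-8 bytes, that the log is ordered oldest first, and that the flag is set only for oversize runs.

```python
from dataclasses import dataclass
from typing import Optional

MAX_BODY_BYTES = 8 * 1024  # 8 KiB cap from the commit message


@dataclass
class WispRecord:
    definition_hash: str
    succeeded: bool
    definition_body: Optional[str] = None
    body_omitted_too_large: bool = False


def make_record(definition_hash: str, body: str, succeeded: bool) -> WispRecord:
    """Keep the body only on success and only when it fits the cap."""
    if not succeeded:
        # Failures never retain a body.
        return WispRecord(definition_hash, False)
    if len(body.encode("utf-8")) > MAX_BODY_BYTES:
        # Oversize: drop the body and raise the diagnostic flag (assumed behaviour).
        return WispRecord(definition_hash, True, None, True)
    return WispRecord(definition_hash, True, body)


def get_canonical_body(log: list[WispRecord], definition_hash: str) -> Optional[str]:
    """Return the earliest non-null body for the hash (log is oldest first)."""
    for record in log:
        if record.definition_hash == definition_hash and record.definition_body is not None:
            return record.definition_body
    return None
```

Because the lookup returns the first stored body for a hash, later records with the same hash could skip storing the body without changing the result.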
…Async (#370)

Extend SkillResource with Provisional, CreatedAt, VerifyHint, and
DefinitionHash so the validation pass can distinguish trust levels,
preserve advisory exercise notes, and cross-reference resources against
the wisp execution log by content hash.

Add ISkillStore.AttachResourceAsync (additive single-resource add),
RemoveResourceAsync, and UpdateResourceMetadataAsync. The 2-arg SaveAsync
now preserves Provisional/VerifyHint from the input and stamps
CreatedAt/DefinitionHash on every entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add SkillTools.PromoteSkillAsset gated behind a new enablePromote ctor
flag — currently set only by SubagentRunner. The main agent does not see
the tool because it does not perform the exploratory discovery whose
result is worth capturing as a typed asset; it consumes assets the dream
pass has already promoted.

The tool computes a content hash matching the wisp execution log's
definition-hash scheme and writes the manifest entry as provisional with
CreatedAt and an optional VerifyHint, leaving validation to the
dream-cycle pass.

FormatResourceTag now appends a trailing * to types that have at least
one provisional entry, so the LLM sees the trust gradient at a glance
in the skill index.

The subagent directive describes when and how to call promote_skill_asset
as the symmetric complement to skill-prose tightening.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
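
The provisional marker described in the commit above can be sketched in a few lines of Python. This is an illustration only, not the C# FormatResourceTag. The function name, the `(type, provisional)` input shape, and the comma-separated layout for several types are assumptions; the source confirms only that a type shows as `[Wisp*]` when any of its entries is provisional.

```python
def format_resource_tag(resources: list[tuple[str, bool]]) -> str:
    """Build a tag such as "[Wisp*, Python]" from (type, provisional) pairs."""
    provisional_by_type: dict[str, bool] = {}
    for resource_type, provisional in resources:
        # A type is starred as soon as any one of its entries is provisional.
        provisional_by_type[resource_type] = (
            provisional_by_type.get(resource_type, False) or provisional
        )
    if not provisional_by_type:
        return ""
    labels = [name + ("*" if starred else "") for name, starred in provisional_by_type.items()]
    return "[" + ", ".join(labels) + "]"
```

For example, two Wisp entries (one provisional) plus one validated Python entry would render as `[Wisp*, Python]` under these assumptions.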
Add RunWispSuccessAnalysisPassAsync as the symmetric complement to the
failure pass. It groups recent wisp records by definition hash, keeps
groups that succeeded repeatedly with no failures, resolves the invoking
skill via ISkillUsageStore, fetches the canonical body via Phase 1's
GetCanonicalBodyAsync, and asks the LLM to choose promotions.

Promotions land non-provisional via AttachResourceAsync because the
dream pass operates on observed repetition rather than a hypothesis —
the in-session promote_skill_asset path is the one that lands provisional.

The apply loop is extracted as an internal static helper so unit tests
can drive the attach logic without standing up the LLM.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
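
The grouping step in the commit above, combined with the thresholds stated in the PR summary (frequency >= 3 and a success rate of 1.0), can be sketched as follows. This is illustrative Python rather than the PR's C# code; `select_promotion_candidates` and the `(definition_hash, succeeded)` record shape are hypothetical, and the skill resolution and LLM steps are left out.

```python
from collections import defaultdict

MIN_FREQUENCY = 3  # threshold from the PR summary


def select_promotion_candidates(records: list[tuple[str, bool]]) -> list[str]:
    """Return definition hashes seen at least MIN_FREQUENCY times with no failures."""
    groups: dict[str, list[bool]] = defaultdict(list)
    for definition_hash, succeeded in records:
        groups[definition_hash].append(succeeded)
    # all(outcomes) is equivalent to a success rate of exactly 1.0.
    return [
        definition_hash
        for definition_hash, outcomes in groups.items()
        if len(outcomes) >= MIN_FREQUENCY and all(outcomes)
    ]
```

A single failure disqualifies a group regardless of how often it ran, which matches the "no failures" wording of the commit message.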
…370)

The promotionCandidates field on the failure-pass output was advisory-only:
the receiver in DreamService just logged each entry without calling
SaveSkill. Now that Phase 3 ships a real success-shaped dream pass that
actually attaches resources, the dead end is removed: the field comes
out of the directive (file + built-in fallback), the DTO is deleted,
and the trailing log loop is gone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…370)

Extend the self-repair RepairTarget enum with SkillResource so the
dream-cycle apply pass can mutate skill manifests as a first-class
target, and add SkillResourceApplier with three ops:

- attach: add a provisional resource (revert removes it, or restores
  the prior body if one existed at the same filename)
- delete: remove a resource (revert restores body + manifest entry)
- demote-provisional: flip Provisional=true on an existing entry,
  used when the dream pass loses confidence in a previously validated
  resource (revert restores Provisional=false)

Both attach and delete capture pre-state at apply time so the verify
phase can roll back without a second LLM round-trip. demote-provisional
is a no-op when the entry is already provisional.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
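
The apply-then-revert pattern from the commit above could look roughly like this in Python. It is a sketch under stated assumptions, not the SkillResourceApplier itself: the manifest is modelled as a plain in-memory dict, all function names are hypothetical, and each operation returns a callback that restores the state captured at apply time.

```python
from typing import Callable, Optional

Manifest = dict[str, dict]  # filename -> {"body": str, "provisional": bool}
Revert = Callable[[], None]


def _restore(manifest: Manifest, filename: str, prior: Optional[dict]) -> Revert:
    """Build a callback that puts `filename` back to its captured pre-state."""
    def revert() -> None:
        if prior is None:
            manifest.pop(filename, None)   # nothing existed before: remove it
        else:
            manifest[filename] = dict(prior)  # restore prior body and flags
    return revert


def attach(manifest: Manifest, filename: str, body: str) -> Revert:
    """Add a provisional resource, remembering any entry it overwrites."""
    prior = dict(manifest[filename]) if filename in manifest else None
    manifest[filename] = {"body": body, "provisional": True}
    return _restore(manifest, filename, prior)


def delete(manifest: Manifest, filename: str) -> Revert:
    """Remove a resource, remembering it so the revert can bring it back."""
    prior = dict(manifest[filename]) if filename in manifest else None
    manifest.pop(filename, None)
    return _restore(manifest, filename, prior)


def demote_provisional(manifest: Manifest, filename: str) -> Revert:
    """Flip a validated entry back to provisional."""
    if manifest[filename]["provisional"]:
        return lambda: None  # already provisional: nothing to do or undo
    manifest[filename]["provisional"] = True

    def revert() -> None:
        manifest[filename]["provisional"] = False
    return revert
```

Capturing the prior entry inside the closure is what lets a failed verify step roll back locally, without recomputing anything.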
Add the dream-cycle pass that turns Phase-2 provisional resources into
validated ones, removes the broken ones, and de-emphasises the stale
ones — closing the trust gradient.

Decision rule (DecideProvisionalAction, internal-static for testing):
- Distinct-session successful wisp records sharing the resource's
  DefinitionHash >= 3 → flip Provisional=false, keep VerifyHint per
  user pref.
- Most-recent N wisp records all failures → remove resource and record
  a FailureClusterStore entry keyed (skill-resource, skillName, filename).
- Resource older than 30d with zero activity → prefix description
  with "[stale]" so the LLM stops loading it (body kept on disk).
- Otherwise → keep.

For non-wisp resources (Python, JsonSchema), distinct-session checkouts
recorded by the new ISkillResourceUsageStore (FileSkillResourceUsageStore
JSONL) substitute as a soft positive signal. SkillTools.GetSkillResource
fire-and-forgets a checkout event when the optional store is wired and
a sessionId is in scope; SubagentRunner now wires it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
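
The decision rule listed in the commit above can be read as a small pure function. The Python below is a hedged sketch, not DreamService.DecideProvisionalAction: the names and input shapes are hypothetical, the rules are assumed to be checked in the order listed, "zero activity" is modelled as having no matching wisp records (checkout events are omitted), and the value of N is not stated in the source, so it is a required `failure_window` parameter.

```python
from datetime import datetime, timedelta
from enum import Enum

VALIDATE_MIN_SESSIONS = 3          # ">= 3" from the commit message
STALE_AFTER = timedelta(days=30)   # "older than 30d"


class Action(Enum):
    VALIDATE = "validate"
    REMOVE = "remove"
    MARK_STALE = "mark-stale"
    KEEP = "keep"


def decide_provisional_action(
    success_sessions: set[str],
    recent_outcomes: list[bool],
    created_at: datetime,
    now: datetime,
    failure_window: int,
) -> Action:
    """Apply the four rules in the order the commit message lists them.

    success_sessions: session ids with a successful matching wisp record.
    recent_outcomes: success flags for matching records, newest first.
    failure_window: the "N" in "most-recent N"; its real value is not stated.
    """
    if len(success_sessions) >= VALIDATE_MIN_SESSIONS:
        return Action.VALIDATE
    window = recent_outcomes[:failure_window]
    # Require a full window so a single early failure does not evict the resource.
    if failure_window > 0 and len(window) == failure_window and not any(window):
        return Action.REMOVE
    if not recent_outcomes and now - created_at > STALE_AFTER:
        return Action.MARK_STALE
    return Action.KEEP
```

Keeping the rule free of I/O is what makes threshold tests cheap: each boundary can be exercised by passing plain values, with no store or LLM involved.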

Development

Successfully merging this pull request may close these issues.

Skill asset promotion: capture working wisps as typed skill resources
