Self-repair Phase 3: contradiction detection on save (#347) by rockfordlhotka · Pull Request #373 · MarimerLLC/rockbot

rockfordlhotka · 2026-05-08T21:01:03Z

Closes #347.

Summary

Hot-path keyword detector resolves conflicting beliefs at memory-write time, narrowly scoped to claim/capability/* and feedback/* (other categories pay zero cost).
New MemoryEntry.SupersededBy marker hides superseded entries from SearchAsync/recall while keeping them on disk for audit; GetAsync still returns by id.
Dream service gains a per-cycle LLM-mediated contradiction sweep as a backstop for cases the keyword detector misses, with a guard that refuses to supersede a user-correction with a non-correction.
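The visibility rule for superseded entries can be illustrated with a small Python sketch. This is not the project's .NET code: `InMemoryStore`, its method names, and the snake_case fields are hypothetical stand-ins for FileMemoryStore, SearchAsync/GetAsync, and MemoryEntry.SupersededBy, assumed only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class MemoryEntry:
    id: str
    category: str
    content: str
    # Id of the entry that replaced this one; None while the entry is current.
    superseded_by: Optional[str] = None


class InMemoryStore:
    """Toy stand-in for a file-backed store; it only illustrates visibility."""

    def __init__(self) -> None:
        self._entries: Dict[str, MemoryEntry] = {}

    def save(self, entry: MemoryEntry) -> None:
        self._entries[entry.id] = entry

    def get(self, entry_id: str) -> Optional[MemoryEntry]:
        # Lookup by id ignores the marker, so superseded entries stay auditable.
        return self._entries.get(entry_id)

    def search(self, category_prefix: str,
               include_superseded: bool = False) -> List[MemoryEntry]:
        # Superseded entries are hidden unless the caller opts in.
        return [
            e for e in self._entries.values()
            if e.category.startswith(category_prefix)
            and (include_superseded or e.superseded_by is None)
        ]


store = InMemoryStore()
store.save(MemoryEntry("old", "claim/capability/calendar",
                       "wrapper cannot pass arguments", superseded_by="new"))
store.save(MemoryEntry("new", "claim/capability/calendar",
                       "calendar wrapper does pass arguments"))
visible_ids = [e.id for e in store.search("claim/capability/")]
```

Keeping the marker on the entry rather than deleting the entry is what lets recall ignore stale beliefs while an audit, or the dream sweep, can still load them with the opt-in flag.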

Acceptance criteria (issue #347)

Saving "calendar wrapper does pass arguments" supersedes the older "wrapper cannot pass arguments" claim — Phase3ContradictionEndToEndTests.Acceptance1_*.
Existing user-correction memories displace conflicting agent-self memories on the next dream sweep — DreamServiceContradictionSweepTests.Acceptance2_*.
Contradiction check is measurably narrow — no impact on saves outside claim/capability/* and feedback/*. Verified by MemoryContradictionDetectorTests.ResolveAsync_OutsideScopedCategories_ReturnsNoneWithoutScanningStore and Phase3ContradictionEndToEndTests.Acceptance3_*.

Resolution rules

Newer wins by default: older entries get SupersededBy = <new id>.
User correction always wins regardless of recency: when an existing user correction conflicts with an incoming non-correction, the incoming entry is saved with SupersededBy = <existing id>. Identification: category prefix feedback/from-user/ or tag correction.
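The two rules above can be sketched as a single resolver. This is a minimal Python illustration, not the project's implementation: the `Entry` type and the `resolve` function are hypothetical, and the handling of a correction arriving against an older correction (newer wins) is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Entry:
    id: str
    category: str
    tags: Tuple[str, ...] = ()
    superseded_by: Optional[str] = None


def is_user_correction(entry: Entry) -> bool:
    # Identification rule from the PR: category prefix or the "correction" tag.
    return (entry.category.startswith("feedback/from-user/")
            or "correction" in entry.tags)


def resolve(incoming: Entry, contradicted: List[Entry]) -> None:
    """Apply SupersededBy markers in place.

    `contradicted` holds existing entries already known to oppose `incoming`.
    """
    if not is_user_correction(incoming):
        for existing in contradicted:
            if is_user_correction(existing):
                # An existing user correction beats a newer non-correction.
                incoming.superseded_by = existing.id
                return
    # Default rule: newer wins, so every contradicted entry is superseded.
    for existing in contradicted:
        existing.superseded_by = incoming.id
```

For example, saving an agent-authored rule against an existing entry under feedback/from-user/ leaves the older entry untouched and marks the new one superseded on arrival.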

Commits

d425b9b — base implementation: detector, storage, writer/tools wiring, dream sweep, directive, 21 unit/end-to-end tests.
94a8159 — bypass LLM extraction on scoped categories. Real-world test surfaced that MemoryTools.SaveMemory's LLM extraction was rewriting feedback/from-agent/... to agent-knowledge/..., defeating the scoping rule. Scoped categories are now contracts: saves under feedback/* or claim/capability/* skip extraction and persist verbatim. Tool description updated. 4 new tests.
e019525 — stem plural-s and lower Jaccard threshold to 0.3. Real-world test showed "report" vs "reports" plus a trailing rationale clause yielded overlap of 0.33, below the original 0.4. Naive trailing-s stripping for tokens of length 4+ collapses singular/plural; threshold drop absorbs trailing clauses. Existing tests still pass; added a realistic-phrasing regression.
9401834 — relax the multi-candidate skip. Every entry in the candidate list already has the inverse valence of the incoming entry (filtered by the loop), so multi-match is not ambiguous — it just means the same rule was saved more than once before a reversal. The detector now supersedes them all. User-correction protection still trumps. Replaced the obsolete "ambiguous defers to sweep" test; added a regression test confirming user-correction protection in the multi-candidate path.

Production validation (deployed to staging K8s)

Five live confirmations against rockylhotka/rockbot-agent:0.10.40:

Hot-path supersession on opposite-valence feedback save: MemoryTools: marked <old> superseded by <new> ✅
Scoped-category bypass preserves caller's category verbatim: scoped direct save ... (feedback/from-agent/status-reports, ...) ✅
No false positive on unrelated rule in same category: "Status reports should be timestamped" saved with superseded=no while the existing TL;DR contradiction stayed undisturbed ✅
Dream contradiction sweep pass loaded directive at startup and ran on cadence (early-exit log on empty corpus): DreamService: contradiction sweep — only 0 claim/feedback entry/entries; skipping ✅
Singular/plural overlap fix verified end-to-end (the very phrasing that originally failed) ✅

Out of scope

General-purpose contradiction detection across all memory categories — explicit non-goal in the design.

Behaviour change to note

This PR turns on the contradiction sweep dream pass by default (DreamOptions.ContradictionSweepEnabled = true). On the next dream cycle, it will scan claim/capability/* and feedback/* and may mark entries superseded based on the LLM's judgment. User-correction protection is enforced post-LLM, so a user correction can never be superseded by an agent-self entry.
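The post-LLM guard could look roughly like the following Python sketch. The source names ApplyContradictionSweepResultAsync but not its signature, so `apply_sweep_result`, the dictionary shape, and the `(superseded_id, winner_id)` proposal format are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def is_user_correction(entry: dict) -> bool:
    return (entry["category"].startswith("feedback/from-user/")
            or "correction" in entry.get("tags", []))


def apply_sweep_result(entries: Dict[str, dict],
                       proposals: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Apply LLM-proposed (superseded_id, winner_id) pairs, enforcing the guard.

    Returns the pairs that were actually applied.
    """
    applied = []
    for loser_id, winner_id in proposals:
        loser, winner = entries.get(loser_id), entries.get(winner_id)
        if loser is None or winner is None or loser_id == winner_id:
            continue  # skip ids that are not in the loaded corpus
        if is_user_correction(loser) and not is_user_correction(winner):
            continue  # guard: a non-correction never supersedes a user correction
        loser["superseded_by"] = winner_id
        applied.append((loser_id, winner_id))
    return applied
```

Because the check runs on the model's output rather than being left to the prompt, a wrong judgment from the LLM cannot override a user correction.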

Test plan

dotnet test RockBot.slnx — 1,733 passed, 0 failed (4 RabbitMQ + 11 Python integration tests skipped as expected).
CI: both build-and-test workflows green.
Live K8s validation against the real MemoryTools.SaveMemory LLM-callable surface.

🤖 Generated with Claude Code

Resolves conflicting beliefs at memory-write time, narrowly scoped to capability claims (claim/capability/*) and feedback memories (feedback/*). Saves outside those subtrees pay zero detection cost.

Hot path (deterministic, keyword-based):
- Capability claims: same (server, tool), opposite valence by negation marker set ("cannot|does not|blocked|not supported|...").
- Feedback: same rule subject (category subtree leaf), opposite directive (negation dominates affirmative when both keywords appear), Jaccard token overlap >= 0.4.
- User-tagged corrections (feedback/from-user/* or tag "correction") always win regardless of recency.
- Ambiguous matches defer to the dream sweep.

Storage:
- New MemoryEntry.SupersededBy init-only string?, round-trips through JSON serialisation.
- FileMemoryStore.SearchAsync hides superseded entries by default; MemorySearchCriteria.IncludeSuperseded opt-in for audit and the dream sweep itself. GetAsync still returns by id.

Wiring:
- CapabilityClaimWriter calls the detector before save and applies SupersededBy markers (or marks the incoming entry superseded when an existing user-correction wins).
- MemoryTools.SaveMemoryBackgroundAsync calls the detector only when the resolved entry's category starts with "feedback/", enforcing the narrow-scope rule on the LLM-callable save path.

Dream backstop:
- New per-cycle contradiction sweep pass loads only claim/capability/* and feedback/* entries and asks the LLM (via agent/contradiction-sweep.md) to flag remaining pairs. Refuses to supersede a user-correction with a non-correction. Application logic is extracted to ApplyContradictionSweepResultAsync for direct unit testing.

Tests: 21 new (detector unit, file store exclusion + round-trip, end-to-end acceptance for issue #347 criteria 1-3, dream sweep). Full suite: 1,728 passed, 0 failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
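The capability-claim rule in the hot path (same server and tool, opposite valence) can be sketched in a few lines of Python. This is an illustration only: the marker list below contains just the four markers quoted in the commit message (the real set is longer and elided there), and the `valence` and `contradicts` helpers and the dictionary fields are hypothetical.

```python
import re

# Only the markers quoted in the commit message; the source list is elided.
NEGATION_MARKERS = re.compile(
    r"\b(cannot|does not|blocked|not supported)\b", re.IGNORECASE)


def valence(text: str) -> str:
    # A claim is negative if any negation marker appears, affirmative otherwise.
    return "negative" if NEGATION_MARKERS.search(text) else "affirmative"


def contradicts(existing: dict, incoming: dict) -> bool:
    # Two claims conflict only when they describe the same (server, tool)
    # pair and carry opposite valence.
    same_target = ((existing["server"], existing["tool"])
                   == (incoming["server"], incoming["tool"]))
    return same_target and valence(existing["text"]) != valence(incoming["text"])


older = {"server": "calendar", "tool": "wrapper",
         "text": "wrapper cannot pass arguments"}
newer = {"server": "calendar", "tool": "wrapper",
         "text": "calendar wrapper does pass arguments"}
conflict = contradicts(older, newer)
```

Matching on whole marker phrases means "does pass" stays affirmative while "does not pass" would be negative, which is the distinction the first acceptance criterion relies on.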

Real-world test of #347 surfaced that MemoryTools.SaveMemory's LLM extraction pass freely rewrites the caller-supplied category. A save with category=feedback/from-agent/status-reports landed under agent-knowledge/status-reports, so the contradiction detector, gated on category.StartsWith("feedback/"), never ran.

Fix: when the caller supplies a scoped category (feedback/* or claim/capability/*), bypass extraction and persist the content verbatim under the supplied category and tags. Scoped categories are contracts, not hints. The contradiction detector still runs, so opposing-directive saves now correctly supersede prior entries.

Tool description updated to surface the reserved-category contract to the LLM. Non-scoped saves keep the existing extraction behaviour unchanged (regression test added).

Tests: 4 new MemoryToolsTests covering bypass, category preservation, supersession wiring, and the non-scoped regression. Full suite: 1,732 passed, 0 failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
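The bypass described in this commit amounts to a prefix check ahead of the extraction step. A minimal Python sketch, assuming hypothetical `extract` and `persist` callables in place of the real LLM extraction pass and store:

```python
from typing import Callable, Iterable

# Reserved subtrees named in the PR; saves under them are persisted verbatim.
SCOPED_PREFIXES = ("feedback/", "claim/capability/")


def is_scoped(category: str) -> bool:
    return category.startswith(SCOPED_PREFIXES)


def save_memory(content: str, category: str, tags: Iterable[str],
                extract: Callable[[str, str, list], dict],
                persist: Callable[[dict], None]) -> dict:
    tags = list(tags)
    if is_scoped(category):
        # Scoped categories are contracts: keep content, category, tags as given.
        entry = {"content": content, "category": category, "tags": tags}
    else:
        # Everything else keeps the extraction pass, which may recategorise.
        entry = extract(content, category, tags)
    persist(entry)
    return entry
```

With an extraction function that rewrites categories, a save under feedback/from-agent/status-reports keeps its category in this sketch, so a detector gated on the "feedback/" prefix would still see it.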

Real-world test surfaced that "report" vs "reports" with a trailing clause ("they should be concise without one") yielded a Jaccard overlap of 0.33, below the original 0.4 threshold, so the detector returned no contradiction even though the directives clearly oppose each other.

Two narrow adjustments:
- TokenizeNonStopwords strips a trailing 's' from tokens of length 4+ before hashing, collapsing report/reports, list/lists, etc. Naive but adequate for a rule-subject overlap signal; over-merging only improves recall on a comparison that is symmetric.
- MinFeedbackOverlap drops from 0.4 to 0.3 to absorb trailing rationale clauses that dilute the union without contributing to intersection.

Existing tests still pass: they used identical phrasings on either side, so they already had Jaccard well above either threshold. Added a realistic-phrasing regression test mirroring the deployed prompt.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
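The effect of the plural-s stripping on Jaccard overlap can be seen in a short Python sketch. The stopword list, the tokenising regex, and the two example sentences are assumptions chosen for illustration; they are not the project's TokenizeNonStopwords or the deployed prompt, so the numbers differ from the 0.33 reported above.

```python
import re

# Assumed stopword list; the project's actual list is not shown in the PR.
STOPWORDS = frozenset({"a", "an", "the", "be", "should", "they",
                       "to", "of", "and", "in", "with"})


def tokenize(text: str, stem: bool = True) -> set:
    tokens = set()
    for raw in re.findall(r"[a-z0-9]+", text.lower()):
        if raw in STOPWORDS:
            continue
        # Naive stemming: drop a trailing 's' from tokens of length 4 or more.
        if stem and len(raw) >= 4 and raw.endswith("s"):
            raw = raw[:-1]
        tokens.add(raw)
    return tokens


def jaccard(a: set, b: set) -> float:
    # |intersection| / |union|, defined as 0.0 when both sets are empty.
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


old_rule = "Always include a TL;DR in status reports"
new_rule = ("Never include a TL;DR in the status report; "
            "they should be concise without one")

unstemmed = jaccard(tokenize(old_rule, stem=False),
                    tokenize(new_rule, stem=False))  # 4 shared / 11 total
stemmed = jaccard(tokenize(old_rule), tokenize(new_rule))  # 5 shared / 10 total
```

In this example the unstemmed overlap is 4/11 and the stemmed overlap is 5/10: collapsing "reports" and "report" adds one token to the intersection and removes one from the union, while the trailing rationale clause still inflates the union either way, which is the dilution the lower threshold is meant to absorb.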

The "skip when more than one candidate matches" rule was overly defensive. Every entry in `contradicted` already shares the inverse valence of the incoming entry (that's how it got into the candidate list), so multiple matches are not ambiguous. They just mean the same affirmative rule was saved more than once before a single negative reversal arrived.

Real-world test surfaced this: a leftover stale "Always include TL;DR" entry plus a fresh "Always include TL;DR" matched a single "Never include TL;DR" save. The detector deferred to the dream sweep instead of doing the obvious thing: supersede both prior entries.

The change supersedes all candidates instead of skipping. The user-correction protection at the top of the resolver still trumps everything, so a correction among the candidates still wins. Genuine ambiguity (rule subjects that don't actually overlap) is filtered out earlier by the Jaccard + category-leaf gates.

Tests: replaced the now-obsolete "ambiguous defers to sweep" test with an "all same-valence matches are superseded" test, and added a regression test confirming user-correction protection still wins when one of the multiple candidates is a correction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rockfordlhotka and others added 4 commits May 8, 2026 16:00

rockfordlhotka merged commit 2d316fd into main May 8, 2026
2 checks passed

rockfordlhotka deleted the self-repair-phase-3-contradictions branch May 8, 2026 23:18

rockfordlhotka merged 4 commits into main from self-repair-phase-3-contradictions
