Self-repair Phase 3: contradiction detection on save (#347)#373
Merged
Conversation
Resolves conflicting beliefs at memory-write time, narrowly scoped to
capability claims (claim/capability/*) and feedback memories (feedback/*).
Saves outside those subtrees pay zero detection cost.
Hot path (deterministic, keyword-based):
- Capability claims: same (server, tool), opposite valence by negation
marker set ("cannot|does not|blocked|not supported|...").
- Feedback: same rule subject (category subtree leaf), opposite directive
(negation dominates affirmative when both keywords appear), Jaccard
token overlap >= 0.4.
- User-tagged corrections (feedback/from-user/* or tag "correction")
always win regardless of recency.
- Ambiguous matches defer to the dream sweep.
Storage:
- New MemoryEntry.SupersededBy init-only string?, round-trips through
JSON serialisation.
- FileMemoryStore.SearchAsync hides superseded entries by default;
MemorySearchCriteria.IncludeSuperseded opt-in for audit and the dream
sweep itself. GetAsync still returns by id.
Wiring:
- CapabilityClaimWriter calls the detector before save and applies
SupersededBy markers (or marks the incoming entry superseded when an
existing user-correction wins).
- MemoryTools.SaveMemoryBackgroundAsync calls the detector only when the
resolved entry's category starts with "feedback/", enforcing the
narrow-scope rule on the LLM-callable save path.
Dream backstop:
- New per-cycle contradiction sweep pass loads only claim/capability/*
and feedback/* entries and asks the LLM (via agent/contradiction-sweep.md)
to flag remaining pairs. Refuses to supersede a user-correction with a
non-correction. Application logic is extracted to
ApplyContradictionSweepResultAsync for direct unit testing.
Tests: 21 new (detector unit, file store exclusion + round-trip, end-to-end
acceptance for issue #347 criteria 1-3, dream sweep). Full suite: 1,728
passed, 0 failed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Real-world test of #347 surfaced that MemoryTools.SaveMemory's LLM extraction pass freely rewrites the caller-supplied category. A save with category=feedback/from-agent/status-reports landed under agent-knowledge/status-reports, so the contradiction detector — gated on category.StartsWith("feedback/") — never ran. Fix: when the caller supplies a scoped category (feedback/* or claim/capability/*), bypass extraction and persist the content verbatim under the supplied category and tags. Scoped categories are contracts, not hints. The contradiction detector still runs, so opposing-directive saves now correctly supersede prior entries. Tool description updated to surface the reserved-category contract to the LLM. Non-scoped saves keep the existing extraction behaviour unchanged (regression test added). Tests: 4 new MemoryToolsTests covering bypass, category preservation, supersession wiring, and the non-scoped regression. Full suite: 1,732 passed, 0 failed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Real-world test surfaced that "report" vs "reports" with a trailing
clause ("they should be concise without one") yielded a Jaccard overlap
of 0.33 — below the original 0.4 threshold — so the detector returned
no contradiction even though the directives clearly oppose each other.
Two narrow tightenings:
- TokenizeNonStopwords strips a trailing 's' from tokens of length 4+
before hashing, collapsing report/reports, list/lists, etc. Naive
but adequate for a rule-subject overlap signal; over-merging only
improves recall on a comparison that is symmetric.
- MinFeedbackOverlap drops from 0.4 to 0.3 to absorb trailing rationale
clauses that dilute the union without contributing to intersection.
Existing tests still pass — they used identical phrasings on either
side so already had Jaccard well above either threshold. Added a
realistic-phrasing regression test mirroring the deployed prompt.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The "skip when more than one candidate matches" rule was overly defensive. Every entry in `contradicted` already shares the inverse valence of the incoming entry — that's how it got into the candidate list — so multiple matches are not ambiguous. They just mean the same affirmative rule was saved more than once before a single negative reversal arrived. Real-world test surfaced this: a leftover stale "Always include TL;DR" entry plus a fresh "Always include TL;DR" matched a single "Never include TL;DR" save. The detector deferred to the dream sweep instead of doing the obvious thing — supersede both prior entries. The change supersedes all candidates instead of skipping. The user- correction protection at the top of the resolver still trumps everything, so a correction among the candidates still wins. Genuine ambiguity (rule subjects that don't actually overlap) is filtered out earlier by the Jaccard + category-leaf gates. Tests: replaced the now-obsolete "ambiguous defers to sweep" test with a "all same-valence matches are superseded" test, and added a regression test confirming user-correction protection still wins when one of the multiple candidates is a correction. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #347.
Summary
claim/capability/*andfeedback/*(other categories pay zero cost).MemoryEntry.SupersededBymarker hides superseded entries fromSearchAsync/recall while keeping them on disk for audit;GetAsyncstill returns by id.Acceptance criteria (issue #347)
Phase3ContradictionEndToEndTests.Acceptance1_*.DreamServiceContradictionSweepTests.Acceptance2_*.claim/capability/*andfeedback/*. Verified byMemoryContradictionDetectorTests.ResolveAsync_OutsideScopedCategories_ReturnsNoneWithoutScanningStoreandPhase3ContradictionEndToEndTests.Acceptance3_*.Resolution rules
SupersededBy = <new id>.SupersededBy = <existing id>. Identification: category prefixfeedback/from-user/or tagcorrection.Commits
d425b9b— base implementation: detector, storage, writer/tools wiring, dream sweep, directive, 21 unit/end-to-end tests.94a8159— bypass LLM extraction on scoped categories. Real-world test surfaced thatMemoryTools.SaveMemory's LLM extraction was rewritingfeedback/from-agent/...toagent-knowledge/..., defeating the scoping rule. Scoped categories are now contracts: saves underfeedback/*orclaim/capability/*skip extraction and persist verbatim. Tool description updated. 4 new tests.e019525— stem plural-s and lower Jaccard threshold to 0.3. Real-world test showed "report" vs "reports" plus a trailing rationale clause yielded overlap of 0.33, below the original 0.4. Naive trailing-sstripping for tokens of length 4+ collapses singular/plural; threshold drop absorbs trailing clauses. Existing tests still pass; added a realistic-phrasing regression.9401834— relax the multi-candidate skip. Every entry in the candidate list already has the inverse valence of the incoming entry (filtered by the loop), so multi-match is not ambiguous — it just means the same rule was saved more than once before a reversal. The detector now supersedes them all. User-correction protection still trumps. Replaced the obsolete "ambiguous defers to sweep" test; added a regression test confirming user-correction protection in the multi-candidate path.Production validation (deployed to staging K8s)
Five live confirmations against
rockylhotka/rockbot-agent:0.10.40:MemoryTools: marked <old> superseded by <new>✅scoped direct save ... (feedback/from-agent/status-reports, ...)✅Status reports should be timestampedsaved withsuperseded=nowhile the existing TL;DR contradiction stayed undisturbed ✅DreamService: contradiction sweep — only 0 claim/feedback entry/entries; skipping✅Out of scope
General-purpose contradiction detection across all memory categories — explicit non-goal in the design.
Behaviour change to note
This PR turns on the contradiction sweep dream pass by default (
DreamOptions.ContradictionSweepEnabled = true). On the next dream cycle, it will scanclaim/capability/*andfeedback/*and may mark entries superseded based on the LLM's judgment. User-correction protection is enforced post-LLM, so a user correction can never be superseded by an agent-self entry.Test plan
dotnet test RockBot.slnx— 1,733 passed, 0 failed (4 RabbitMQ + 11 Python integration tests skipped as expected).build-and-testworkflows green.MemoryTools.SaveMemoryLLM-callable surface.🤖 Generated with Claude Code