feat(memory): reconcile stale exact-lock updates by huangruiteng · Pull Request #2335 · volcengine/OpenViking

huangruiteng · 2026-05-30T22:45:07Z

Summary

This PR adds an opt-in exact file-lock apply path for concurrent OpenViking Memory V2 writes. The validated PR-1 scope is agent memory extraction for trajectories and experiences.

The goal is to let memory extraction run concurrently without relying on broad schema tree locks, while keeping write correctness at apply time:

lock the exact target file before apply
read the latest file under that lock
merge against the latest content using the read-time base value
rewrite stale string SEARCH/REPLACE patches when configured
synthesize stale string replacements when configured
fail fast, or record an explicit OCC skip, instead of silently producing a partial corpus

The new behavior is disabled by default behind:

memory.memory_apply_exact_file_lock_enabled
memory.stale_patch_rewrite_enabled

What Changed

Runtime memory apply:

Adds runtime base envelopes: StrPatchWithBase and ReplaceValueWithBase.
Applies memory upserts under exact file locks, then re-reads the latest target file before merge.
Rewrites stale StrPatch SEARCH/REPLACE patches against the latest field value.
Synthesizes stale string replace proposals against the latest field value.
Treats legacy plain-string patch in the agent exact path as a base-aware full replacement, not as a substring patch.
Records per-field apply_trace rows with URI, field, merge op, input shape, wrapper shape, stale/rewrite status, and applied/failed/skipped status.
Handles stale deletes and create-vs-existing races with explicit skip states instead of resurrecting or overwriting files.

Scope gates and validation plumbing:

Keeps exact apply scoped to agent trajectories / experiences.
Keeps standard long-term user memory, tools, skills, and session_skills on the existing tree-lock path.
Adds read-only memory graph health inspection for link/backlink consistency, source-link coverage, schema-heading checks, and lightweight source-density / near-duplicate diagnostics.
Extends TAU corpus/eval runner gates with commit concurrency, cache identity checks, concrete scoped search/read checks, graph health, retrieval trace coverage, completed scoreboard checks, and corpus provenance.

Scope

Included:

Same-URI patch/update/delete correctness under exact file locks.
Stale StrPatch rewrite for string SEARCH/REPLACE patches.
Stale string replacement synthesis.
Agent-only validation config for trajectories / experiences.
Deterministic transcript-local tool/skill SUM counter deltas as supporting cleanup.
Per-phase memory_diff_<phase>.json archives with apply traces.
Graph/schema/retrieval/eval gates needed to validate the write primitive.

Not included:

Exact apply for standard long-term user memory.
Exact apply for tool/skill memory or session_skills.
Non-string replacement synthesis.
Cross-URI create-new admission or semantic duplicate consolidation.
Session skill asset lifecycle validation.
Failure-experience retrieval or prompt changes.
A TAU reward uplift claim.

Validation

Current pushed head: 20ca50ed.

Local targeted validation:

OPENVIKING_CONFIG_FILE=/private/tmp/pr2335_pr1_pytest_ov.conf \
  .venv/bin/python -m pytest -q -o addopts='' -m 'not integration' \
  tests/session/memory/test_memory_updater.py \
  tests/session/memory/test_merge_ops.py \
  tests/session/memory/test_compressor_v2.py \
  tests/unit/session/test_compressor_v2_schema_gate.py \
  tests/unit/tool_skill/test_tool_skill_utils.py \
  tests/benchmark/tau2/test_run_memory_v2_eval_gates.py \
  tests/benchmark/tau2/test_analyze_memory_v2_corpus_quality.py \
  tests/client/test_rebuild_clients.py

Result:

166 passed, 1 deselected, 6 warnings
git diff --check: passed

Client regeneration check:

.venv/bin/python -m pytest -q -o addopts='' tests/client/test_rebuild_clients.py

Result:

16 passed, 5 warnings

GitHub checks on 20ca50ed:

check-deps: pass
lint / lint: pass
API & CLI Integration Tests (ubuntu-24.04): pass
pr_review: pass
build: skipped

Corpus Build Speed

The clean review-facing denominator is exact C4 vs safe serial tree C1. A same-concurrency tree C4 run was attempted but failed the strict write gate before producing valid manifests, so this PR does not claim exact C4 vs tree C4 speedup.

Domain	Exact C4 wall	Safe tree C1 wall	Exact / tree
retail	3595.885s	13096.090s	0.275x
airline	584.458s	1545.286s	0.378x
total	4189.586s	14642.055s	0.286x

Both valid runs used the same cached train results and produced valid retail + airline corpus manifests. The comparison is conservative: it measures the PR-1 concurrent exact path against the safe serial tree-lock baseline, not a lock-manager microbenchmark.

Full-Train Corpus Health

All four full-train corpora passed strict_corpus_gate with claim_valid=true.

Hard gates were clean in all four corpora:

source-linkless experiences: 0
broken endpoints: 0
missing backlinks: 0
missing forward links: 0
parse errors: 0
missing required headings: 0
scoped search/read probes: non-empty
session_skills_extracted_total: 0

Corpus shape:

Corpus	Sessions committed / skipped	Memories extracted	Experiences	Trajectories	Source avg	Single-source
exact retail C4	63 / 11	691	78	180	5.90	29.49%
tree retail C1	63 / 11	670	62	174	7.11	33.87%
exact airline C4	26 / 4	136	25	45	3.44	36.00%
tree airline C1	26 / 4	133	32	44	2.75	50.00%

Quality boundary:

Exact retail creates more experience cards than tree and has more name-similar pairs.
Tree tends to fold later sessions into broader cards; exact tends to preserve narrower sibling cards.
This is a semantic consolidation boundary for a follow-up admission/consolidation PR, not a graph/schema correctness failure in this PR.

Full Held-Out Eval Gate

Protocol:

retail + airline
full held-out test split
4 repeats
8/8 cells completed
240 simulations
exact and tree summaries had valid corpus/effect evidence
retrieval traces were non-zero and all injected reads were non-empty

Mode	Retail reward / DB	Airline reward / DB	Task-weighted reward	Task-weighted DB	Retrieval aggregate
exact	0.81875 / 0.81875	0.78750 / 0.81250	0.80833	0.81667	513 events / 1026 injected / 1026 non-empty reads
tree	0.84375 / 0.84375	0.75000 / 0.76250	0.81250	0.81667	482 events / 964 injected / 964 non-empty reads

Same-protocol exact vs tree:

task-weighted reward delta: -0.0042
task-weighted DB delta: 0.0

This supports no catastrophic corpus, retrieval, or scoreboard regression for the exact path. It does not claim reward uplift or semantic relevance for every retrieved memory.

Apply Trace / Stale Handling

Full-train archive traces:

Mode	Apply trace rows	Applied	Skipped stale-deleted	Stale detected / rewrite attempted
exact C4	2062	2056	6	87 / 87
tree C1	1992	1992	0	0 / 0

The exact path records stale and delete conflicts explicitly. These rows are not hidden as successful writes.

Claim Boundary

This PR is a common server-side write-safety primitive for agent memory. It validates exact file-lock apply, stale rewrite/synthesis, graph/schema health, retrieval plumbing, and non-catastrophic full-eval regression for agent trajectories / experiences.

It does not claim:

exact-file-lock coverage for user memory, tools, skills, or session_skills
cross-URI create-new duplicate resolution
semantic applicability parity
TAU reward uplift
same-concurrency tree C4 speedup
production default enablement

github-actions · 2026-05-30T22:46:49Z

PR Reviewer Guide 🔍

(Review updated until commit `20ca50e`)

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🏅 Score: 85
🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes Sub-PR theme: feat: exact file-lock write safety for memory updates Relevant files: tests/session/memory/test_memory_updater.py openviking/session/memory/memory_updater.py openviking/session/memory/merge_op/patch.py openviking/session/memory/merge_op/replace.py openviking/session/memory/merge_op/base.py Sub-PR theme: feat: memory graph health inspection endpoint Relevant files: openviking/session/memory/graph_health.py openviking/server/routers/stats.py tests/server/test_api_stats_memory_graph.py Sub-PR theme: feat: schema quality gates for corpus Relevant files: openviking/session/memory/schema_quality.py tests/unit/session/test_compressor_v2_schema_gate.py
⚡ No major issues detected

github-actions · 2026-05-30T22:47:57Z

PR Code Suggestions ✨

No code suggestions found for the PR.

…-patch-rewrite-pr1

huangruiteng · 2026-06-02T07:16:53Z

Landing update for PR-1 (2026-06-02): refreshed the PR body for head 217fba14, explicitly kept PR-2 admission/consolidation out of scope, and added the local targeted non-integration result (150 passed, 1 deselected). The earlier Qodo duplicate _text_digest note is addressed by 42da5956; the StructuredLLM resilience/cost topic remains bounded as follow-up telemetry in the PR body. A same-concurrency tree C4 wall-time rerun is still running separately, so this PR does not claim a finalized full-train wall-time ratio.

github-actions · 2026-06-02T07:17:57Z

Persistent review updated to latest commit 217fba1

github-actions · 2026-06-02T07:19:47Z

PR Code Suggestions ✨

No code suggestions found for the PR.

huangruiteng · 2026-06-02T07:21:46Z

Draft-state correction: I moved #2335 back to draft because review should include a rigorous speed + quality analysis, not just code/CI readiness. Current quality gates and full r4 non-regression evidence are already summarized in the PR body; the missing piece is the same-concurrency tree C4 wall-time baseline. A fresh tree C4 run is still in progress, so the PR should stay draft until the final speed table / claim boundary is updated.

…-patch-rewrite-pr1 # Conflicts: # tests/client/test_rebuild_clients.py

github-actions · 2026-06-02T12:22:54Z

Persistent review updated to latest commit 20ca50e

github-actions · 2026-06-02T12:25:36Z

PR Code Suggestions ✨

No code suggestions found for the PR.

huangruiteng · 2026-06-02T13:07:50Z

Follow-up to my earlier draft-state note: the final review-facing speed denominator and quality/health/eval evidence are now in the PR body, head 20ca50ed checks are green, and #2335 is marked ready for review. The scope remains PR-1 only; PR-2 admission/consolidation stays out of this upstream PR.

feat(memory): add exact-lock stale patch rewrite

bfde107

github-project-automation Bot added this to OpenViking project May 30, 2026

github-project-automation Bot moved this to Backlog in OpenViking project May 30, 2026

huangruiteng added 7 commits May 31, 2026 07:07

fix(memory): fail fast on exact-lock read and link errors

89518bb

style(memory): format exact-lock changes

7866b7f

feat(memory): add graph health validation hooks

6c61bcf

fix(memory): clean graph links on file delete

8153b78

feat(memory): gate exact apply by merge semantics

533cefc

style(memory): format exact-lock test

fa4f5e4

test(server): isolate embedded service config

8b3c04c

huangruiteng changed the title ~~feat(memory): add exact-lock stale patch rewrite~~ feat(memory): reconcile stale exact-lock updates May 31, 2026

huangruiteng added 16 commits June 1, 2026 00:10

feat(memory): synthesize stale string replacements

aac25bd

fix(memory): require read base for exact replacement

4c82f4a

fix(memory): harden exact agent apply boundary

7807e18

test(tau2): record corpus provenance in memory eval

7ce2c1f

test(memory): cover session skill extraction switch

26ca37b

feat(memory): write agent phase memory diffs

d37c0df

Merge remote-tracking branch 'origin/main' into feat/memory-file-lock…

7c57fee

…-patch-rewrite-pr1

style(session): sort imports after main merge

fb5e856

test(memory): cover stale rewrite apply traces

c66b5eb

test(memory): keep session skill fake phase signature current

eb35705

fix(memory): preserve subsecond trajectory timestamp keys

f1936b7

fix(memory): dedupe duplicate deletes in apply batch

fe114b2

fix(memory): skip stale exact upsert races

afea3b5

fix(memory): treat missing exact deletes as stale

9ef7e46

style(memory): format exact stale delete telemetry

9c22aeb

refactor(memory): share exact-lock text digest helper

42da595

huangruiteng mentioned this pull request Jun 2, 2026

feat(memory): add agent experience admission telemetry huangruiteng/OpenViking#1

Draft

fix(memory): record stale rewrite telemetry

217fba1

huangruiteng marked this pull request as ready for review June 2, 2026 07:16

huangruiteng marked this pull request as draft June 2, 2026 07:19

Merge remote-tracking branch 'origin/main' into feat/memory-file-lock…

20ca50e

…-patch-rewrite-pr1 # Conflicts: # tests/client/test_rebuild_clients.py

huangruiteng marked this pull request as ready for review June 2, 2026 12:21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(memory): reconcile stale exact-lock updates#2335

feat(memory): reconcile stale exact-lock updates#2335
huangruiteng wants to merge 26 commits into
volcengine:mainfrom
huangruiteng:feat/memory-file-lock-patch-rewrite-pr1

huangruiteng commented May 30, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 30, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 30, 2026

Uh oh!

huangruiteng commented Jun 2, 2026

Uh oh!

github-actions Bot commented Jun 2, 2026

Uh oh!

github-actions Bot commented Jun 2, 2026

Uh oh!

huangruiteng commented Jun 2, 2026

Uh oh!

github-actions Bot commented Jun 2, 2026

Uh oh!

github-actions Bot commented Jun 2, 2026

Uh oh!

huangruiteng commented Jun 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

huangruiteng commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What Changed

Scope

Validation

Corpus Build Speed

Full-Train Corpus Health

Full Held-Out Eval Gate

Apply Trace / Stale Handling

Claim Boundary

Uh oh!

github-actions Bot commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Reviewer Guide 🔍

(Review updated until commit 20ca50e)

Uh oh!

github-actions Bot commented May 30, 2026

PR Code Suggestions ✨

Uh oh!

huangruiteng commented Jun 2, 2026

Uh oh!

github-actions Bot commented Jun 2, 2026

Uh oh!

github-actions Bot commented Jun 2, 2026

PR Code Suggestions ✨

Uh oh!

huangruiteng commented Jun 2, 2026

Uh oh!

github-actions Bot commented Jun 2, 2026

Uh oh!

github-actions Bot commented Jun 2, 2026

PR Code Suggestions ✨

Uh oh!

huangruiteng commented Jun 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

huangruiteng commented May 30, 2026 •

edited

Loading

github-actions Bot commented May 30, 2026 •

edited

Loading

(Review updated until commit `20ca50e`)