Skip to content

ci: pin codegen tools + guard generated-file freshness; document preserve-on-conflict#877

Open
dougborg wants to merge 3 commits into
mainfrom
docs/typed-cache-preserve-contract
Open

ci: pin codegen tools + guard generated-file freshness; document preserve-on-conflict#877
dougborg wants to merge 3 commits into
mainfrom
docs/typed-cache-preserve-contract

Conversation

@dougborg
Copy link
Copy Markdown
Owner

Closes #874. Two related follow-ups from the #870/#873/#876 service-parent_id thread: the regen-freshness gap that thread surfaced, and the docs for the durability contract it produced.

1. Generated-file freshness guard (#874)

Generated client/pydantic code is committed source, but the codegen tools were loose floors — so the resolved version could drift and silently change output with no signal. That's exactly what bit #873: datamodel-codegen 0.58.0 re-typed InventoryMovement.created_at/updated_at to bare Any, surfaced only because that PR happened to regenerate.

  • Pin both tools exactlydatamodel-code-generator==0.58.0, openapi-python-client==0.28.4. Bumping is now deliberate: edit the pin, uv run poe regenerate-all, commit the diff.
  • New generated-files CI job regenerates and git diff --exit-codes the generated paths (api/, models/, client.py, models_pydantic/_generated/, _auto_registry.py), failing if committed output is stale. Gated on code/spec changes, mirroring the quality job.
  • A fresh regenerate-all against current main is already clean (post-feat(mcp): denormalize service_id onto CachedVariant; drop search_items O(N) scan #873), so no baseline regeneration is bundled.

2. Docs: the preserve-on-conflict contract

The durable lesson from the service_id work: a cache-only column populated by a different entity's post_sync hook (not the owning entity's own wire payload) must be in the owning spec's preserve_columns_on_conflict, or _bulk_upsert's blanket ON CONFLICT DO UPDATE SET re-nulls it on the next delta.

  • typed_cache/README.md — new section stating the general contract (attrs_postprocess vs post_sync, the service_id example, the regression test that pins it).
  • cookbook/catalog-search.md — a service_id subsection alongside the four attrs_postprocess fields, distinguishing the cross-table case.
  • CLAUDE.md — adds the pattern to the typed-cache topical keywords for grep discoverability.

Test plan

  • uv run poe agent-check clean (ruff/format/mdformat/ty)
  • uv run poe test green (3780)
  • regenerate-all produces no drift (the new CI job's local equivalent passes)
  • yamllint + mdformat pass on the changed workflow/docs
  • CI: the new generated-files job runs green on this PR

🤖 Generated with Claude Code

dougborg added 2 commits May 29, 2026 16:58
Generated client/pydantic code is committed source, but the codegen tools were
loose floors (`datamodel-code-generator>=0.27.0`, `openapi-python-client>=0.25.2`),
so the resolved version could drift and silently change generated output with no
signal. That's exactly what bit #873: datamodel-codegen 0.58.0 re-typed
InventoryMovement's created_at/updated_at to bare `Any`, and it only surfaced
because that PR happened to regenerate.

- Pin both tools to exact current versions (==0.58.0 / ==0.28.4). Bumping is now
  a deliberate, reviewed change: edit the pin, run `uv run poe regenerate-all`,
  commit the diff.
- Add a `generated-files` CI job that regenerates and `git diff --exit-code`s the
  generated paths (api/, models/, client.py, models_pydantic/_generated/,
  _auto_registry.py), failing if the committed output is stale. Gated on
  code/spec changes, mirroring the quality job.

A fresh `regenerate-all` against current main is already clean, so no baseline
regeneration is bundled here.
Capture the durable lesson from the service_id work (#870/#873/#876): a
cache-only column populated by a *different* entity's `post_sync` hook (not the
owning entity's own wire payload) must be listed in the owning spec's
`preserve_columns_on_conflict`, or `_bulk_upsert`'s blanket
`ON CONFLICT DO UPDATE SET` re-nulls it on the next delta.

- typed_cache/README.md: new "Cache-only columns written cross-table need
  preserve-on-conflict" section — the general contract, attrs_postprocess vs
  post_sync, and the service_id example.
- cookbook/catalog-search.md: a `service_id` subsection alongside the four
  attrs_postprocess fields, distinguishing the cross-table case.
- CLAUDE.md: add the pattern to the typed-cache topical keywords for grep
  discoverability.
Copilot AI review requested due to automatic review settings May 29, 2026 23:02
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR makes generated client outputs more reproducible and documents the typed-cache durability contract that came out of the service parent_id work.

Changes:

  • Pins OpenAPI/client codegen tools to exact versions.
  • Adds a CI job that regenerates generated outputs and fails on drift.
  • Documents when cross-table cache-only columns need preserve_columns_on_conflict.

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
pyproject.toml Pins codegen dependencies and explains the regeneration workflow.
uv.lock Reflects the exact codegen dependency pins.
.github/workflows/ci.yml Adds the generated-file freshness CI job.
katana_mcp_server/docs/typed_cache/README.md Documents the cross-table cache-only column durability contract.
katana_mcp_server/docs/cookbook/catalog-search.md Adds service_id guidance for catalog search denormalization.
CLAUDE.md Adds discoverability keywords for the typed-cache pattern.

Comment thread .github/workflows/ci.yml Outdated
Comment on lines +231 to +236
if ! git diff --exit-code -- \
katana_public_api_client/api \
katana_public_api_client/models \
katana_public_api_client/client.py \
katana_public_api_client/models_pydantic/_generated \
katana_public_api_client/models_pydantic/_auto_registry.py; then
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in c8b4c71. Scoped the diff to the whole katana_public_api_client/ package instead of an enumerated list, so errors.py, client_types.py, py.typed, and the templated __init__.py are all covered. The generators don't rewrite the hand-maintained files in the package (katana_client.py, helpers/, domain/, …), so there are no spurious diffs. Verified a clean regenerate-all produces an empty package-wide diff locally.

Copilot review of #877: the enumerated path list missed files
regenerate_client.py also rewrites — errors.py, client_types.py (renamed from
types.py), py.typed, and the templated __init__.py — so a stale one could slip
through. Scope the diff to katana_public_api_client/ instead; the generators
don't touch the hand-maintained files in the package, so there are no spurious
diffs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ci: guard against stale generated files (regen-freshness check)

2 participants