feat: Genkit AIProvider adapter with per-tool strict + startup cache warmup by Anthony-Bible · Pull Request #63 · Anthony-Bible/code-agent-demo

Anthony-Bible · 2026-05-06T12:44:34Z

Summary

Adds a parallel AIProvider implementation backed by Firebase Genkit Go's Anthropic plugin (selectable via AGENT_AI_PROVIDER=genkit; default remains the existing AnthropicAdapter).
Wires per-tool strict-schema validation through the domain. Following Anthropic's guidance ("Mark only critical tools as strict"), only bash, edit_file, task, and delegate opt in. Tools using JSON-schema keywords Anthropic rejects in strict mode (maxItems, minimum, etc.) leave strict off.
Primes Anthropic's compiled-grammar cache at startup. /ready returns 503 {"status":"warming up"} until the warmup completes so k8s rolling deploys don't route traffic to a cold pod.
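
The readiness gate described in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's actual handler: the `readiness` type, the `probe` helper, and the 200 response body are assumptions made for the example; only the 503 body and the idea of a required/ready flag pair come from this page.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// readiness is a hypothetical gate; the PR's real type and field names may differ.
type readiness struct {
	warmupRequired atomic.Bool // set once at boot if a warmup was started
	warmupReady    atomic.Bool // set when the warmup finishes, successfully or not
}

// ServeHTTP answers 503 while a required warmup is still pending, else 200.
func (r *readiness) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	if r.warmupRequired.Load() && !r.warmupReady.Load() {
		w.WriteHeader(http.StatusServiceUnavailable)
		fmt.Fprint(w, `{"status":"warming up"}`)
		return
	}
	w.WriteHeader(http.StatusOK)
	fmt.Fprint(w, `{"status":"ready"}`) // assumed body; not stated in the PR
}

// probe issues one in-memory GET /ready and returns the status code.
func probe(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/ready", nil))
	return rec.Code
}

func main() {
	gate := &readiness{}
	gate.warmupRequired.Store(true)
	fmt.Println(probe(gate)) // 503 while the warmup is pending

	gate.warmupReady.Store(true) // flipped even if the warmup failed (best-effort)
	fmt.Println(probe(gate))     // 200
}
```

Because both flags are independent atomics, a process that never starts a warmup reports ready immediately without any extra branching.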

Motivation

Strict-mode tool schemas trigger Anthropic's grammar compilation on first use (observed ~30-100s in our tests, then ~2-3s once the 24h regional cache is warm). Without amortization, every pod restart's first real webhook pays full cold-cache cost.

Empirical proof the cache works once tool order is deterministic:

Run	Time	State
1	73s	cold compile
2	99s	cold compile (different cache region)
3	2.6s	cache hit
4	3.1s	cache hit

Implementation notes

Genkit fork: Uses github.com/Anthony-Bible/genkit via replace directive — the fork adds WithStrict(bool) ToolOption and round-trips the bit through action metadata so LookupTool preserves it. Two upstream bugs fixed in the fork during this work: (1) strict:false was serialized explicitly when omitted; (2) LookupTool dropped the strict field.
Deterministic tool order: ListTools now sorts by name so the tool-array fingerprint stays stable across requests — required for Anthropic's cache key.
Warmup is best-effort: Failures (network, auth, etc.) still flip the readiness gate. A cold cache is a latency penalty, not a correctness problem.
Boot-time cancellation: Warmup uses context.Background() (not the cancellable cmd ctx) so a Ctrl+C during boot lets the in-flight request finish under its own 3-minute timeout rather than tearing it midway.
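
The deterministic-order note above can be illustrated with a small sketch. It assumes, for the sake of the example, that tools live in a Go map (whose iteration order is randomized); the `tool` type, `sortedTools`, and `fingerprint` are hypothetical names, not the PR's code.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// tool is a hypothetical stand-in for the domain's tool entity.
type tool struct{ Name string }

// sortedTools returns a name-sorted copy of the registry contents.
// Go randomizes map iteration order, so sorting is what makes the
// resulting slice identical from one request to the next.
func sortedTools(registry map[string]tool) []tool {
	out := make([]tool, 0, len(registry))
	for _, t := range registry {
		out = append(out, t)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

// fingerprint joins tool names in order; a stand-in for whatever the
// provider hashes when it derives a cache key from the tool array.
func fingerprint(tools []tool) string {
	names := make([]string, len(tools))
	for i, t := range tools {
		names[i] = t.Name
	}
	return strings.Join(names, ",")
}

func main() {
	reg := map[string]tool{
		"task":      {Name: "task"},
		"bash":      {Name: "bash"},
		"edit_file": {Name: "edit_file"},
		"delegate":  {Name: "delegate"},
	}
	fmt.Println(fingerprint(sortedTools(reg))) // bash,delegate,edit_file,task
}
```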

Test plan

`go build ./...` clean
`go test ./...` passes
E2E: `go test -tags=e2e ./internal/infrastructure/adapter/ai/ -run TestGenkitAdapter` passes (6 sub-tests including `AllToolsAtOnce`)
Manual: `AGENT_AI_PROVIDER=genkit ./code-agent-demo serve` — `/ready` returns 503 during warmup, then 200
Manual: kill warmup mid-flight (revoke API key) — `/ready` still flips to 200

🤖 Generated with Claude Code

claude · 2026-05-06T12:44:59Z

Claude finished @Anthony-Bible's task in 3m 41s

PR Review: feat: Genkit AIProvider adapter with per-tool strict + startup cache warmup

Gather context and understand the PR
Read changed files and analyze code
Review domain/port changes
Review Genkit adapter implementation
Review warmup implementation
Review webhook/readiness changes
Review config/container changes
Review tool executor changes
Post comprehensive review feedback

Summary

This is a well-motivated PR with solid architecture — the hexagonal design is respected, the readiness gating is clean, and the empirical cache-timing data is a nice touch. Several issues need attention before merging, ranging from a correctness bug to supply-chain hygiene.

Issues Found

Bug — Tool name collision in `registerTools` (genkit_adapter.go:345–400)

Two distinct original tool names that sanitize to the same 64-char string (e.g., two long names sharing a common prefix) will silently register with the first tool's schema in the genkit registry while the toolNameMap points to the second tool's original name. The AI will call the second tool by name, but the API will validate inputs against the first tool's schema. This is a silent correctness bug.

There's also a TOCTOU window on first registration: two concurrent goroutines can both read alreadyRegistered = false before either writes, then both call defineToolSafe. The second one gets a panic-recovered error returned to the caller. See inline comment at line 374.
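
One way to surface the collision instead of overwriting silently is sketched below. Everything here is illustrative: `sanitize` is a guess at what sanitizeToolName might do (the exact character rules are not shown on this page, only the 64-char limit), and `nameRegistry` with its `claim` method is a hypothetical helper. Locking is omitted to keep the example short.

```go
package main

import (
	"fmt"
	"strings"
)

const maxToolNameLen = 64

// sanitize is an assumed stand-in for sanitizeToolName: characters outside
// [A-Za-z0-9_-] become '_' and the result is truncated to 64 bytes.
func sanitize(name string) string {
	mapped := strings.Map(func(r rune) rune {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z',
			r >= '0' && r <= '9', r == '_', r == '-':
			return r
		}
		return '_'
	}, name)
	if len(mapped) > maxToolNameLen {
		mapped = mapped[:maxToolNameLen] // safe: mapped is pure ASCII here
	}
	return mapped
}

// nameRegistry maps sanitized names back to the original that claimed them.
type nameRegistry struct {
	bySanitized map[string]string
}

// claim records original under its sanitized name, or returns an error if a
// different original already owns that sanitized name.
func (r *nameRegistry) claim(original string) (string, error) {
	s := sanitize(original)
	if prev, ok := r.bySanitized[s]; ok && prev != original {
		return "", fmt.Errorf("tool %q collides with %q after sanitization to %q", original, prev, s)
	}
	r.bySanitized[s] = original
	return s, nil
}

func main() {
	reg := &nameRegistry{bySanitized: map[string]string{}}
	prefix := strings.Repeat("x", 64)
	_, err1 := reg.claim(prefix + "_a")
	_, err2 := reg.claim(prefix + "_b")
	fmt.Println(err1 == nil, err2 != nil) // true true
}
```

Re-claiming the same original name stays idempotent, so repeated registration of an unchanged tool set does not produce errors.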

Bug — Spurious content fallback during agentic tool-call turns (genkit_adapter.go:583)

When the model responds with only tool requests (the normal agentic loop), content is empty and the code injects "[No content received from AI]" into the conversation history. This string will be sent back to the model on the next turn, potentially confusing it. AnthropicAdapter presumably doesn't do this. See inline comment at line 583.
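
A possible shape for the fix is to apply the fallback only when the turn carries neither text nor tool requests. The sketch below uses a simplified, hypothetical `response` type and `displayContent` helper; the adapter's real types are not shown on this page.

```go
package main

import "fmt"

// response is a hypothetical, simplified view of one model turn.
type response struct {
	Text      string
	ToolCalls []string // names of requested tools, if any
}

const noContentFallback = "[No content received from AI]"

// displayContent returns the text to store in history for this turn.
// The placeholder is used only when the turn is truly empty; a turn that
// contains tool requests but no text keeps an empty text body.
func displayContent(r response) string {
	if r.Text == "" && len(r.ToolCalls) == 0 {
		return noContentFallback
	}
	return r.Text
}

func main() {
	fmt.Printf("%q\n", displayContent(response{ToolCalls: []string{"bash"}}))
	fmt.Printf("%q\n", displayContent(response{}))
}
```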

Design — `toolResultOutput` wraps results in a JSON map (genkit_adapter.go:519)

Tool results are sent to the model as {"result": "..."} / {"error": "..."} rather than raw strings. If the Genkit Anthropic plugin doesn't unwrap this before forwarding to the API, or if the model has learned to expect raw strings (as AnthropicAdapter provides), this could cause silent misinterpretation of tool results. See inline comment at line 519.

Design — Warmup runs unconditionally for all providers (serve.go:225)

runWarmup fires regardless of AGENT_AI_PROVIDER. When using AnthropicAdapter (the default) against a provider endpoint that doesn't support the full tool schema on first call, or in development/testing, this adds a 3-minute potential delay before /ready flips. Consider gating on a config flag (AGENT_WARMUP_ENABLED) or only enabling when AGENT_AI_PROVIDER=genkit. See inline comment.
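
The suggested gating could look something like the sketch below. The `warmupEnabled` helper is hypothetical, and the provider string "anthropic" is an assumed value for the default adapter; in practice the two arguments would come from AGENT_AI_PROVIDER and the proposed AGENT_WARMUP_ENABLED.

```go
package main

import (
	"fmt"
	"strconv"
)

// warmupEnabled decides whether to run the startup warmup.
// A parseable explicit flag wins; otherwise warmup runs only for genkit.
func warmupEnabled(provider, flag string) bool {
	if enabled, err := strconv.ParseBool(flag); err == nil {
		return enabled
	}
	return provider == "genkit"
}

func main() {
	fmt.Println(warmupEnabled("anthropic", ""))     // false: default provider, no override
	fmt.Println(warmupEnabled("genkit", ""))        // true: strict schemas benefit from warmup
	fmt.Println(warmupEnabled("anthropic", "true")) // true: explicit opt-in
}
```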

Tech Debt — `getSystemPrompt` / prompt-builder duplication (genkit_adapter.go:640)

getSystemPrompt, buildPlanModePrompt, and buildBasePromptWithAgents are copy-pasted from AnthropicAdapter. The TODO acknowledges this, but the duplication is load-bearing: a prompt bug fixed in one adapter won't be fixed in the other, and behavior will silently diverge. The fix is a shared helper in the ai package, which is low-risk and high-return. See inline comment.

Tech Debt — `ToolParamsFromExecutor` duplicates conversion logic (warmup.go:86)

Warmup correctness depends on the tool schema fingerprint being identical to what ConversationService.prepareAIRequest sends. Since the conversion is copy-pasted, any future field added to the conversion in ConversationService will silently break the warmup's cache-hit guarantee. A shared helper or exposing conversion through the service is safer. See inline comment.

Supply Chain — `replace` directive on a personal fork (go.mod)

The replace github.com/firebase/genkit/go => github.com/Anthony-Bible/genkit/go pin means upstream security patches don't flow in automatically, and any upstream breaking change requires manual fork maintenance. Before merging, open or link upstream PRs for the WithStrict API and the two bug fixes, and set a concrete milestone for removing the directive. Consider go mod vendor in the interim so the forked code is audited and pinned in-tree. See inline comment.

Smaller Observations

delegate tool's allowed_tools array + Strict: true: Anthropic strict mode has been observed to reject schemas with certain array+items constructs. Worth explicit end-to-end testing — the test plan mentions AllToolsAtOnce but doesn't call out the delegate tool specifically.
sanitizeToolName length truncation: After truncating to 64 chars, two distinct long names with identical 64-char prefixes will silently collide. A truncated collision should at minimum log a warning.
ListTools() uses sort.Slice (unstable): Since tool names are unique, this is deterministic — no issue in practice. Could be sort.Slice or sort.SliceStable, outcome is the same.
genkit-compatibility.md committed to root: Consider moving to docs/ or keeping only in the PR description; committed design docs in the root tend to drift.
handleWebhookAsync already used context.Background() (pre-existing, not introduced here).

What Looks Good

The atomic.Bool pair (warmupRequired / warmupReady) for the readiness gate is clean and race-free.
defineToolSafe converting panics to errors is the right defensive move for a process-global registry.
initGenkit with panic recovery keeps the constructor signature honest.
Deterministic ListTools sort is exactly the right fix for the cache-key stability requirement.
Test coverage for message conversion, name sanitization, clone isolation, and streaming is solid.
The boot-time context.Background() rationale (let in-flight warmup finish under its own timeout rather than tearing it mid-request) is well-reasoned and documented.

claude · 2026-05-06T12:45:54Z

Claude finished @Anthony-Bible's task in 6m 8s

PR Review: feat/genkit-adapter-with-warmup

Gather context and understand the PR
Read domain/port changes (tool.go, ai_provider.go, conversation_service.go)
Review Genkit adapter (genkit_adapter.go)
Review warmup implementation (warmup.go)
Review webhook/readiness changes (http_adapter.go)
Review config/container changes (config.go, container.go)
Review tool executor changes
Review tests (genkit_adapter_test.go, genkit_adapter_e2e_test.go)
Post comprehensive review feedback with inline comments

Summary

Well-motivated PR with solid overall design. The architecture choices (Genkit as a side-by-side adapter, /ready gating, atomic warmup flag) are all sound. The domain model change (Strict bool on entity.Tool → port.ToolParam) is clean and the field flows correctly through ConversationService → ListTools → wire.

There is one critical bug (data race in Clone), one significant correctness issue (silent name-collision overwrite), and several smaller concerns. Inline comments have been posted for all of them.

Critical

Data race: Clone() shares the map but not the mutex (genkit_adapter.go:262–274, genkit_adapter.go:363–365)

nameMapMu is a value-type field. Clone() copies the struct but produces a fresh zero-value mutex for the clone while sharing the underlying toolNameMap. Every SendMessage call (even for already-registered tools) writes unconditionally to that shared map. Two concurrent subagent clones will race without mutual exclusion. Go's race detector will flag this on the first concurrent subagent spawn. Fix: make nameMapMu a *sync.RWMutex shared across clones, or replace the pair with a sync.Map.
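
A variation on the first suggested fix is to bundle the map and its lock into one pointer-shared struct, so a clone cannot share one without the other. The sketch below is illustrative only: `nameTable` and `adapter` are hypothetical stand-ins, not the PR's types.

```go
package main

import (
	"fmt"
	"sync"
)

// nameTable bundles the map with the lock that guards it, so anything that
// shares the map necessarily shares the lock as well.
type nameTable struct {
	mu sync.RWMutex
	m  map[string]string // sanitized name -> original name
}

func (t *nameTable) set(sanitized, original string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.m[sanitized] = original
}

func (t *nameTable) get(sanitized string) (string, bool) {
	t.mu.RLock()
	defer t.mu.RUnlock()
	v, ok := t.m[sanitized]
	return v, ok
}

// adapter is a hypothetical stand-in for the Genkit adapter.
type adapter struct {
	names *nameTable
}

// Clone copies the pointer, so parent and clone use the same map and mutex.
func (a *adapter) Clone() *adapter { return &adapter{names: a.names} }

func main() {
	base := &adapter{names: &nameTable{m: map[string]string{}}}

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			c := base.Clone() // each simulated subagent works on its own clone
			c.names.set(fmt.Sprintf("tool_%d", id), fmt.Sprintf("tool %d", id))
		}(i)
	}
	wg.Wait()

	_, ok := base.names.get("tool_3")
	fmt.Println(ok) // true: writes made through clones are visible to the parent
}
```

Running an example like this under `go run -race` is a quick way to check that concurrent clones no longer write to the map unguarded.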

+              				refs = append(refs, existing)
+              				continue
+              			}
+              		}

+              	}
+              	content := contentBuilder.String()
+              	if content == "" && len(thinkingBlocks) > 0 {

+              	}
+              }
+              // toolResultOutput packages a tool-call string result into a JSON-friendly

+              //
+              // TODO(refactor): extract to a shared helper consumed by both adapters.
+              // Duplicated for now to avoid touching anthropic_adapter.go in this PR.
+              func (a *GenkitAdapter) getSystemPrompt(ctx context.Context, opts port.AIRequestOptions) (string, error) {

+              	// in-flight warmup HTTP request finish cleanly under its own 3-minute
+              	// timeout rather than tearing it midway.
+              	webhookAdapter.SetWarmupRequired(true)
+              	go runWarmup(context.Background(), container, webhookAdapter, container.Logger())

+              // ToolExecutor and converts each entity.Tool to the port.ToolParam shape
+              // used on the wire. Mirrors the conversion in ConversationService so the
+              // schema fingerprint is identical to what real requests send.
+              func ToolParamsFromExecutor(exec port.ToolExecutor) ([]port.ToolParam, error) {

-              toolchain go1.24.11
+              go 1.25.0
               require (

+              		// registerTools, which is idempotent and synchronized. Sharing keeps
+              		// sanitized→original lookups consistent across clones.
+              		toolNameMap: a.toolNameMap,
+              	}

+              		a.nameMapMu.Lock()
+              		_, alreadyRegistered := a.toolNameMap[sanitized]
+              		a.toolNameMap[sanitized] = t.Name
+              		a.nameMapMu.Unlock()

+              	if a.subagentManager == nil {
+              		return basePrompt, nil
+              	}
+              	agents, err := a.subagentManager.ListAgents(ctx)

+              // Returns nil on success, or a wrapping error if the provider call fails.
+              // The caller is expected to log+continue rather than fail startup: a cold
+              // cache is a latency penalty, not a correctness problem.
+              func WarmCache(

+              	golang.org/x/text v0.35.0 // indirect
               )
+              replace github.com/firebase/genkit/go => github.com/Anthony-Bible/genkit/go v1.7.1-0.20260506115940-43381ac9eb96

claude Bot commented May 6, 2026

PR Review: feat/genkit-adapter-with-warmup

What Previous Reviews Got Fixed ✅

claude Bot commented May 6, 2026

Bug — `Clone()` shares the map without sharing the mutex (`genkit_adapter.go:278–291`)

Bug — `[No content received from AI]` injected on normal tool-use turns (`genkit_adapter.go:607–609`)

Bug — `ListAgents` error fails the entire AI call (`system_prompt.go:87–89`)

Design — Warmup fires unconditionally for all providers (`serve.go:224–225`)

claude Bot commented May 6, 2026

`ListAgents` error fails the entire AI call (`system_prompt.go:86–89`)

claude Bot commented May 6, 2026

No tests for `warmup.go` (flagged all 5 rounds)

Silent tool name collision after truncation (`genkit_adapter.go:389`)

`replace` directive without upstream PR link (`go.mod:61`)

claude Bot commented May 6, 2026

1. No tests for `warmup.go` (flagged 6 rounds in a row)

2. Silent tool name collision after sanitization (`genkit_adapter.go:389–393`)

3. `replace` directive has no upstream PR link (`go.mod:61`)

claude Bot commented May 12, 2026

New Minor Issue (introduced by `b4b935d`)

1. No tests for `warmup.go` (flagged every round — 7 now)

2. Silent tool name collision after sanitization (`genkit_adapter.go:389–393`)

3. `replace` directive has no upstream PR link (`go.mod:61`)