Skip to content

feat: Genkit AIProvider adapter with per-tool strict + startup cache warmup#63

Open
Anthony-Bible wants to merge 2 commits into
mainfrom
feat/genkit-adapter-with-warmup
Open

feat: Genkit AIProvider adapter with per-tool strict + startup cache warmup#63
Anthony-Bible wants to merge 2 commits into
mainfrom
feat/genkit-adapter-with-warmup

Conversation

@Anthony-Bible
Copy link
Copy Markdown
Owner

Summary

  • Adds a parallel AIProvider implementation backed by Firebase Genkit Go's Anthropic plugin (selectable via AGENT_AI_PROVIDER=genkit; default remains the existing AnthropicAdapter).
  • Wires per-tool strict-schema validation through the domain. Following Anthropic's guidance ("Mark only critical tools as strict"), only bash, edit_file, task, and delegate opt in. Tools using JSON-schema keywords Anthropic rejects in strict mode (maxItems, minimum, etc.) leave strict off.
  • Primes Anthropic's compiled-grammar cache at startup. /ready returns 503 {\"status\":\"warming up\"} until the warmup completes so k8s rolling deploys don't route traffic to a cold pod.

Motivation

Strict-mode tool schemas trigger Anthropic's grammar compilation on first use (observed ~30-100s in our tests, then ~2-3s once the 24h regional cache is warm). Without amortization, every pod restart's first real webhook pays full cold-cache cost.

Empirical proof the cache works once tool order is deterministic:

Run Time State
1 73s cold compile
2 99s cold compile (different cache region)
3 2.6s cache hit
4 3.1s cache hit

Implementation notes

  • Genkit fork: Uses github.com/Anthony-Bible/genkit via replace directive — the fork adds WithStrict(bool) ToolOption and round-trips the bit through action metadata so LookupTool preserves it. Two upstream bugs fixed in the fork during this work: (1) strict:false was serialized explicitly when omitted; (2) LookupTool dropped the strict field.
  • Deterministic tool order: ListTools now sorts by name so the tool-array fingerprint stays stable across requests — required for Anthropic's cache key.
  • Warmup is best-effort: Failures (network, auth, etc.) still flip the readiness gate. A cold cache is a latency penalty, not a correctness problem.
  • Boot-time cancellation: Warmup uses context.Background() (not the cancellable cmd ctx) so a Ctrl+C during boot lets the in-flight request finish under its own 3-minute timeout rather than tearing it midway.

Test plan

  • `go build ./...` clean
  • `go test ./...` passes
  • E2E: `go test -tags=e2e ./internal/infrastructure/adapter/ai/ -run TestGenkitAdapter` passes (6 sub-tests including `AllToolsAtOnce`)
  • Manual: `AGENT_AI_PROVIDER=genkit ./code-agent-demo serve` — `/ready` returns 503 during warmup, then 200
  • Manual: kill warmup mid-flight (revoke API key) — `/ready` still flips to 200

🤖 Generated with Claude Code

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 6, 2026

Claude finished @Anthony-Bible's task in 3m 41s —— View job


PR Review: feat: Genkit AIProvider adapter with per-tool strict + startup cache warmup

  • Gather context and understand the PR
  • Read changed files and analyze code
  • Review domain/port changes
  • Review Genkit adapter implementation
  • Review warmup implementation
  • Review webhook/readiness changes
  • Review config/container changes
  • Review tool executor changes
  • Post comprehensive review feedback

Summary

This is a well-motivated PR with solid architecture — the hexagonal design is respected, the readiness gating is clean, and the empirical cache-timing data is a nice touch. Several issues need attention before merging, ranging from a correctness bug to supply-chain hygiene.


Issues Found

Bug — Tool name collision in registerTools (genkit_adapter.go:345–400)

Two distinct tool original-names that sanitize to the same 64-char string (e.g., two long names sharing a common prefix) will silently register with the first tool's schema in the genkit registry while the toolNameMap points to the second tool's original name. The AI will call the second tool by name but the API will validate inputs against the first tool's schema. This is a silent correctness bug.

There's also a TOCTOU window on first registration: two concurrent goroutines can both read alreadyRegistered = false before either writes, then both call defineToolSafe. The second one gets a panic-recovered error returned to the caller. See inline comment at line 374.

Bug — Spurious content fallback during agentic tool-call turns (genkit_adapter.go:583)

When the model responds with only tool requests (the normal agentic loop), content is empty and the code injects "[No content received from AI]" into the conversation history. This string will be sent back to the model on the next turn, potentially confusing it. AnthropicAdapter presumably doesn't do this. See inline comment at line 583.

Design — toolResultOutput wraps results in a JSON map (genkit_adapter.go:519)

Tool results are sent to the model as {"result": "..."} / {"error": "..."} rather than raw strings. If the Genkit Anthropic plugin doesn't unwrap this before forwarding to the API, or if the model has learned to expect raw strings (as AnthropicAdapter provides), this could cause silent misinterpretation of tool results. See inline comment at line 519.

Design — Warmup runs unconditionally for all providers (serve.go:225)

runWarmup fires regardless of AGENT_AI_PROVIDER. When using AnthropicAdapter (the default) against a provider endpoint that doesn't support the full tool schema on first call, or in development/testing, this adds a 3-minute potential delay before /ready flips. Consider gating on a config flag (AGENT_WARMUP_ENABLED) or only enabling when AGENT_AI_PROVIDER=genkit. See inline comment.

Tech Debt — getSystemPrompt / prompt-builder duplication (genkit_adapter.go:640)

getSystemPrompt, buildPlanModePrompt, and buildBasePromptWithAgents are copy-pasted from AnthropicAdapter. The TODO acknowledges this, but the duplication is load-bearing: a prompt bug fixed in one adapter won't be fixed in the other, and behavior will silently diverge. The fix is a shared helper in the ai package — low-risk, high-return. See inline comment. Fix this →

Tech Debt — ToolParamsFromExecutor duplicates conversion logic (warmup.go:86)

Warmup correctness depends on the tool schema fingerprint being identical to what ConversationService.prepareAIRequest sends. Since the conversion is copy-pasted, any future field added to the conversion in ConversationService will silently break the warmup's cache-hit guarantee. A shared helper or exposing conversion through the service is safer. See inline comment.

Supply Chain — replace directive on a personal fork (go.mod)

The replace github.com/firebase/genkit/go => github.com/Anthony-Bible/genkit/go pin means upstream security patches don't flow in automatically, and any upstream breaking change requires manual fork maintenance. Before merging, open or link upstream PRs for the WithStrict API and the two bug fixes, and set a concrete milestone for removing the directive. Consider go mod vendor in the interim so the forked code is audited and pinned in-tree. See inline comment.


Smaller Observations

  • delegate tool's allowed_tools array + Strict: true: Anthropic strict mode has been observed to reject schemas with certain array+items constructs. Worth explicit end-to-end testing — the test plan mentions AllToolsAtOnce but doesn't call out the delegate tool specifically.
  • sanitizeToolName length truncation: After truncating to 64 chars, two distinct long names with identical 64-char prefixes will silently collide. A truncated collision should at minimum log a warning.
  • ListTools() uses sort.Slice (unstable): Since tool names are unique, this is deterministic — no issue in practice. Could be sort.Slice or sort.SliceStable, outcome is the same.
  • genkit-compatibility.md committed to root: Consider moving to docs/ or keeping only in the PR description; committed design docs in the root tend to drift.
  • handleWebhookAsync already used context.Background() (pre-existing, not introduced here).

What Looks Good

  • The atomic.Bool pair (warmupRequired / warmupReady) for the readiness gate is clean and race-free.
  • defineToolSafe converting panics to errors is the right defensive move for a process-global registry.
  • initGenkit with panic recovery keeps the constructor signature honest.
  • Deterministic ListTools sort is exactly the right fix for the cache-key stability requirement.
  • Test coverage for message conversion, name sanitization, clone isolation, and streaming is solid.
  • The boot-time context.Background() rationale (let in-flight warmup finish under its own timeout rather than tearing it mid-request) is well-reasoned and documented.

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 6, 2026

Claude finished @Anthony-Bible's task in 6m 8s —— View job


PR Review: feat/genkit-adapter-with-warmup

  • Gather context and understand the PR
  • Read domain/port changes (tool.go, ai_provider.go, conversation_service.go)
  • Review Genkit adapter (genkit_adapter.go)
  • Review warmup implementation (warmup.go)
  • Review webhook/readiness changes (http_adapter.go)
  • Review config/container changes (config.go, container.go)
  • Review tool executor changes
  • Review tests (genkit_adapter_test.go, genkit_adapter_e2e_test.go)
  • Post comprehensive review feedback with inline comments

Summary

Well-motivated PR with solid overall design. The architecture choices (Genkit as a side-by-side adapter, /ready gating, atomic warmup flag) are all sound. The domain model change (Strict bool on entity.Toolport.ToolParam) is clean and the field flows correctly through ConversationServiceListTools → wire.

There is one critical bug (data race in Clone), one significant correctness issue (silent name-collision overwrite), and several smaller concerns. Inline comments have been posted for all of them.


Critical

Data race: Clone() shares the map but not the mutex (genkit_adapter.go:262–274, genkit_adapter.go:363–365)

nameMapMu is a value-type field. Clone() copies the struct but produces a fresh zero-value mutex for the clone while sharing the underlying toolNameMap. Every SendMessage call (even for already-registered tools) writes unconditionally to that shared map. Two concurrent subagent clones will race without mutual exclusion. Go's race detector will flag this on the first concurrent subagent spawn. Fix: make nameMapMu a *sync.RWMutex shared across clones, or replace the pair with a sync.Map.

Fix this →


Significant

Silent name-collision overwrite in toolNameMap (genkit_adapter.go:362–365)

When two tools have different original names that sanitize to the same key (e.g., "weird.tool" and "weird_tool""weird_tool"), the second write silently overwrites the first entry. Subsequent reverse-lookups for the first tool return the second tool's name, causing wrong tool names in AI responses. Add a collision warning log before the write.

buildBasePromptWithAgents error is fatal to the AI call (genkit_adapter.go:695–707)

An error from subagentManager.ListAgents() propagates up and fails SendMessage. Should degrade gracefully — log and fall back to basePrompt — rather than failing the user's request over an optional enrichment step.

ToolParamsFromExecutor duplicates ConversationService conversion (warmup.go:86–104)

Both sites manually replicate entity.Tool → port.ToolParam. If ConversationService grows a new field (e.g., CacheControl), the warmup fingerprint silently diverges. Extract to a shared port.EntityToolToParam helper used by both paths.


Minor

  • TODO without an issue link (genkit_adapter.go:637–639): The shared-helper refactor is the right fix; track it in an issue so it doesn't stay that way indefinitely.
  • No unit tests for warmup.go: WarmCache (nil provider, zero tools, failing provider) and ToolParamsFromExecutor (nil executor, Strict propagation) are straightforward to test with mock ports. The Strict field propagation is the core correctness claim of this feature.
  • Warmup goroutine vs. k8s grace period (serve.go:225): context.Background() is correct for dev Ctrl+C (lets the request finish), but SIGKILL in k8s fires after terminationGracePeriodSeconds (typically 30s), well before WarmCacheTimeout (3 min). The goroutine is killed anyway. Document this or bump the k8s grace period if clean completion is required.
  • Fork replace directive (go.mod:61): Pinning to a personal fork with a commit hash is pragmatic. Add an upstream PR reference in a comment so the fork's lifecycle is trackable, and verify Dependabot covers it.

What's working well

  • atomic.Bool for warmupRequired / warmupReady is the right primitive — no lock needed for the common path.
  • defineToolSafe wrapping Genkit's panic-heavy DefineTool in a recover is a good defensive pattern.
  • sanitizeToolName is correct (all replacements produce ASCII-only output, so the byte-slice truncation at 64 is safe).
  • sort.Slice in ListTools for deterministic tool order is exactly right and well-placed.
  • The Strict field flows cleanly from entity.Toolport.ToolParam → wire, with no leakage into non-supporting providers (AnthropicAdapter just ignores it implicitly... worth confirming it serializes omitempty).
  • Test coverage for the unit-testable paths (sanitizeToolName, convertResponse, Clone, qualifiedModelName, message conversion) is comprehensive and parallel.

refs = append(refs, existing)
continue
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: TOCTOU race on tool name collision

There's a subtle race where two distinct tool names that sanitize to the same 64-char prefix (or the same already-invalid characters) will silently collide:

  1. Tool foo_bar_baz and foo_bar_bax both sanitize to identical name → second overrides toolNameMap[sanitized] to its original, but alreadyRegistered = true so it reuses the first tool's schema from the genkit registry.

Additionally, there's a TOCTOU window for the first registration: two concurrent goroutines can both see alreadyRegistered = false before either writes the map entry, then both call defineToolSafe. The second caller gets a panic-recovered error propagated to the user, rather than gracefully reusing the already-registered tool.

A safer pattern would be to check the genkit registry directly (before the map) and treat any successful LookupTool as authoritative, or to use a sync.Once per tool name:

// Instead of the current lock-check-unlock-lookuptool pattern:
a.nameMapMu.Lock()
_, alreadyRegistered := a.toolNameMap[sanitized]
a.toolNameMap[sanitized] = t.Name
a.nameMapMu.Unlock()

if alreadyRegistered {
    if existing := genkit.LookupTool(a.g, sanitized); existing != nil {
        refs = append(refs, existing)
        continue
    }
    // fall through to re-register if LookupTool returned nil
}

Consider also logging a warning when two different original names collapse to the same sanitized name, rather than silently using the wrong schema.

}

content := contentBuilder.String()
if content == "" && len(thinkingBlocks) > 0 {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spurious content fallback when AI returns only tool calls

When the model responds with only tool requests and no text (the common agentic loop case), content is empty and falls through to:

content = "[No content received from AI]"

This string ends up in the conversation history as an assistant message, which could confuse the model on subsequent turns or appear in debug logs as an error signal. The AnthropicAdapter presumably returns an empty string or a different placeholder in this case.

Consider allowing empty content when len(entityToolCalls) > 0, and only using the fallback when truly nothing was in the response:

if content == "" && len(thinkingBlocks) > 0 {
    content = "[AI internal reasoning completed]"
}
// Don't inject a fake content string when the response contains tool calls —
// that's a normal agentic turn with no text.
if content == "" && len(entityToolCalls) == 0 {
    content = "[No content received from AI]"
}

}
}

// toolResultOutput packages a tool-call string result into a JSON-friendly
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tool result wrapping may differ from AnthropicAdapter's wire format

Results are wrapped in {"result": "..."} / {"error": "..."}, but the AnthropicAdapter presumably sends the raw string content back as a tool result. If the model has been trained or prompted to expect raw strings, wrapping them in a JSON object map could cause unexpected parsing behavior or confuse the AI's tool-result interpretation.

It's worth verifying that the Genkit Anthropic plugin correctly unwraps this before sending to the API, or that the model handles the wrapped form the same way.

//
// TODO(refactor): extract to a shared helper consumed by both adapters.
// Duplicated for now to avoid touching anthropic_adapter.go in this PR.
func (a *GenkitAdapter) getSystemPrompt(ctx context.Context, opts port.AIRequestOptions) (string, error) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated system prompt logic

The TODO comment here is correct — getSystemPrompt, buildPlanModePrompt, and buildBasePromptWithAgents are all duplicated from AnthropicAdapter. Beyond the maintenance overhead of keeping these in sync, there's a behavioral risk: if a bug is fixed in one adapter's prompt logic, the other silently diverges.

Consider extracting a shared systemPromptBuilder helper in the ai package (not in domain — it's adapter-specific), so both adapters call the same code. This is low-risk refactor with high return.

Fix this →

Comment thread cmd/cli/cmd/serve.go Outdated
// in-flight warmup HTTP request finish cleanly under its own 3-minute
// timeout rather than tearing it midway.
webhookAdapter.SetWarmupRequired(true)
go runWarmup(context.Background(), container, webhookAdapter, container.Logger())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warmup runs unconditionally for all providers

runWarmup always fires regardless of which AGENT_AI_PROVIDER is selected. If AnthropicAdapter is in use (the default), or if a user is pointing ANTHROPIC_BASE_URL at a local stub that doesn't support all the tool schemas, the warmup call will still block /ready for up to 3 minutes on failure.

Also: context.Background() is intentionally used to survive Ctrl+C (well documented in the PR), but it means if the user kills the process quickly, the goroutine outlives the server and calls MarkWarm() on a now-orphaned struct. This is harmless (setting an atomic bool), but slightly surprising.

Consider gating the warmup on a config flag (e.g., AGENT_WARMUP_ENABLED=true defaulting to false), or only enabling it when AGENT_AI_PROVIDER=genkit. This would prevent unnecessary startup delays in development or testing.

// ToolExecutor and converts each entity.Tool to the port.ToolParam shape
// used on the wire. Mirrors the conversion in ConversationService so the
// schema fingerprint is identical to what real requests send.
func ToolParamsFromExecutor(exec port.ToolExecutor) ([]port.ToolParam, error) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated conversion logic

The comment says this mirrors ConversationService's conversion. This is a maintenance hazard: if ConversationService.prepareAIRequest adds new fields to the ToolParam conversion (e.g., a future InputSchemaVersion field), ToolParamsFromExecutor will silently produce a different schema fingerprint, breaking the warmup's cache-hit guarantee.

Since the PR's warmup correctness depends on the fingerprint being identical to what real requests send, this duplication is load-bearing. A shared conversion helper (or exposing the conversion through the service itself) would be safer.

Comment thread go.mod
toolchain go1.24.11
go 1.25.0

require (
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forked dependency via replace directive

The replace directive pins to github.com/Anthony-Bible/genkit (a personal fork) because upstream firebase/genkit doesn't expose WithStrict(bool) per-tool. While the PR explains the two bugs fixed in the fork, this creates a supply chain dependency that:

  1. Won't receive upstream security patches automatically
  2. May become stale if the fork isn't updated when genkit ships breaking changes
  3. Blocks upgrading other dependencies that transitively depend on genkit's stable API

The PR description mentions upstreaming these fixes. Before merging, it's worth opening or tracking upstream PRs, and setting a concrete timeline for removing the replace directive once they're accepted. If upstreaming is blocked, consider vendoring (go mod vendor) so the fork is audited and version-pinned in the repo itself.

// registerTools, which is idempotent and synchronized. Sharing keeps
// sanitized→original lookups consistent across clones.
toolNameMap: a.toolNameMap,
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Data race: shared map, independent mutexes

toolNameMap is shared by reference, but nameMapMu is a value field — the clone gets a fresh zero-value sync.RWMutex, not a copy of the original's. Both the original and every clone will call registerTools (which always writes to the shared map even on repeated calls — see line 364), and they'll be guarded by different mutexes. The Go race detector will flag this.

Fix: use a pointer mutex shared across clones, or switch to sync.Map:

type GenkitAdapter struct {
    // ...
    nameMapMu   *sync.RWMutex    // shared across clones
    toolNameMap map[string]string
}

and in Clone():

return &GenkitAdapter{
    // ...
    nameMapMu:   a.nameMapMu,   // share the same mutex
    toolNameMap: a.toolNameMap,
}

Or, simpler: replace nameMapMu + toolNameMap with a sync.Map — it's designed for exactly this "written once, read many" access pattern.

a.nameMapMu.Lock()
_, alreadyRegistered := a.toolNameMap[sanitized]
a.toolNameMap[sanitized] = t.Name
a.nameMapMu.Unlock()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always writes to toolNameMap, even when alreadyRegistered

Even on repeated calls (subsequent requests, clone calls), this unconditionally writes a.toolNameMap[sanitized] = t.Name. Combined with the mutex-sharing issue in Clone(), this is the write that causes the race.

There's also a silent name-collision bug here: if two tools have different original names but sanitize to the same key (e.g., "weird.tool" and "weird_tool" both become "weird_tool"), the second write silently overwrites the first mapping. The first tool's calls will then be mapped back to the second tool's name. Consider adding a collision check:

if existing, ok := a.toolNameMap[sanitized]; ok && existing != t.Name {
    a.logger.Warn("tool name collision after sanitization — two tools share the same sanitized name",
        "sanitized", sanitized, "first", existing, "second", t.Name)
}
a.toolNameMap[sanitized] = t.Name

if a.subagentManager == nil {
return basePrompt, nil
}
agents, err := a.subagentManager.ListAgents(ctx)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agent listing error fails the entire AI request

If ListAgents fails (transient network error, missing directory, etc.), the error propagates all the way up through getSystemPrompt → buildGenerateOptions → SendMessage, causing the user's AI call to fail with "failed to list agents." This is surprising behavior.

Consider degrading gracefully instead, matching the pattern used elsewhere:

agents, err := a.subagentManager.ListAgents(ctx)
if err != nil {
    a.logger.Warn("failed to list agents for system prompt; proceeding without agent list", "error", err)
    return basePrompt, nil
}

The AnthropicAdapter almost certainly has the same pattern — worth checking for consistency. (Related: the TODO at line 637 calls this out as needing a shared helper.)

// ToolExecutor and converts each entity.Tool to the port.ToolParam shape
// used on the wire. Mirrors the conversion in ConversationService so the
// schema fingerprint is identical to what real requests send.
func ToolParamsFromExecutor(exec port.ToolExecutor) ([]port.ToolParam, error) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated conversion logic — fragile invariant

ToolParamsFromExecutor manually replicates the entity.Tool → port.ToolParam conversion that ConversationService.PrepareConversationRequest does at line 310-318 of conversation_service.go. The comment says "Mirrors the conversion… so the schema fingerprint is identical", but there's nothing enforcing this at compile time.

If someone adds a new field to ToolParam (e.g., CacheControl) and wires it in ConversationService, the warmup fingerprint silently diverges — defeating the purpose of the warmup.

Consider extracting the conversion to a shared helper in the port package, e.g.:

// EntityToolToParam converts a domain tool to its wire representation.
// Used by both ConversationService and the warmup path to guarantee
// the same schema fingerprint on every request.
func EntityToolToParam(t entity.Tool) ToolParam {
    return ToolParam{
        Name:        t.Name,
        Description: t.Description,
        InputSchema: t.InputSchema,
        Strict:      t.Strict,
    }
}

Both sites can then call port.EntityToolToParam(t) and stay in sync automatically.

// getSystemPrompt mirrors AnthropicAdapter.getSystemPrompt's priority order.
//
// TODO(refactor): extract to a shared helper consumed by both adapters.
// Duplicated for now to avoid touching anthropic_adapter.go in this PR.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO left in production code

This is the right call for scope-bounding the PR, but leaving a TODO in production code creates drift risk — the duplication isn't obvious from the diff. Recommend opening a follow-up issue and linking it from the comment so it gets tracked:

// TODO(#NNN): extract to a shared helper consumed by both adapters.
// Duplicated for now to avoid touching anthropic_adapter.go in this PR.

Otherwise "avoid touching X in this PR" tends to stay that way indefinitely.

// Returns nil on success, or a wrapping error if the provider call fails.
// The caller is expected to log+continue rather than fail startup: a cold
// cache is a latency penalty, not a correctness problem.
func WarmCache(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No unit tests for WarmCache or ToolParamsFromExecutor

warmup.go is the only new file without test coverage. Both functions are straightforward to test with a mock AIProvider and ToolExecutor:

  • WarmCache with nil provider → error
  • WarmCache with zero tools → skips call
  • WarmCache with a failing provider → wraps error
  • ToolParamsFromExecutor with nil executor → error
  • ToolParamsFromExecutor with a real executor → converts Strict field correctly

The Strict field propagation is especially worth a test since it's the core correctness property of this feature — a regression there would be silent.

Comment thread cmd/cli/cmd/serve.go Outdated
// in-flight warmup HTTP request finish cleanly under its own 3-minute
// timeout rather than tearing it midway.
webhookAdapter.SetWarmupRequired(true)
go runWarmup(context.Background(), container, webhookAdapter, container.Logger())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warmup goroutine lifetime vs. k8s termination grace period

The warmup goroutine uses context.Background() intentionally (documented in the PR), so a Ctrl+C during boot doesn't abort the in-flight Anthropic request mid-stream. That's a good design for dev/manual use.

In k8s, the sequence is: SIGTERM → terminationGracePeriodSeconds (typically 30s) → SIGKILL. If warmup is still running when SIGKILL fires (which it will be if the grammar compilation takes 30–100s), the goroutine is killed mid-request anyway. The practical effect is the same as if you'd used the cancellable context — but the rationale ("let the request finish cleanly") doesn't hold under SIGKILL.

Worth documenting this nuance in the comment, and potentially setting terminationGracePeriodSeconds in the k8s manifests to something > WarmCacheTimeout (3 minutes) if clean warmup completion is actually desired during rolling deploys.

Comment thread go.mod Outdated
golang.org/x/text v0.35.0 // indirect
)

replace github.com/firebase/genkit/go => github.com/Anthony-Bible/genkit/go v1.7.1-0.20260506115940-43381ac9eb96
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fork via replace directive — build supply-chain risk

Pinning to github.com/Anthony-Bible/genkit via a replace directive is a pragmatic workaround while upstream review is pending, but it creates ongoing risk:

  1. Build stability: if the fork branch is force-pushed or the repo is renamed/deleted, go mod download will break for all contributors.
  2. Security: dependencies of a personal fork aren't scanned by Dependabot by default.
  3. Auditability: reviewers/CI can't easily verify the fork's changes against upstream.

Recommended mitigations:

  • Pin to an immutable commit SHA (already done with the long hash) ✓
  • Document the upstream PR number in go.mod or a comment so the fork lifecycle is trackable
  • Set a calendar reminder / issue to re-evaluate once upstream merges
// replace pending upstream PR: https://github.com/firebase/genkit/pull/NNNN
replace github.com/firebase/genkit/go => github.com/Anthony-Bible/genkit/go v1.7.1-0...

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 6, 2026

Claude finished @Anthony-Bible's task in 4m 30s —— View job


PR Review: feat/genkit-adapter-with-warmup

  • Gather context and understand the PR
  • Read changed files — genkit_adapter.go, warmup.go, system_prompt.go, serve.go, config changes
  • Review domain/port changes
  • Review Genkit adapter implementation
  • Review warmup implementation
  • Review webhook/readiness changes
  • Review config/container changes
  • Review tool executor changes
  • Post comprehensive review feedback

What Previous Reviews Got Fixed ✅

Two significant items from prior reviews are now resolved:

  • Shared system promptsystem_prompt.go is a new shared helper; both adapters call resolveSystemPrompt, eliminating the drift risk from duplicated prompt logic.
  • Shared tool-param conversionport.ToolParamFromEntity() is now the single source of truth, used by both conversation_service.go and warmup.go. The warmup fingerprint and the live-request fingerprint will stay identical as new fields are added.

Issues That Remain

Bug — Clone() shares the map without sharing the mutex (genkit_adapter.go:278–291)

nameMapMu is a value-type sync.RWMutex. Clone() does not copy it, so the clone gets a fresh zero-value mutex while sharing the same toolNameMap by reference. The struct comment says the map is "populated exactly once during construction … only ever read after init", but registration is lazyregisterTools is called on every SendMessage, not in the constructor. If a clone's first SendMessage runs concurrently with the original's first SendMessage, both hold different mutexes over the same map. Go's race detector will flag this.

In practice the race window is narrow (it only opens before the original's first call completes), but it's an unsynchronized write to a shared map — undefined behavior under the memory model.

Fix: change nameMapMu to a pointer (*sync.RWMutex) and share it across clones, or replace the pair with sync.Map:

// in Clone():
nameMapMu:   a.nameMapMu,   // share the pointer, not a new zero value
toolNameMap: a.toolNameMap,

Fix this →

Bug — [No content received from AI] injected on normal tool-use turns (genkit_adapter.go:607–609)

When the model responds with only tool calls (the common agentic loop), contentBuilder is empty and the code falls through to:

if content == "" {
    content = "[No content received from AI]"
}

This string goes into conversation history as the assistant's text, which could confuse the model on the next turn. AnthropicAdapter handles this more gracefully — it first tries content = string(response.StopReason), so tool-only turns get "tool_use" as content rather than a fake error string.

The safest fix is to allow empty content when tool calls are present:

if content == "" && len(entityToolCalls) == 0 {
    content = "[No content received from AI]"
}

Fix this →

Bug — ListAgents error fails the entire AI call (system_prompt.go:87–89)

buildBasePromptWithAgents propagates a ListAgents error upward:

agents, err := subagentManager.ListAgents(ctx)
if err != nil {
    return "", fmt.Errorf("failed to list agents: %w", err)
}

This propagates through resolveSystemPrompt → getSystemPrompt → buildGenerateOptions → SendMessage, causing the user's request to fail because of an optional enrichment step (listing subagents for the system prompt). A transient filesystem error or missing agents/ directory will abort the conversation.

Should degrade gracefully — the agent list is a nice-to-have, not required for correctness:

if err != nil {
    // best-effort: proceed without agent list rather than failing the request
    return basePrompt, nil
}

Fix this →

Design — Warmup fires unconditionally for all providers (serve.go:224–225)

webhookAdapter.SetWarmupRequired(true)
go runWarmup(context.Background(), container, webhookAdapter, container.Logger())

This runs regardless of AGENT_AI_PROVIDER. With the default anthropic provider and the default model (hf:zai-org/GLM-4.7), the warmup issues a real Anthropic API call at every serve startup that previously didn't exist. That's an unintentional behavior change for existing users — they now pay a startup API call and /ready blocks until it completes (or times out after 3 minutes on failure).

Recommend gating on a config flag or on AGENT_AI_PROVIDER=genkit:

if cfg.AIProvider == config.ProviderGenkit {
    webhookAdapter.SetWarmupRequired(true)
    go runWarmup(context.Background(), container, webhookAdapter, container.Logger())
}

Fix this →


Smaller Observations

No tests for warmup.goWarmCache and ToolParamsFromExecutor have no unit tests. These are the core correctness paths for the warmup feature; a regression in Strict field propagation would be silent. The functions are straightforward to test with mock ports:

  • WarmCache(ctx, nil, tools, log) → error
  • WarmCache(ctx, provider, nil, log) → skip (logs "no tools")
  • WarmCache(ctx, failingProvider, tools, log) → wraps error, still returns (caller logs and continues)
  • ToolParamsFromExecutor(nil) → error
  • ToolParamsFromExecutor(executor)Strict bit propagated correctly

Tool result wrapping (genkit_adapter.go:543–548) — Results are sent as {"result":"..."} / {"error":"..."} rather than raw strings. The test at line 251 verifies this shape, so it's intentional. Worth confirming the Genkit Anthropic plugin unwraps this correctly before forwarding to the API, since the model may have learned to expect flat strings.

replace directive without upstream PR link (go.mod:61) — The fork is pinned to a commit SHA (good for reproducibility), but there's no comment referencing upstream PRs for the WithStrict API or the two bug fixes. Add a comment so the fork's lifecycle is trackable:

// replace pending upstream PRs: firebase/genkit#NNNN (WithStrict), firebase/genkit#MMMM (strict:false serialization)
replace github.com/firebase/genkit/go => github.com/Anthony-Bible/genkit/go ...

sanitizeToolName collision after truncation (genkit_adapter.go:464–467) — Two tools with different names but identical 64-char prefixes will silently map to the same sanitized key. The code at line 384 logs a warn when existingOrig != t.Name, so the collision won't be fully silent — but it's worth a test case.


What's Working Well

  • The atomic.Bool pair for warmupRequired/warmupReady is clean and race-free.
  • defineToolSafe converting Genkit's panics to errors is the right defensive pattern for a process-global registry.
  • SetWarmupRequired(false) as the default preserves legacy behavior — existing callers that don't warm up aren't broken.
  • The plugin abstraction (genkitPluginAdapter) is well-designed: adding a new provider is a single new file + init() registration.
  • Deterministic ListTools sort is in the right place and correctly explained.
  • ToolParamFromEntity in the port layer is a clean fix — it's exactly where the single source of truth should live.
  • Test coverage for sanitizeToolName, convertResponse, Clone isolation, and message conversion is solid and parallel.

Anthony-Bible added a commit that referenced this pull request May 6, 2026
…up gating

- GenkitAdapter.nameMapMu is now a *sync.RWMutex shared with clones, so
  concurrent first registration of tools across original+clones serializes
  on a single mutex instead of per-clone zero-value mutexes racing on the
  shared toolNameMap.
- convertResponse mirrors AnthropicAdapter: when the model returns only
  tool calls (the common agentic-loop case) fall back to FinishReason
  rather than injecting "[No content received from AI]" into history.
- serve gates SetWarmupRequired/runWarmup on AIProvider == genkit so
  AnthropicAdapter callers don't pay an unintentional startup API call
  and /ready delay.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 6, 2026

Claude finished @Anthony-Bible's task in 2m 55s —— View job


PR Review: feat/genkit-adapter-with-warmup (round 4)

  • Read current state of changed files
  • Check which previously identified bugs were addressed (commit e565e34)
  • Review remaining open issues
  • Post comprehensive updated review

What Was Fixed in This Round ✅

Commit e565e34 addressed all three critical/significant bugs from the previous review:

Clone() data race — fixed. nameMapMu is now *sync.RWMutex (pointer), and Clone() explicitly shares both the mutex and the map by reference. The struct comment explains the design clearly. This is the correct fix.

[No content received from AI] on tool-only turns — fixed. The fallback chain now mirrors AnthropicAdapter exactly:

// genkit_adapter.go ≈ anthropic_adapter.go — identical priority order
if content == "" && resp.FinishReason != "" { content = string(resp.FinishReason) }  // "tool_use"
if content == "" && len(thinkingBlocks) > 0 { content = "[AI internal reasoning completed]" }
if content == "" { content = "[No content received from AI]" }

Tool-only agentic turns now get "tool_use" as history content, not a fake error string.

Warmup fires unconditionally — fixed. serve.go:229 now gates the entire warmup block on cfg.AIProvider == config.ProviderGenkit. Existing AnthropicAdapter users see no change in startup behavior.


One Remaining Bug

ListAgents error fails the entire AI call (system_prompt.go:86–89)

agents, err := subagentManager.ListAgents(ctx)
if err != nil {
    return "", fmt.Errorf("failed to list agents: %w", err)
}

This error propagates through buildBasePromptWithAgents → resolveSystemPrompt → getSystemPrompt → buildGenerateOptions → SendMessage, causing the user's AI request to fail due to an optional enrichment step. A transient agents/ directory read error or missing directory will abort the conversation entirely.

AnthropicAdapter has the same bug (it calls the same shared resolveSystemPrompt), so this is a pre-existing issue surfaced by extracting the shared helper. The fix is straightforward — degrade gracefully instead of failing:

agents, err := subagentManager.ListAgents(ctx)
if err != nil {
    // best-effort: agent list is optional enrichment, not required for correctness
    return basePrompt, nil
}

Fix this →


Smaller Remaining Observations

No tests for warmup.go — Three previous review rounds have flagged this. WarmCache and ToolParamsFromExecutor are still untested. The Strict bit propagation through ToolParamFromEntity is the core correctness claim of this feature. Simple table-driven tests with a mock AIProvider and ToolExecutor would cover the most important paths:

  • WarmCache(ctx, nil, tools, log) → error
  • WarmCache(ctx, provider, nil, log) → skip gracefully
  • ToolParamsFromExecutor(nil) → error
  • ToolParamsFromExecutor(executor)Strict field propagated correctly

replace directive has no upstream PR link (go.mod:61) — Pinned to a commit SHA which is good for reproducibility, but there's nothing referencing upstream PRs for the WithStrict API and the two bug fixes. A comment here makes the fork lifecycle trackable:

// replace pending upstream PRs: firebase/genkit#NNNN (WithStrict), firebase/genkit#MMMM (strict:false serialization)
replace github.com/firebase/genkit/go => github.com/Anthony-Bible/genkit/go v1.7.1-0...

Silent tool name collision (genkit_adapter.go:389) — The sanitization-occurred warning is already logged. But when two tools with different original names both sanitize to the same key (e.g., weird.tool and weird_toolweird_tool), the code updates toolNameMap with the second tool's name while Genkit's registry holds the first tool's schema. The mismatch is silent. A one-liner guard would surface it:

if alreadyRegistered && existingOrig != t.Name {
    a.logger.Warn("tool name collision after sanitization — two tools share the same sanitized name; first tool's schema will be used",
        "sanitized", sanitized, "first_original", existingOrig, "second_original", t.Name)
}

What's in Good Shape

  • The nameMapMu *sync.RWMutex + struct comment is a clean, well-documented fix.
  • ToolParamFromEntity in the port layer is exactly the right place for that conversion.
  • system_prompt.go shared helper ensures both adapters produce identical prompts.
  • The if cfg.AIProvider == config.ProviderGenkit gate preserves backward compatibility for existing users.
  • defineToolSafe recovering Genkit panics is the right defensive pattern.
  • sort.Slice in ListTools for deterministic tool order is correctly placed.
  • Test coverage for sanitizeToolName, convertResponse, Clone isolation, message conversion, and streaming is comprehensive.

Summary: Three of four previously flagged issues are now resolved. The one remaining bug — ListAgents errors aborting SendMessage — is a one-line fix in system_prompt.go that also benefits AnthropicAdapter. The warmup tests and go.mod comment are low-effort improvements worth doing before merge.

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 6, 2026

Claude finished @Anthony-Bible's task in 2m 21s —— View job


PR Review: feat/genkit-adapter-with-warmup (round 5)

  • Gather context and understand the PR
  • Read latest commits and changed files
  • Check fix for ListAgents error propagation (system_prompt.go)
  • Review any new changes since round 4
  • Check warmup tests status
  • Post comprehensive updated review

What Was Fixed in This Round ✅

Commit a8da18c resolved the last critical bug from round 4:

ListAgents error fails entire AI call — fixed. buildBasePromptWithAgents now returns basePrompt, nil on error instead of propagating the failure. The //nolint:nilerr comment correctly documents the intentional behavior. A transient agents/ read error or missing directory no longer aborts the user's AI request.


Complete Bug Scorecard

All four previously-open bugs are resolved:

Issue Status
Clone() data race (nameMapMu is value, not pointer) ✅ Fixed (e565e34)
[No content received from AI] on normal tool-use turns ✅ Fixed (e565e34)
Warmup fires unconditionally for all providers ✅ Fixed (e565e34)
ListAgents error aborts SendMessage ✅ Fixed (a8da18c)

Remaining Low-Risk Items

These have been flagged across multiple review rounds. None are blockers for correct behavior, but two have merge implications.

No tests for warmup.go (flagged all 5 rounds)

WarmCache and ToolParamsFromExecutor still have zero test coverage. ToolParamsFromExecutor is the code path that propagates the Strict field — the core correctness claim of this PR — and there's nothing catching a regression if port.ToolParamFromEntity changes. The functions are simple to test with mock ports:

func TestWarmCache_NilProvider(t *testing.T) {
    err := WarmCache(context.Background(), nil, someTools, port.NopLogger{})
    require.ErrorContains(t, err, "AI provider is nil")
}

func TestWarmCache_NoTools(t *testing.T) {
    err := WarmCache(context.Background(), &mockProvider{}, nil, port.NopLogger{})
    require.NoError(t, err) // skips gracefully
}

func TestToolParamsFromExecutor_StrictPropagated(t *testing.T) {
    exec := &mockExecutor{tools: []entity.Tool{{Name: "bash", Strict: true}}}
    params, err := ToolParamsFromExecutor(exec)
    require.NoError(t, err)
    require.True(t, params[0].Strict)
}

Fix this →

Silent tool name collision after truncation (genkit_adapter.go:389)

When two tools with different original names both sanitize to the same key (e.g., weird.tool and weird_tool both become weird_tool), the code at line 389 updates toolNameMap with the second tool's name while Genkit's registry holds the first tool's schema. On the AI's next call, the domain layer resolves the sanitized name to the second tool's original name, but the API validated inputs against the first tool's schema. This is silent.

The existing warn at line 376 fires for sanitization occurring but doesn't catch same-key collisions between a sanitized name and a native one. A one-liner guard in the alreadyRegistered && existingOrig != t.Name path would surface it:

if alreadyRegistered && existingOrig != t.Name {
    a.logger.Warn("tool name collision after sanitization — two tools share the same sanitized key; first tool's schema will be used for both",
        "sanitized", sanitized, "first_original", existingOrig, "second_original", t.Name)
}

Fix this →

replace directive without upstream PR link (go.mod:61)

replace github.com/firebase/genkit/go => github.com/Anthony-Bible/genkit/go v1.7.1-0.20260506115940-43381ac9eb96

The commit SHA pin is good for reproducibility. Still missing a comment referencing the upstream PRs for WithStrict and the two bug fixes, which is the only way to track when the replace can be dropped:

// replace pending upstream PRs: firebase/genkit#NNNN (WithStrict), firebase/genkit#MMMM (strict:false serialization)
replace github.com/firebase/genkit/go => github.com/Anthony-Bible/genkit/go v1.7.1-0...

What's in Good Shape

  • buildBasePromptWithAgents graceful degradation is clean — the //nolint:nilerr annotation with inline explanation is the right pattern for intentional error suppression.
  • nameMapMu *sync.RWMutex shared across clones with a clear struct comment is a solid fix.
  • ToolParamFromEntity in the port layer is exactly the right home for that conversion.
  • string(resp.FinishReason) as the tool-only turn fallback mirrors AnthropicAdapter exactly.
  • The if cfg.AIProvider == config.ProviderGenkit gate in serve.go preserves backward compatibility.
  • system_prompt.go as a shared helper guarantees both adapters produce identical prompt text.
  • The atomic.Bool pair for warmupRequired/warmupReady is clean and race-free.
  • Test coverage for sanitizeToolName, convertResponse, Clone isolation, and message conversion is comprehensive.

Summary: Five review rounds, four bugs fixed. The PR is in good shape and the remaining items are all low-risk. The warmup tests are the highest-value remaining addition — they're the only missing coverage for the feature's core correctness claim.

…ction

Adds a parallel AIProvider backed by Firebase Genkit Go's Anthropic
plugin, selectable via AGENT_AI_PROVIDER (default: anthropic). Per-tool
strict-schema validation lets critical tools (bash, edit_file, task,
delegate) opt in while others leave strict off to avoid rejected
JSON-schema keywords.

Extracts a genkitPluginAdapter interface + process-global registry so
non-Anthropic Genkit plugins can be added by dropping in a file with
RegisterGenkitPlugin. Adds AGENT_GENKIT_PLUGIN config (default:
"anthropic").

Primes the AI grammar cache at startup with an async warmup goroutine
so /ready returns 503 until the cache is warm — k8s rolling deploys
won't route traffic to a cold pod. Warmup failures still flip the gate
(cold cache = latency penalty, not a correctness problem).

Fixes Clone race on shared toolNameMap (sync.RWMutex), tool-only
response fallback, and graceful degradation when ListAgents fails.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@Anthony-Bible Anthony-Bible force-pushed the feat/genkit-adapter-with-warmup branch from a8da18c to 003481c Compare May 6, 2026 14:22
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 6, 2026

Claude finished @Anthony-Bible's task in 1m 57s —— View job


PR Review: feat/genkit-adapter-with-warmup (round 6)

  • Gather context — check new commits since round 5
  • Review current state of warmup.go (tests added?)
  • Review genkit_adapter.go (tool collision warning added?)
  • Review go.mod (upstream PR link added?)
  • Check for any new issues introduced
  • Post comprehensive updated review

Summary

This commit is a squash of the entire branch. All four bugs fixed across rounds 2–5 are present in this squashed state. No new issues were introduced. Three low-risk items flagged across every previous round are still open — two are trivial one-liners, one (tests) requires ~30 lines of new code.


Bug Scorecard — All Fixed ✅

Issue Status
Clone() data race — nameMapMu value vs. pointer ✅ Fixed (nameMapMu *sync.RWMutex, shared in Clone())
[No content received from AI] on normal tool-use turns ✅ Fixed (uses resp.FinishReason"tool_use" as fallback)
Warmup fires unconditionally for all providers ✅ Fixed (gated on cfg.AIProvider == config.ProviderGenkit)
ListAgents error aborts SendMessage ✅ Fixed (//nolint:nilerr intentional graceful degradation)

Three Remaining Items (all low-risk, flagged every round)

1. No tests for warmup.go (flagged 6 rounds in a row)

WarmCache and ToolParamsFromExecutor still have zero test coverage. ToolParamsFromExecutor calls port.ToolParamFromEntity — the code path that propagates the Strict field. That is the core correctness claim of this PR. A regression there (e.g., a future contributor zeroing Strict in ToolParamFromEntity) would be silent and wouldn't break a single test.

The functions are straightforward to test with mock ports (no Genkit initialization required):

func TestWarmCache_NilProvider(t *testing.T) {
    err := WarmCache(context.Background(), nil, someTools, port.NopLogger{})
    require.ErrorContains(t, err, "AI provider is nil")
}

func TestWarmCache_NoTools(t *testing.T) {
    err := WarmCache(context.Background(), &mockProvider{}, nil, port.NopLogger{})
    require.NoError(t, err) // skips gracefully
}

func TestToolParamsFromExecutor_StrictPropagated(t *testing.T) {
    exec := &mockExecutor{tools: []entity.Tool{{Name: "bash", Strict: true}}}
    params, err := ToolParamsFromExecutor(exec)
    require.NoError(t, err)
    require.True(t, params[0].Strict)
}

Fix this →

2. Silent tool name collision after sanitization (genkit_adapter.go:389–393)

When two tools with different original names both sanitize to the same key (e.g., "weird.tool" and "weird_tool" → both become "weird_tool"), the second tool's name silently overwrites the first in toolNameMap while Genkit's registry keeps the first tool's schema. The AI then calls the second tool by name, but API validates inputs against the first tool's schema — a mismatch that produces wrong results with no visible error.

The alreadyRegistered && existingOrig != t.Name condition already detects this; it just doesn't log:

if !alreadyRegistered || existingOrig != t.Name {
    if alreadyRegistered && existingOrig != t.Name {
        a.logger.Warn("tool name collision after sanitization — two tools share the same sanitized key; first tool's schema will be used for both",
            "sanitized", sanitized, "first_original", existingOrig, "second_original", t.Name)
    }
    a.nameMapMu.Lock()
    a.toolNameMap[sanitized] = t.Name
    a.nameMapMu.Unlock()
}

Fix this →

3. replace directive has no upstream PR link (go.mod:61)

replace github.com/firebase/genkit/go => github.com/Anthony-Bible/genkit/go v1.7.1-0.20260506115940-43381ac9eb96

No reference to the upstream PRs for WithStrict and the two bug fixes. Without a link, there is no way to track when the fork can be dropped and no signal for code reviewers auditing the dependency tree. A one-line comment is enough:

// replace pending upstream PRs: firebase/genkit#NNNN (WithStrict per-tool), firebase/genkit#MMMM (strict:false serialization fix)
replace github.com/firebase/genkit/go => github.com/Anthony-Bible/genkit/go v1.7.1-0...

Fix this →


What's in Good Shape

  • nameMapMu *sync.RWMutex shared across clones with a clear struct comment is correct and well-documented.
  • convertResponse content fallback mirrors AnthropicAdapter exactly — tool-only turns get "tool_use" not a fake error string.
  • buildBasePromptWithAgents graceful degradation with //nolint:nilerr is the right pattern for intentional error suppression.
  • cfg.AIProvider == config.ProviderGenkit gate in serve.go preserves backward compatibility for all existing users.
  • port.ToolParamFromEntity in the port layer is the right single source of truth for domain-to-wire conversion.
  • system_prompt.go shared helper guarantees both adapters produce identical prompt text.
  • atomic.Bool pair for warmupRequired/warmupReady is race-free.
  • defineToolSafe recovering Genkit panics keeps the constructor signature honest and prevents process crashes.
  • Deterministic sort.Slice in ListTools is correctly placed for cache-key stability.
  • Test coverage for sanitizeToolName, convertResponse, Clone isolation, message conversion, and streaming is comprehensive and parallel.

Summary: All critical bugs from previous rounds are fixed in the squashed commit. The PR is functionally correct. The three remaining items are: missing warmup tests (highest value — covers the Strict propagation correctness claim), a missing warn log for silent tool name collision, and a missing upstream PR link comment in go.mod. None block correct operation, but the tests in particular are worth adding before merge.

Renames the per-tool strict opt-in to WithStrictSchema to match the
updated genkit fork API.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 12, 2026

Claude finished @Anthony-Bible's task in 1m 36s —— View job


PR Review: feat/genkit-adapter-with-warmup (round 7)

  • Gather context and understand the PR
  • Check current state of files vs. round 6 findings
  • Review warmup.go and warmup_test.go (tests flagged 6 rounds)
  • Review genkit_adapter.go (tool name collision warning)
  • Review go.mod (upstream PR link)
  • Check for any new issues introduced
  • Post comprehensive updated review

What Changed in This Round

Commit b4b935d makes a single targeted fix: renames WithStrict(bool)WithStrictSchema(bool) in genkit_adapter.go:449 and bumps the fork's pinned commit SHA in go.mod to match the updated upstream API. No logic changed.


New Minor Issue (introduced by b4b935d)

Stale comment at genkit_adapter.go:360

The comment above registerTools still references the old name:

// using ai.WithStrict(false) (provided by the fork) so opt-out of strict

Should read ai.WithStrictSchema(false) to match the actual call at line 449. Low severity — just a comment — but it will mislead anyone cross-referencing the fork's API.

Fix this →


Three Items Still Open (carried from round 6)

1. No tests for warmup.go (flagged every round — 7 now)

warmup.go still has zero test coverage. ToolParamsFromExecutor is the direct caller of port.ToolParamFromEntity — the code path that propagates the Strict field. That is the core correctness claim of this PR, and a regression there (e.g., someone zeroing Strict in ToolParamFromEntity) would be completely silent. The functions are straightforward to test with mock ports (no Genkit initialization required):

func TestWarmCache_NilProvider(t *testing.T) {
    err := WarmCache(context.Background(), nil, someTools, port.NopLogger{})
    require.ErrorContains(t, err, "AI provider is nil")
}

func TestWarmCache_NoTools(t *testing.T) {
    err := WarmCache(context.Background(), &mockProvider{}, nil, port.NopLogger{})
    require.NoError(t, err) // skips gracefully
}

func TestToolParamsFromExecutor_StrictPropagated(t *testing.T) {
    exec := &mockExecutor{tools: []entity.Tool{{Name: "bash", Strict: true}}}
    params, err := ToolParamsFromExecutor(exec)
    require.NoError(t, err)
    require.True(t, params[0].Strict)
}

Fix this →

2. Silent tool name collision after sanitization (genkit_adapter.go:389–393)

When two tools with different original names both sanitize to the same key (e.g., "weird.tool" and "weird_tool" both become "weird_tool"), the second tool's name silently overwrites toolNameMap while Genkit's registry keeps the first tool's schema. The condition at line 389 already detects this — it just doesn't log:

if !alreadyRegistered || existingOrig != t.Name {
    if alreadyRegistered && existingOrig != t.Name {
        a.logger.Warn("tool name collision after sanitization — two tools share the same sanitized key; first tool's schema will be used for both",
            "sanitized", sanitized, "first_original", existingOrig, "second_original", t.Name)
    }
    a.nameMapMu.Lock()
    a.toolNameMap[sanitized] = t.Name
    a.nameMapMu.Unlock()
}

Fix this →

3. replace directive has no upstream PR link (go.mod:61)

The SHA was bumped in this commit but there's still no comment tracking the upstream PRs:

replace github.com/firebase/genkit/go => github.com/Anthony-Bible/genkit/go v1.7.1-0.20260511152837-c64d115736af

Without a reference to the upstream firebase/genkit PRs for WithStrictSchema and the bug fixes, there's no way for future reviewers to know when the replace can be dropped:

// replace pending upstream PRs: firebase/genkit#NNNN (WithStrictSchema per-tool), firebase/genkit#MMMM (strict:false serialization fix)
replace github.com/firebase/genkit/go => github.com/Anthony-Bible/genkit/go v1.7.1-0...

Fix this →


What's in Good Shape

Everything from round 6 remains solid. The WithStrictSchema rename is correct — the call site at genkit_adapter.go:449 matches the updated fork API, and go.sum was updated consistently. No regressions introduced.


Summary: Seven review rounds, four critical bugs all fixed. The PR is functionally correct. The three remaining items are cosmetic/operational (stale comment, missing warn log, missing upstream PR link). The warmup tests are the highest-value remaining addition before merge — they provide the only regression coverage for the Strict field propagation, which is the core correctness claim of this feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant