
feat(js/plugins/compat-oai): add OpenAI Responses API support #5237

Draft
DenisovAV wants to merge 4 commits into genkit-ai:main from DenisovAV:denisovav/feat-compat-oai-responses-api

Conversation

@DenisovAV

Summary

Adds opt-in OpenAI Responses API (/v1/responses) support to @genkit-ai/compat-oai via a sibling openAIResponses() plugin and an openAI.responsesModel(...) helper. Closes the gap discussed in #5236 (and previous #3640 / #4687 / #3574).

import openAI, { openAIResponses } from '@genkit-ai/compat-oai/openai';

const ai = genkit({ plugins: [openAI(), openAIResponses()] });

await ai.generate({
  model: openAI.responsesModel('gpt-5-mini'),
  prompt: 'Latest news about X?',
  config: {
    builtInTools: [{ type: 'web_search_preview' }],
    reasoning: { effort: 'medium' },
  },
});

This is filed as draft for design feedback before final review — see #5236 for the open question on sibling plugin vs prefix dispatch.

What's covered

  • Non-streaming via client.responses.create
  • Streaming via client.responses.stream with a per-output_index SSE event aggregator (text deltas, reasoning summary deltas, function-call argument aggregation, built-in tool lifecycle progress, citation buffering)
  • Built-in tools: web_search_preview, file_search, code_interpreter
  • Reasoning models (o1/o3/o4-mini/gpt-5*): leading text-only system messages auto-lifted to instructions; non-text system content stays in input
  • Stateful chaining: config.previousResponseId → previous_response_id; the current turn's id surfaces on response.custom.responseId
  • Citations from built-in tools surface as metadata.citations on text Parts. Discriminated Citation type — url_citation | file_citation — forward-compatible if Genkit ever adds a first-class citation Part.
  • Privacy default: store: false (opposite of OpenAI's default true); documented in README and overridable per-call.
  • Errors: APIError mapped to GenkitError (status mirroring), non-API errors wrapped as INTERNAL (or CANCELLED on abort), stream-level error events captured and re-thrown rather than silently truncating.
  • Malformed JSON in function-call arguments: surfaced as raw string with metadata.malformedArguments: true, plus logger.warn (no silent corruption).
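As a usage sketch of the citation surface described above (the types here are assumptions based on this description, not the plugin's actual exports), a caller could gather every citation attached to a response's text Parts like this:

```ts
// Hypothetical shapes mirroring the discriminated Citation type described
// in the PR; the plugin's real exported types may differ.
type Citation =
  | { type: 'url_citation'; url: string; title?: string }
  | { type: 'file_citation'; fileId: string; quote?: string };

interface TextPart {
  text: string;
  metadata?: { citations?: Citation[] };
}

// Collect every citation buffered onto the response's text parts.
function collectCitations(parts: TextPart[]): Citation[] {
  return parts.flatMap((part) => part.metadata?.citations ?? []);
}
```

Because citations live in part metadata rather than a new Part kind, this pattern keeps working even if Genkit later adds a first-class citation Part.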

Isolation

  • src/model.ts is not modified.
  • Other compat providers (src/deepseek/, src/xai/) are not modified.
  • New regression test (tests/compat_oai_isolation_test.ts) verifies that constructing the deepseek/xai plugins does not load any responses/* files.
  • All 95 existing tests in this package continue to pass unchanged.

Tests

  • 138/138 unit tests passing (95 baseline + 28 non-streaming Responses + 10 streaming + 5 isolation/edge cases; see the test files for the exact breakdown)
  • pnpm check clean (tsc --noEmit)
  • pnpm build clean (tsup DTS + JS)
  • Live-API smoke against gpt-5-mini (scripts/smoke_responses.ts, gated on OPENAI_API_KEY): plain text, web-search citations, streaming, previousResponseId chaining — all green

Files

  • New (~1.5 KLOC):
    • src/openai/responses/{types,request,response,runner,stream,index}.ts
    • tests/openai_responses_test.ts
    • tests/openai_responses_stream_test.ts
    • tests/compat_oai_isolation_test.ts
    • scripts/smoke_responses.ts
  • Modified (~30 LOC, additive only):
    • src/openai/index.ts — openAIResponses() factory, openAI.responsesModel(...) helper, type overloads
    • README.md — new "Using the OpenAI Responses API" section
  • No SDK version bump required — openai 4.x exposes client.responses.* already.

Open questions (from #5236)

  1. Sibling plugin vs prefix dispatch. Current PR is a sibling plugin (mirrors xai/deepseek). Happy to refactor to a single-plugin prefix dispatch in openAI()'s resolver if preferred — moves namespace registration boilerplate but keeps user surface (openAI.responsesModel(...)) identical.
  2. metadata.citations vs first-class Part type. metadata.citations keeps Genkit core untouched and is forward-compatible. Open to filing a separate core RFC for first-class citations as a follow-up.
  3. store: false default. Opposite of OpenAI's default. Privacy-by-default for plugin users; documented. Happy to flip if maintainers prefer parity with OpenAI.
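To make question 3 concrete, here is a minimal sketch of the described default (the resolver function and its name are illustrative, not the PR's actual code):

```ts
// The plugin's privacy default per the PR: `store` is false unless the
// caller opts in per call. OpenAI's own API defaults `store` to true.
function resolveStore(config: { store?: boolean }): boolean {
  return config.store ?? false;
}
```

Under this default, requests built for openAI.responsesModel(...) would send store: false unless a caller passes config: { store: true } explicitly.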

Test plan

  • pnpm test --filter @genkit-ai/compat-oai
  • pnpm check --filter @genkit-ai/compat-oai
  • pnpm build --filter @genkit-ai/compat-oai
  • Live-API smoke (plain text, web_search citations, streaming, chaining) against gpt-5-mini
  • Regression: 95 baseline tests unchanged
  • Isolation: deepseek/xai plugin construction does not load responses/*

Adds an opt-in `openAIResponses()` companion plugin and an
`openAI.responsesModel(...)` helper that target the OpenAI Responses
API (POST /v1/responses). Built-in tools (web_search_preview,
file_search, code_interpreter), reasoning summaries, and stateful
conversations via previous_response_id are supported. Citations from
built-in tools surface as `metadata.citations` on text Parts.

The new plugin is a sibling of the existing `openAI()` plugin and
mirrors the deepseek/xai pattern so other compat providers and the
existing Chat Completions path are unaffected. A regression test
verifies that constructing the deepseek/xai plugins does not load any
of the new responses/* files.

Files:
- js/plugins/compat-oai/src/openai/responses/{types,request,response,
  runner,stream,index}.ts — implementation (~1.4 KLOC)
- js/plugins/compat-oai/src/openai/index.ts — `openAIResponses()`
  factory + `openAI.responsesModel(...)` helper
- js/plugins/compat-oai/tests/openai_responses_test.ts (28 cases)
- js/plugins/compat-oai/tests/openai_responses_stream_test.ts
  (10 cases) — SSE event aggregator
- js/plugins/compat-oai/tests/compat_oai_isolation_test.ts — checks
  responses/* are NOT loaded by deepseek/xai
- js/plugins/compat-oai/scripts/smoke_responses.ts — manual live-API
  smoke (gated on OPENAI_API_KEY) covering plain text, web search
  citations, streaming, and previousResponseId chaining
- README — new "Using the OpenAI Responses API" section

Discussion: genkit-ai#5236
@github-actions bot added the docs (Improvements or additions to documentation) and js labels on May 5, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request adds support for the OpenAI Responses API (/v1/responses) to the compat-oai plugin, introducing the openAIResponses() companion plugin and responsesModel() helper. These changes enable the use of built-in tools, reasoning models, and stateful conversation chaining. The review identified a critical configuration issue where systemRole must be set to true for reasoning models to ensure system messages reach the plugin's custom lifting logic. Additionally, a fix was suggested to handle undefined tool outputs during serialization to prevent invalid API requests.

media: true,
// o1/o3/gpt-5 reasoning family ignores `system` role; instructions
// must go via the `instructions` config field instead.
systemRole: false,
Contributor


high

Setting 'systemRole: false' here will cause Genkit's core 'generate' logic to perform role mapping (e.g., converting 'system' messages to 'user' messages with a prefix) before the request reaches this plugin. This effectively bypasses the custom lifting logic implemented in 'toResponsesRequestBody' (which specifically looks for 'role === 'system''). To ensure your custom lifting to the 'instructions' field works as intended for reasoning models, you should set 'systemRole: true' here so the plugin receives the original system messages.

Suggested change
systemRole: false,
systemRole: true,

Author


Applied the change in 5f6077f + e912611. systemRole: true is the right value.

Small correction on the mechanism though: I checked Genkit core (js/ai/src/) and systemRole is purely declarative metadata — core does not auto-convert system → user based on it. simulateSystemPrompt is opt-in middleware, not core behaviour. The end result is the same (the plugin lifts text-only system messages into instructions itself before the request leaves) so systemRole: true is the accurate self-description from a Genkit consumer's POV. Comment in types.ts updated to reflect the actual mechanism, and there's a new e2e test that exercises the lift against o3.
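The lift the author describes (plugin moves leading text-only system messages into instructions itself, before the request leaves) can be sketched standalone; the message shape and function name below are illustrative, not the plugin's actual types:

```ts
// Simplified message shape; the real plugin operates on Genkit's
// multi-part messages and only lifts text-only system messages.
interface Msg {
  role: 'system' | 'user' | 'model';
  text: string;
}

// Lift the leading run of system messages into `instructions`;
// everything after the first non-system message stays in `input`.
function liftSystemInstructions(messages: Msg[]): {
  instructions?: string;
  input: Msg[];
} {
  let i = 0;
  const lifted: string[] = [];
  while (i < messages.length && messages[i].role === 'system') {
    lifted.push(messages[i].text);
    i++;
  }
  return {
    ...(lifted.length ? { instructions: lifted.join('\n') } : {}),
    input: messages.slice(i),
  };
}
```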

Comment on lines +120 to +123
typeof part.toolResponse.output === 'string'
? part.toolResponse.output
: JSON.stringify(part.toolResponse.output),
});
Contributor


medium

If 'part.toolResponse.output' is 'undefined', 'JSON.stringify(undefined)' will return 'undefined', which results in the 'output' property being omitted from the 'function_call_output' item. Since the OpenAI Responses API expects a string for the tool output, it is safer to provide a fallback value (like an empty JSON object string) to ensure the request remains valid.

            output:
              typeof part.toolResponse.output === 'string'
                ? part.toolResponse.output
                : JSON.stringify(part.toolResponse.output ?? {}),
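The failure mode this suggestion guards against is easy to reproduce standalone; the helper name here is illustrative:

```ts
// JSON.stringify(undefined) returns undefined (not the string
// "undefined"), so without the ?? {} fallback the `output` property
// would be silently omitted from the function_call_output item.
function serializeToolOutput(output: unknown): string {
  return typeof output === 'string' ? output : JSON.stringify(output ?? {});
}
```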

Author


Done in 5f6077f: output: JSON.stringify(part.toolResponse.output ?? {}). Added a unit test asserting output: '{}' for the undefined case.

DenisovAV added 2 commits May 6, 2026 10:34
Two issues from automated review on PR genkit-ai#5237:

1. Reasoning models (o1/o3/o4-mini/gpt-5*) now advertise
   `systemRole: true`. The plugin lifts text-only system messages into
   the top-level `instructions` field itself; advertising
   `systemRole: false` would cause Genkit core to convert system → user
   messages BEFORE the request reaches our resolver, defeating the
   lift. Added a unit test asserting `o3` and `o4-mini` carry
   `supports.systemRole === true`.

2. Tool messages with undefined `toolResponse.output` no longer drop
   the `output` field (which would happen because
   `JSON.stringify(undefined) === undefined`). They now emit
   `output: '{}'` so the API request stays valid. Added a unit test.
…g-model lift e2e test

The reasoning behind `systemRole: true` in REASONING_MODEL_INFO was
mis-stated in the previous commit. Genkit core does not
auto-transform messages based on `systemRole`; the value is purely
declarative metadata for plugin consumers. The change is still
correct (the plugin handles system→instructions lift internally so
the model effectively does support system role from a Genkit POV),
but the inline comment now describes the actual mechanism instead of
a non-existent core transformation.

Also updates the stale JSDoc on `toResponsesRequestBody` which still
referenced `supports.systemRole === false` as the lift trigger; the
real trigger is `config.instructions == null`, applied for all
Responses-namespaced models.

Adds an e2e unit test that exercises the lift end-to-end against a
reasoning model id (`o3`) and asserts both `instructions` is
populated and the input array carries no `system` role.
@MichaelDoyle
Contributor

Thanks @DenisovAV! Are you working with anybody on the core team right now? If not, let me see who can help review and get this merged.

…t:check)

The previous commits ran prettier from the plugin's own node_modules
which slightly disagreed with the monorepo-root prettier 3.5.3 config
on a couple of import-line breaks. Re-applies the canonical monorepo
formatting to satisfy CI's `pnpm format:check`.