[codex] fix server LLM model auto-selection#577
Merged
wyuc merged 2 commits intoMay 17, 2026
Conversation
Summary: - Auto-fill the first server-configured LLM model when provider sync falls back from an unusable default provider. - Add a regression test for the DeepSeek server-only configuration reported in THU-MAIC#498. - Keep the OpenAI-compatible fetch wrapper buildable with Next's fetch type. Rationale: - Server-configured providers were marked usable, but an empty modelId still blocked classroom entry. Tests: - pnpm vitest run tests/store/settings-server-sync.test.ts -t "selects the server LLM model" - pnpm vitest run tests/store/settings-server-sync.test.ts - pnpm test - pnpm lint && pnpm check && pnpm build - CI=1 pnpm test:e2e Co-authored-by: Codex <codex@openai.com>
Closed
wyuc
approved these changes
May 17, 2026
Contributor
wyuc
left a comment
There was a problem hiding this comment.
Approving. Verified locally: reverting lib/store/settings.ts to parent makes exactly the new regression test fail; an e2e repro (default openai + empty modelId + server-only deepseek) is blocked by the model toast without this change and enters generation with it.
Nit (non-blocking): LLM fallback is gated by isServerConfigured while image/video fall back to models[0] unconditionally — asymmetry from #266; worth a later follow-up.
cosarah
pushed a commit
that referenced
this pull request
May 18, 2026
…580) (#581) * fix(settings): enforce "usable provider ⇒ concrete model" invariant (#580) Layer 1 — resolve provider AND model atomically at every source: - add resolveSelectedModel(): like validateModel but no empty-input short-circuit, so it returns '' only when the provider has no models - fetchServerProviders: symmetric LLM/image/video provider recovery + resolveSelectedModel; remove the asymmetric #577 isServerConfigured-gated LLM tail and the separate first-load auto-select block (now subsumed, and covers client-API-key providers too) - setProviderConfig: entering an API key for the active/unset LLM provider resolves a concrete model in the same set() - setVideoProvider: resolve a model on switch (was id-only); setImageProvider routed through the shared resolver Layer 2 — remove states unreachable under the invariant: - add shared isLLMProviderConfigured / hasUsableLLMProvider; gate the landing-page generate button on a usable provider (state A vs B) and drop the modelNotConfigured toast + forced SettingsDialog - generation-toolbar: use the shared predicate; drop the dead "Select Model" label/tooltip fallback - delete dead components/settings/model-selector.tsx (zero refs) and the now-orphan settings.selectModel i18n key (6 locales) Note: generation-toolbar's `providerEntries.length === 0` branch is KEPT — it is reachable via search (search-no-results), not the dead no-provider state the issue assumed. modelNotConfigured i18n key kept (still used by the chat path in use-chat-sessions). Tests: resolveSelectedModel + hasUsableLLMProvider unit tests; server-sync invariant tests (client-key resolution, atomic API-key entry, provider switch, serverModels-preference + first-load regression guards). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(settings): address code review — symmetric image/video config resolution Code review Important #1: fold atomic model resolution into setImageProviderConfig / setVideoProviderConfig (design doc Layer 1 step 2 scoped these). Deleting the selected custom image/video model on the active provider now resolves to a valid model in the same set() instead of leaving it stale until the next server sync — restores LLM/image/video symmetry. Minor #2/#3: document the deliberate store-level (credential path) vs UX-level (incl. ≥1 model) "usable" split on isProviderUsable; clarify that setProviderConfig also adopts providerId on the first-load API-key path. Tests: +2 invariant tests (delete selected custom image/video model → falls back to a valid model). 89/89 store tests pass, tsc clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(settings): keyless providers need explicit baseUrl to count as usable (#580) Code-review/e2e follow-up. ollama/lemonade ship requiresApiKey:false + a registry defaultBaseUrl, so the old isLLMProviderConfigured counted them as configured on a fresh install — but buildFallback/reconcile never auto-selects keyless providers, so hasUsableProvider could be true while modelId stayed '', re-enabling generation with no model (the exact #498/#580 class of bug, just via the keyless path). Narrow isLLMProviderConfigured to be consistent with isProviderUsable: keyless ⇒ usable only once the user sets an explicit baseUrl (registry defaultBaseUrl alone is not user intent); server-config and key-requiring paths unchanged. Now gate + toolbar + reconcile agree, and genuine State A (no usable provider) is reachable again. Add e2e/tests/model-invariant-580.spec.ts: State A (no provider → generate disabled + single "Set up model" affordance, no toast/forced dialog) and State B (server-configured → concrete model auto-resolved, generation enabled). +3 keyless unit tests. 92/92 store tests + 2 e2e pass, tsc clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(settings): re-resolve selection on bulk provider-config change (#580) Self-test found: deleting the selected provider in Settings left the model selector pointing at the deleted/invalid provider (or, via the component's hand-rolled fallback, at an unusable built-in like openai with no key). Root cause: setProvidersConfig (the bulk setter behind delete-provider / delete-model) did ZERO re-resolution — Layer 1 covered setProviderConfig (singular) + image/video setters + the reconcile, but not this path. The component's confirmDeleteProvider then picked Object.keys()[0] (always a built-in, ignoring usability), the exact per-call-site hand-rolled shape #580 condemns. Fix at the source: setProvidersConfig now re-resolves (providerId, modelId) via the canonical isLLMProviderConfigured + resolveSelectedModel — keep the current provider if still usable, else first usable, else State A. Removed the brittle confirmDeleteProvider override (kept only its local Settings-tab selection). Now every bulk config mutation (delete, import, reset) holds the invariant. Tests: +2 store tests (delete selected/only-usable provider → State A, not the deleted/invalid one; delete non-selected → keep usable selection). 94/94 store tests, full suite green (only pre-existing ssrf DNS), tsc/eslint clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(settings): one shared LLM-selection resolver for all config edits (#580) Clarified repro: the realistic case isn't deleting a custom provider — it's clearing the SELECTED provider's API key so it becomes invalid. setProvider Config only re-resolved when the edit made a provider *usable*; the "became unusable" transition fell through (return base) → the now-invalid provider stayed selected. The previous commit fixed only the bulk path. Root-cause fix: extract one shared `resolveLLMSelection(config, providerId, modelId)` — keep current provider if still usable, else first usable, else State A; then resolve the model. Use it in BOTH setProviderConfig (single edit: key add/clear, model-list edit) and setProvidersConfig (bulk: delete, import, reset). The two paths can no longer diverge — exactly the per-call- site asymmetry #580 set out to eliminate. Drops the now-unused isProviderUsable import. Tests: +2 store tests (clear selected provider key → drops stale selection; clear non-selected key → keeps usable selection). 96/96 store tests, full suite green except pre-existing ssrf DNS; tsc + eslint clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * style: prettier-format #580 test specs CI 'Lint, Typecheck & Unit Tests' ran prettier . --check and failed on the two added test specs (only tsc/eslint/vitest were run locally). Pure whitespace reformat, no logic change — 96/96 store tests still green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fix server-configured LLM startup so a provider selected by server sync also gets a usable default model when the current
modelIdis empty.Related Issues
Fixes #498
Changes
openaiprovider, emptymodelId, server response containingdeepseekwithdeepseek-chat.fetchwrapper sonext buildpasses with the current Next/TypeScript fetch type.Type of Change
Verification
Steps to reproduce / test
providerId: openaiand an emptymodelId./api/server-providersto return onlydeepseekwithmodels: ['deepseek-chat'].fetchServerProviders()and verify the store selectsproviderId: deepseekandmodelId: deepseek-chat.What you personally verified
pnpm vitest run tests/store/settings-server-sync.test.ts -t "selects the server LLM model"failed before the fix withexpected '' to be 'deepseek-chat'.pnpm vitest run tests/store/settings-server-sync.test.tspassed: 47 tests.pnpm testpassed: 45 test files, 334 tests.pnpm lint && pnpm check && pnpm buildpassed. Lint reports 7 existing warnings and 0 errors.pnpm exec playwright install chromiumwas run because the local Playwright browser cache was missing.CI=1 pnpm test:e2epassed: 11 Playwright tests. I used CI mode because a pre-existingnext devprocess held.next/dev/lockon port 3000; CI mode starts the production server on 3002 without touching that process.Evidence
pnpm lint && pnpm check && pnpm build)pnpm test)CI=1 pnpm test:e2e)Checklist