
feat: add Brave Search & Baidu Search integration with sub-source support #42

Merged
wyuc merged 3 commits into THU-MAIC:main from YizukiAme:feat/brave-baidu-search
May 13, 2026

@YizukiAme
Contributor

Summary

Adds Brave Search and Baidu Search as web search providers, with Baidu sub-source configuration (Web Search, Baike, Scholar).

Changes

New Search Providers

  • Brave Search: Uses public page scraping — no API key required. Parses Brave's Svelte-rendered HTML to extract search results.
  • Baidu Search: Integrates Web Search, Baike (encyclopedia), and Scholar APIs. Users can toggle individual sub-sources on/off.

Provider Icons

  • Added icons for Brave (public/icons/brave.png), Baidu (public/icons/baidu.png), and Tavily (public/icons/tavily.jpg)
  • Icons displayed in toolbar pill, provider dropdown, and settings sidebar

UI Improvements

  • Baidu sub-source toggles use standard Switch components (consistent with rest of the UI)
  • Brave API key field shown but marked optional with explanatory notice
  • Search toggle enabled when any provider is available (not just selected one)
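
The availability change in the last bullet can be sketched roughly as below. This is an illustration only, not the code in generation-toolbar.tsx: the `ProviderState` shape, its `requiresApiKey` and `hasKey` fields, and the `selectedOnlyAvailable` helper are assumptions made for the example.

```typescript
// Hypothetical provider state; the field names are illustrative only.
interface ProviderState {
  id: string;
  requiresApiKey: boolean;
  hasKey: boolean; // a client-side or server-configured key is present
}

// A provider is usable if it needs no key or a key is present.
function isProviderAvailable(p: ProviderState): boolean {
  return !p.requiresApiKey || p.hasKey;
}

// Assumed old behaviour: only the currently selected provider was checked.
function selectedOnlyAvailable(providers: ProviderState[], selectedId: string): boolean {
  const selected = providers.find((p) => p.id === selectedId);
  return selected !== undefined && isProviderAvailable(selected);
}

// New behaviour: the toggle is enabled when any provider is usable.
function webSearchAvailable(providers: ProviderState[]): boolean {
  return providers.some(isProviderAvailable);
}
```

With a keyless provider in the list, `webSearchAvailable` stays true even when the selected provider is missing its key, which is the case the bullet describes.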

Technical Details

  • lib/web-search/brave.ts: Regex-based HTML parser for Brave's current Svelte-rendered page structure
  • lib/web-search/baidu.ts: Accepts subSources parameter, queries enabled sources in parallel
  • lib/store/settings.ts: Added baiduSubSources state with per-source toggles
  • app/api/web-search/route.ts: Brave bypasses API key check; Baidu passes sub-source config
  • components/generation/generation-toolbar.tsx: Fixed webSearchAvailable logic
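
As a rough sketch of what "queries enabled sources in parallel" could look like: the `SubSources` and `SearchHit` shapes, the `Fetcher` type, and the `searchEnabledSources` function are all hypothetical names for this example, and the PR's actual `BaiduSubSources` interface may differ. Swallowing per-source errors is also an assumption, chosen here so that one failing source does not discard results from the others.

```typescript
// Illustrative shapes; not the PR's real types.
interface SubSources {
  webSearch: boolean;
  baike: boolean;
  scholar: boolean;
}

interface SearchHit {
  title: string;
  url: string;
  source: keyof SubSources;
}

type Fetcher = (query: string) => Promise<SearchHit[]>;

async function searchEnabledSources(
  query: string,
  enabled: SubSources,
  fetchers: Record<keyof SubSources, Fetcher>,
): Promise<SearchHit[]> {
  // Keep only the sub-sources the user has switched on.
  const names = (Object.keys(fetchers) as (keyof SubSources)[]).filter((n) => enabled[n]);

  // Start all requests at once; a failing source contributes an empty list.
  const perSource = await Promise.all(
    names.map((n) => fetchers[n](query).catch((): SearchHit[] => [])),
  );

  // Flatten the per-source arrays into one result list.
  return perSource.reduce<SearchHit[]>((all, hits) => all.concat(hits), []);
}
```

A real implementation would probably also want to report which sources failed rather than dropping the error silently.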

Files Changed

| File | Description |
| --- | --- |
| lib/web-search/brave.ts | Fixed HTML parser for current Brave Search structure |
| lib/web-search/baidu.ts | Added sub-source toggle support |
| lib/web-search/types.ts | Added BaiduSubSources interface |
| lib/web-search/constants.ts | Added sub-source config, icon paths |
| lib/store/settings.ts | Added Baidu sub-source state |
| app/api/web-search/route.ts | Brave no-key logic, Baidu sub-sources |
| app/generation-preview/page.tsx | Pass sub-sources in API call |
| components/settings/web-search-settings.tsx | Sub-source toggles, optional key UI |
| components/generation/generation-toolbar.tsx | Icons, fixed availability logic |
| public/icons/* | Provider icons |

Testing

  • ✅ Baidu Search returns results (verified via API call with key)
  • ✅ Brave Search parser matches current HTML structure (tested with saved HTML, 20 results parsed)
  • ✅ Build passes (pnpm build exit code 0)
  • ✅ UI verified: icons display correctly, search toggle enables properly, Baidu sub-source switches render consistently

Contributor

wyuc commented May 5, 2026

Thanks for the contribution, and sorry for the long delay on my side.

This PR is now quite far behind main and conflicts with the current web-search architecture, so it is not mergeable as-is. There are also a few areas that would need another look after an update, including how the new providers plug into the current searchWeb flow, server-side classroom generation, Baidu sub-source handling, and test coverage for the Brave parser.

If you’re still interested in maintaining this PR, please rebase it onto the latest main and adapt it to the current web-search provider architecture. I’ll be happy to review it again after that.

@YizukiAme
Contributor Author

Got it! I'll rebase this onto main and update the code to fit the current architecture in a couple of days when I have some bandwidth.
Thanks for the review~

@YizukiAme YizukiAme force-pushed the feat/brave-baidu-search branch 2 times, most recently from 4e08f30 to 463e5ec Compare May 8, 2026 12:32
@YizukiAme
Contributor Author

[Two screenshots attached]

Done~

@YizukiAme YizukiAme force-pushed the feat/brave-baidu-search branch from 463e5ec to be0a610 Compare May 10, 2026 12:45
@wyuc
Contributor

wyuc commented May 13, 2026

Reviewed PR #42 and tested the new web search providers locally.

Overall this looks good for Brave and for Baidu Web/Baike. Brave official API mode worked in a real generation flow with Gemini 3 Flash, and Baidu Web/Baike also return usable results with the configured key.

A few issues I’d fix before merge:

  1. Baidu Scholar response parsing does not match the official schema.
    In lib/web-search/baidu.ts, the code currently checks data.code !== 0 and reads data.results, but the official Baidu Scholar response uses string code "0" and returns papers under data.
    This means Scholar results will parse as empty or be treated as failed even after the key is properly authorized.

  2. Scholar abstract parameter should use enable_ai_abstract.
    The docs’ parameter table names this field enable_ai_abstract; the docs example is inconsistent, but using the parameter-table name is safer.

  3. .env.example is missing BRAVE_API_KEY.
    The implementation supports Brave official API mode via server config, but the example env only documents Baidu/Tavily/Bocha. Please add BRAVE_API_KEY= so deployers can discover the intended env var.

  4. Tests should clear Brave env vars.
    tests/server/web-search-config.test.ts and tests/web-search/route.test.ts reset Tavily/Bocha/Baidu env vars but not BRAVE_API_KEY / BRAVE_BASE_URL, so local or CI env can leak into tests.
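
To illustrate item 1: with a string code of "0", a strict check such as `data.code !== 0` is always true, so every response is treated as failed. A minimal sketch of a parser that follows the schema described above is shown here. Only the `code` field (string "0" on success) and the `data` array come from this review; the `message` field, the paper fields, and the function name are assumptions for the example.

```typescript
// Paper fields here are placeholders, not the official schema.
interface ScholarPaper {
  title?: string;
  aiAbstract?: string;
}

interface ScholarResponse {
  code?: string | number;
  message?: string; // assumed error-text field
  data?: ScholarPaper[];
}

function parseScholarResponse(raw: unknown): ScholarPaper[] {
  const body = (raw ?? {}) as ScholarResponse;

  // Compare as a string so both "0" and a numeric 0 count as success.
  if (String(body.code) !== "0") {
    throw new Error(`Baidu Scholar error: code=${String(body.code)} ${body.message ?? ""}`.trim());
  }

  // Papers live under `data`, not `results`.
  return Array.isArray(body.data) ? body.data : [];
}
```

Throwing on a non-zero code, rather than returning an empty list, also helps with the visibility concern: an unauthorized key surfaces as an error instead of looking like "no sources found".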

One note on Scholar availability: the URL itself appears correct. Opening it without auth returns Baidu’s get authorization error, so the route is real. With the current key, Scholar returned 404, which looks like a Baidu-side authorization/enablement issue for that tool rather than a URL issue. The UI should probably indicate that Baidu Scholar may require separate authorization, or at least fail visibly enough that users don’t assume the toggle is working when it returns no sources.

Tested:

  • pnpm exec tsc --noEmit --pretty false
  • targeted web-search/provider tests
  • pnpm lint
  • pnpm build
  • pnpm test:e2e
  • real Brave + Gemini 3 Flash first-page generation with screenshots

@YizukiAme YizukiAme force-pushed the feat/brave-baidu-search branch from be0a610 to 29f62e5 Compare May 13, 2026 10:24
@YizukiAme
Contributor Author

Thanks so much for the thorough review!! All points are confirmed — working on it 👍

@YizukiAme YizukiAme force-pushed the feat/brave-baidu-search branch from 29f62e5 to 5b61acf Compare May 13, 2026 11:32
YizukiAme added 3 commits May 13, 2026 19:33
…through

- Brave Search now supports dual mode: official JSON API (with key) or
  HTML scraping fallback (without key)
- Fix API key always being discarded for optional-key providers in both
  route.ts and resolveClassroomWebSearchConfig
- Add BRAVE to WEB_SEARCH_ENV_MAP for server-side key resolution
- Add api.search.brave.com to SSRF whitelist
- Update i18n: clarify API key is optional for scrapable providers
- Update settings UI placeholder for optional-key providers
- Remove stale 'no API key' env hint for Brave
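
The dual-mode behaviour and the key pass-through fix described in this commit message might look roughly like the sketch below. All names (`KeyPolicy`, `resolveApiKey`, `chooseBraveMode`, `BraveMode`) are hypothetical, and preferring a client-supplied key over a server-configured one is an assumption, not something the commit states.

```typescript
type BraveMode = "official-api" | "html-scrape";

// Hypothetical per-provider policy flag.
interface KeyPolicy {
  requiresApiKey: boolean;
}

// Resolve a key without discarding it for optional-key providers.
function resolveApiKey(
  policy: KeyPolicy,
  clientKey?: string,
  serverKey?: string,
): string | undefined {
  // Assumed precedence: client key first, then server-side env key.
  const key = (clientKey ?? "").trim() || (serverKey ?? "").trim();
  if (!key && policy.requiresApiKey) {
    throw new Error("API key is required for this provider");
  }
  // An optional key is still passed through when one exists.
  return key || undefined;
}

// With a key, use the official JSON API; without one, fall back to scraping.
function chooseBraveMode(apiKey: string | undefined): BraveMode {
  return apiKey ? "official-api" : "html-scrape";
}
```

The bug the commit describes corresponds to returning `undefined` whenever `requiresApiKey` is false, which would force Brave into scraping mode even when a key is configured.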
@YizukiAme YizukiAme force-pushed the feat/brave-baidu-search branch from 5b61acf to 5d88c72 Compare May 13, 2026 11:33
@YizukiAme
Contributor Author

YizukiAme commented May 13, 2026

All 4 items addressed in 5b61acf ~ 😊

1. Baidu Scholar response parsing

  • code type: number → string (matches "0")
  • Results array: results → data
  • BaiduScholarPaper interface aligned to official schema (aiAbstract, doi, paperId, publishInfo, publishYear as number)
  • aiAbstract now included in content pipeline

2. Scholar parameter name

  • enable_abstract → enable_ai_abstract (following the parameter table, not the inconsistent curl example)

3. .env.example missing BRAVE_API_KEY

  • Added BRAVE_API_KEY=

4. Tests should clear Brave env vars

  • Added delete process.env.BRAVE_API_KEY / BRAVE_BASE_URL in both web-search-config.test.ts and route.test.ts beforeEach
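
For readers unfamiliar with the pattern, the env reset can be sketched as a small helper. The `clearEnv` function and the `BRAVE_ENV_KEYS` constant are illustrative names, not the code added in the test files; the helper takes the environment object as a parameter so it can be exercised without touching the real process environment.

```typescript
type Env = Record<string, string | undefined>;

// The two variables named in the review.
const BRAVE_ENV_KEYS: readonly string[] = ["BRAVE_API_KEY", "BRAVE_BASE_URL"];

// Remove the given keys so values from a developer machine or CI
// cannot leak into a test that expects an unconfigured provider.
function clearEnv(env: Env, keys: readonly string[]): void {
  for (const key of keys) {
    delete env[key];
  }
}
```

In a test file this would typically be called from the `beforeEach` hook as `clearEnv(process.env, BRAVE_ENV_KEYS)`, alongside the existing Tavily/Bocha/Baidu resets.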

Bonus: Scholar authorization UX

  • Added "API Docs ↗" links to all three Baidu sub-sources (Web Search, Baike, Scholar) in the settings UI, pointing to official Baidu documentation. This helps users discover free tier limits (Scholar: 50 calls/day) and authorization requirements. Fully i18n'd across all 6 locales.

Contributor

@wyuc wyuc left a comment

LGTM. Re-ran focused tests and E2E, including real Brave Search + Gemini first-page generation on 5d88c72.

@wyuc wyuc merged commit 47cc2a5 into THU-MAIC:main May 13, 2026
2 checks passed