store: weight SQLite FTS5 bm25 to mirror PostgreSQL setweight ranking by webgress · Pull Request #337 · kenn-io/msgvault

webgress · 2026-05-24T02:47:36Z

This is PR4c, split out of what was originally going to be 4 PRs. The main, larger PR4 for PostgreSQL capability is waiting on PR3 to merge.
This one is smaller and independent.

Summary

- SQLiteDialect.FTSSearchClause now orders results by bm25(messages_fts, 1.0, 10.0, 1.0, 4.0, 1.0, 1.0) so that subject matches outrank sender matches, which in turn outrank body/recipient matches. This matches what PostgreSQL already does with setweight 'A'=1.0 / 'B'=0.4 / 'D'=0.1.
- Adds two cross-backend tests: TestFTSRankWeightsAcrossBackends (single-token rank attribution) and TestFTSRankParityFixture (multi-query fixture suite that logs ordering per backend).
- Updates docs/PG_STATUS.md to mark FTS rank ordering as partially resolved (intra-class tie-breaking still differs because bm25 and ts_rank are different scoring functions).
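
For illustration, the weighted ordering expression can be rendered from a per-column weight list. This is a minimal sketch, not the actual SQLiteDialect.FTSSearchClause: the helper name bm25OrderExpr is hypothetical, and only the table name messages_fts and the six weights are taken from this PR. FTS5's bm25() returns numerically smaller values for better matches, so a plain ascending ORDER BY puts the best match first.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// bm25OrderExpr renders a call to FTS5's bm25() auxiliary function with one
// weight per declared column, in column declaration order.
// Hypothetical helper for illustration; not part of msgvault.
func bm25OrderExpr(table string, weights []float64) string {
	parts := make([]string, 0, len(weights)+1)
	parts = append(parts, table)
	for _, w := range weights {
		// One decimal place keeps the output identical to the literal in the PR.
		parts = append(parts, strconv.FormatFloat(w, 'f', 1, 64))
	}
	return "bm25(" + strings.Join(parts, ", ") + ")"
}

func main() {
	// Weights as listed in the PR summary, positional over the FTS5 columns.
	weights := []float64{1.0, 10.0, 1.0, 4.0, 1.0, 1.0}
	fmt.Println("ORDER BY " + bm25OrderExpr("messages_fts", weights))
}
```

Keeping the weights in a slice rather than a hard-coded string makes it harder for the number of weights to drift from the number of declared columns, though the PR itself may well use a literal.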

roborev-ci · 2026-05-24T02:50:47Z

roborev: Combined Review (`47fcbf8`)

High: PR has a build-breaking import path issue in new tests.

High

internal/store/fts_rank_parity_test.go L8-L9 and internal/store/fts_rank_test.go L7-L8
The new test files import packages using go.kenn.io/msgvault, but this repository’s go.mod module path is github.com/wesm/msgvault. This will cause build/test failures in the current repo.

Fix: Update the imports in both test files to use github.com/wesm/msgvault.

No Medium or Critical findings were reported.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

webgress · 2026-05-24T03:01:16Z

This "high" finding is a false positive — it claims go.mod is
github.com/wesm/msgvault, but PR #336 (commit eabce62, merged
2026-05-22) renamed the module to go.kenn.io/msgvault. This PR's
base (a3e6038) is one commit after that rename; new test files
match the current module path and pass in CI.
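
The disagreement comes down to whether an import path falls under the module path declared in go.mod. A minimal sketch of that check follows, assuming the module path after the rename is go.kenn.io/msgvault as stated above; inModule is a hypothetical helper, and the import paths are the internal/store package named in the review with each prefix applied.

```go
package main

import (
	"fmt"
	"strings"
)

// inModule reports whether importPath names the module root or a package
// beneath it. Hypothetical helper for illustration only.
func inModule(importPath, modulePath string) bool {
	return importPath == modulePath || strings.HasPrefix(importPath, modulePath+"/")
}

func main() {
	const module = "go.kenn.io/msgvault" // module path after the rename in PR #336

	for _, p := range []string{
		"go.kenn.io/msgvault/internal/store",      // what the new tests import
		"github.com/wesm/msgvault/internal/store", // what the review asked for
	} {
		fmt.Printf("%-42s within module: %v\n", p, inModule(p, module))
	}
}
```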

The failing test check is a pre-existing govulncheck failure on
golang.org/x/net v0.54.0 (5 CVEs fixed in v0.55.0). The same check
is red on main since 2026-05-22 — not caused by this PR. PR #328
already bumps to v0.55.0; this check goes green here automatically
once #328 lands.

SearchMessages previously ranked SQLite results with the default bm25 (every column equal) while PostgreSQL applies setweight 'A' to subject and 'B' to sender. The two backends returned the same rows in different orders.

SQLiteDialect.FTSSearchClause now orders by bm25(messages_fts, 1.0, 10.0, 1.0, 4.0, 1.0, 1.0), positional over every declared FTS5 column (the leading slot is the UNINDEXED message_id). The 10:4:1 ratio across subject/sender/body matches PG's 'A'=1.0 / 'B'=0.4 / 'D'=0.1, so subject-only matches outrank sender-only matches, which outrank body-only matches on both backends.

bm25 and ts_rank remain different scoring functions, so intra-class tie-breaking can still diverge; this is documented as a known difference in PG_STATUS.md.

Adds TestFTSRankWeightsAcrossBackends, a cross-backend test that seeds the search token in exactly one FTS column per row and asserts the subject > sender > body ordering on whichever backend NewTestStore resolves to.

Additional commits squashed:
* store: add multi-query FTS rank parity fixture test
* store: adapt FTS rank parity test import path for upstream module path

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
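
The claim that the SQLite weights 10:4:1 mirror PostgreSQL's 1.0 / 0.4 / 0.1 can be checked by normalizing both weight vectors to their largest entry. The sketch below does only that arithmetic. The helpers normalize and sameRatio are hypothetical names, the weights are assumed to be positive, and the check says nothing about how bm25 or ts_rank combine the weights into a score.

```go
package main

import (
	"fmt"
	"math"
)

// normalize scales weights so that the largest entry becomes 1.0.
// Assumes at least one weight is positive.
func normalize(ws []float64) []float64 {
	largest := 0.0
	for _, w := range ws {
		if w > largest {
			largest = w
		}
	}
	out := make([]float64, len(ws))
	for i, w := range ws {
		out[i] = w / largest
	}
	return out
}

// sameRatio reports whether two weight vectors agree after normalization,
// within a small tolerance for floating-point rounding.
func sameRatio(a, b []float64) bool {
	if len(a) != len(b) {
		return false
	}
	na, nb := normalize(a), normalize(b)
	for i := range na {
		if math.Abs(na[i]-nb[i]) > 1e-9 {
			return false
		}
	}
	return true
}

func main() {
	sqliteWeights := []float64{10.0, 4.0, 1.0} // subject, sender, body
	pgWeights := []float64{1.0, 0.4, 0.1}      // 'A', 'B', 'D'
	fmt.Println(sameRatio(sqliteWeights, pgWeights)) // prints "true"
}
```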

roborev-ci · 2026-05-25T02:28:04Z

roborev: Combined Review (`265a6e7`)

PR needs a small test import fix before merge.

Medium

internal/store/fts_rank_test.go:6
New test imports use go.kenn.io/msgvault/..., but this repo’s go.mod and existing imports use github.com/wesm/msgvault/.... This can break local test compilation or make the new tests inconsistent with the module path.
Fix the new imports in fts_rank_test.go and fts_rank_parity_test.go to use:
- github.com/wesm/msgvault/internal/store
- github.com/wesm/msgvault/internal/testutil

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

wesm · 2026-05-25T03:33:05Z

false positive

wesm force-pushed the pr4c-upstream branch from 47fcbf8 to 265a6e7 · May 25, 2026 02:25

wesm merged commit 7600457 into kenn-io:main May 25, 2026
7 checks passed

wesm merged 1 commit into kenn-io:main from webgress:pr4c-upstream