[bug] lore_appraise RRF surfaces off-topic results on shared-keyword BM25 hits #92

@kunallanjewar

Description

Summary

When lore_appraise runs with the RRF fusion arm active, results that match a single common keyword via BM25 can dominate the fused ranking even when their semantic score is weak. The output then surfaces entries that are textually adjacent but topically irrelevant to the query.

Affected files

  • internal/lore/appraise.go / appraise_cmd.go
  • internal/lore/embed_wiring.go (RRF integration point)

Reproduction

Run lore_appraise with a query containing a common keyword that several unrelated entries also mention (e.g. "the database"). Observe that the top results include hits whose semantic score is materially lower than their BM25 score would suggest.

Acceptance

  • RRF weighting is rebalanced so that semantically weak hits do not dominate purely on a shared common keyword.
  • Existing appraise tests pass; new test covers the regression case.
  • Document the chosen weighting heuristic in the appraise comment header.

Metadata

Assignees

No one assigned

    Labels

    area: lore (Lore knowledge archive), bug (Something isn't working), size: M (< 200 lines)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests