Skip to content

account_identities: document the 'always compare via NormalizeIdentifierForCompare' invariant on looksLikeEmail #315

@wesm

Description

@wesm

Context

Surfaced during the design re-read of PR #304. Not a current bug — is_from_me is read off the message row, so no caller is currently doing raw string equality on identity addresses. Filing as a one-line invariant comment because the trap is exactly the kind of thing the next contributor will trip on.

The invariant

internal/store/account_identities.go:19-32 defines looksLikeEmail and the comparison rule:

Emails are matched case-insensitively in the identity store; other identifier shapes (phone E.164, Matrix MXIDs like @user:server.org, Slack/IRC handles) preserve case.

The comparison happens in two places:

  1. SQL: identifierMatch.WhereClause emits LOWER(address) = LOWER(?) for email-shaped values, address = ? otherwise.
  2. Go: EqualIdentifier and NormalizeIdentifierForCompare mirror the SQL behavior.

The dedup engine uses NormalizeIdentifierForCompare in internal/dedup/dedup.go for sent-detection. Commit 1dc25be (dedup+collection: preserve From-case in dedup) added a FromEmail field to DuplicateMessageRow that preserves original case for display, and routes through the normalizer at compare time.

The trap: a future caller pulling FromEmail (or any identity string) off a row and using == for comparison will get a different answer than the engine did. The bug surface only opens up when a new comparison site is added.

Why it matters

  • The invariant is only documented inline at the call sites.
  • The looksLikeEmail function is reusable enough that the next contributor adding a comparison may grab it without realizing the SQL/Go split exists.
  • A subtle silent-mismatch bug (case-mismatched From: headers vs. confirmed identity) is exactly the kind of thing that would degrade dedup sent-copy detection without surfacing as a test failure.

Proposed approach

One of:

  1. Doc comment on looksLikeEmail — add a Compare via NormalizeIdentifierForCompare; never via raw string equality sentence that future readers can't miss.
  2. Constructor type — wrap identifier-bearing values in a small Identifier newtype whose comparison method enforces normalization. Heavier; would touch every call site.
  3. Linter rule — disallow == against fields named *Email or values returned from looksLikeEmail. Probably overkill.

Recommend (1) for the immediate fix; (2) becomes worthwhile if a third call site needs the comparison.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions