import-whatsapp: auto-default-identity skipped when summary.Errors > 0 #307

@jesserobbins

Context

Claude found this bug while reviewing PR #304 — specifically while tracing the auto-default-identity wiring across importers during the roborev cycles. It's pre-existing on main: the summary.Errors == 0 gate predates #304 and lives outside the per-account-identities/collections scope. Conferred with Jesse and we agreed to file it separately rather than expand the surface area of #304.

The bug hasn't been reproduced against a live WhatsApp export — what follows is a code-path read.

What happens

runImportWhatsApp in cmd/msgvault/cmd/import.go gates the auto-default-identity write on summary.Errors == 0:

if !noDefaultIdentityImportWhatsApp && summary.Errors == 0 && summary.SourceID != 0 {
    confirmDefaultIdentity(s, summary.SourceID, importPhone, importPhone, "phone-e164")
}

summary.Errors counts per-message recoverable errors — a malformed line in a chat export, a missing media file, a parse warning. The import as a whole still succeeds and returns error == nil. The user sees "Import complete" and N successful messages in the database.

But because summary.Errors > 0, the auto-default identity row never gets written. The source exists; the identity does not. Dedup's sent-copy detection is silently degraded until the user runs msgvault identity add manually.
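The effect of the gate can be sketched in isolation. This is a minimal model, not the msgvault code: `importSummary` and `currentGate` are hypothetical names, and only the two summary fields that the quoted condition reads are modeled.

```go
package main

import "fmt"

// importSummary is a hypothetical stand-in for the WhatsApp import summary.
// Only the two fields read by the gate are modeled here.
type importSummary struct {
	SourceID int64 // non-zero once the source row exists
	Errors   int   // count of per-message recoverable errors
}

// currentGate mirrors the condition quoted above: the identity write runs
// only when the opt-out flag is unset, no recoverable errors were counted,
// and a source row exists.
func currentGate(noDefaultIdentity bool, s importSummary) bool {
	return !noDefaultIdentity && s.Errors == 0 && s.SourceID != 0
}

func main() {
	clean := importSummary{SourceID: 7, Errors: 0}
	oneBadLine := importSummary{SourceID: 7, Errors: 1}

	fmt.Println(currentGate(false, clean))      // true: identity would be written
	fmt.Println(currentGate(false, oneBadLine)) // false: identity write is skipped
}
```

In this model a single recoverable error is enough to skip the identity write, even though the source row exists and the import itself returned no error.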

The other importers don't share this gate:

  • import_emlx.go / import_mbox.go gate on a hard-error class, not the per-message counter.
  • import_imessage.go / import_gvoice.go don't gate on errors at all.

The WhatsApp path is the strictest, and it's the one whose export format is most likely to surface non-zero summary.Errors on real-world inputs.

Why it matters

Real WhatsApp exports almost always have non-zero summary.Errors: WhatsApp's chat-text format is fragile, media references can be broken, and attachment file matching is best-effort. A clean import is the exception.

So the auto-default-identity feature — which is supposed to make first-time setup just work — silently does nothing on common real-world WhatsApp imports. There's no error message; the identity row simply doesn't appear.

Repro shape

  1. WhatsApp chat export with at least one missing media reference (or any other recoverable parse warning).
  2. msgvault import-whatsapp <phone> <export.zip>.
  3. Import succeeds; summary.Errors > 0.
  4. msgvault identity list --account <phone> shows no rows.

Expected: identity row present with source_signal = account-identifier.

Proposed fix

Drop the summary.Errors == 0 check, matching the best-effort pattern used by import-imessage / import-gvoice:

if !noDefaultIdentityImportWhatsApp && summary.SourceID != 0 {
    confirmDefaultIdentity(s, summary.SourceID, importPhone, importPhone, "phone-e164")
}

If a future hard-error concept is added to WhatsApp summaries, switch to that gate (matching emlx/mbox) instead of the per-message Errors counter.
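The two gate shapes discussed here can be compared side by side. This is only a sketch under stated assumptions: `whatsAppSummary`, `proposedGate`, and `hardErrorGate` are hypothetical names, and the `HardErrors` field does not exist today. It stands in for whatever hard-error concept might be added later.

```go
package main

import "fmt"

// whatsAppSummary is a hypothetical model of the import summary.
// HardErrors is an assumed future field, not part of the current code.
type whatsAppSummary struct {
	SourceID   int64 // non-zero once the source row exists
	Errors     int   // per-message recoverable errors
	HardErrors int   // hypothetical: failures that invalidate the import
}

// proposedGate is the fix as written in this issue: ignore the
// per-message counter and require only a source row.
func proposedGate(noDefaultIdentity bool, s whatsAppSummary) bool {
	return !noDefaultIdentity && s.SourceID != 0
}

// hardErrorGate is the possible follow-up: gate on the assumed
// hard-error count rather than on recoverable errors.
func hardErrorGate(noDefaultIdentity bool, s whatsAppSummary) bool {
	return !noDefaultIdentity && s.HardErrors == 0 && s.SourceID != 0
}

func main() {
	// An import with recoverable errors but no hard errors.
	typical := whatsAppSummary{SourceID: 7, Errors: 3}

	fmt.Println(proposedGate(false, typical))  // true
	fmt.Println(hardErrorGate(false, typical)) // true
}
```

Both variants let an import with only recoverable errors through. They differ only once a hard-error count exists and is non-zero.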
