Conversation
Add dampened negative factor that penalizes entity pairs whose functional neighbors have no matching counterpart, preventing false merges between entities with identical names but different structural contexts (e.g. two different "Dr. James Chen" in unrelated research labs). Key design decisions: - Weighted average mismatch (not per-path product) to distinguish partial vs full neighborhood mismatch - Binary relation similarity threshold (0.8) to prevent synonym cascade - max(neg_a, neg_b) bilateral factor (charitable for incomplete coverage) - Name-similarity seed for inner match check (prevents circular reinforcement) - Separate positive_base tracking for monotone structural propagation Removes xfail markers from test_identical_names_different_contexts_no_merge and test_similar_names_disjoint_neighborhoods_no_match. Closes #23 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Reviewed the full diff and source files. No issues found — approving.
What I checked:
- Negative factor correctness:
name_seedis fixed (prevents circular reinforcement),positive_baseis monotone non-decreasing, and sincenegis constant per pair (computed from fixedname_seed),combined = base * negis monotone after the initial drop. Theabs(combined - old) > epsilonchange correctly handles that initial drop. - Forward vs inverse functionality swap between
_build_weighted_adjacencyand_build_forward_adjacencyis correct per the PARIS design innegative_evidence.md. max(neg_a, neg_b)bilateral strategy is well-motivated for asymmetric news coverage.- Test changes: xfail removals correct, NovaTech assertion relaxed with clear justification, monotonicity test correctly isolates positive propagation via
neg_alpha=0.0, deadanchorvariable and unusedimport pytestcleaned up. - No dead code, no shims, no half-finished pieces. Reads as if negative evidence was always part of the design.
LGTM.
monneyboi
left a comment
There was a problem hiding this comment.
The negative evidence implementation reintroduces a binary relation similarity threshold (rel_threshold=0.8 in _one_sided_negative), but positive propagation still uses continuous rel_sim as a multiplier. This creates an inconsistent mental model: functionality pooling and negative evidence treat relations as "same or not" via a threshold, while positive propagation treats similarity as a continuous weight.
Requested change: Unify to a single rel_threshold parameter on match_graphs that flows to all three consumers:
compute_functionality— pool relations above threshold (already does this)propagate_similarity(positive) — gate relation pairs: ifrel_sim >= threshold, treat as same relation and propagate (without multiplying byrel_sim); below threshold, skip entirely. This reverts the "initial nongated" change from695408b.compute_negative_factor— check relation match using the same threshold (already does this, but hardcoded)
The mental model becomes: the threshold defines equivalence classes over free-text relations — essentially reducing them to a discrete enum. Above threshold = same relation, below = different. One threshold, one concept, used everywhere.
This means:
- Remove the separate
rel_thresholdparam fromcompute_negative_factor/_one_sided_negative - Add a
rel_thresholdparam topropagate_similarity - In the propagation loop, change
strength_sum += rs * weight * neighbor_confidenceto juststrength_sum += weight * neighbor_confidence(only reached whenrs >= rel_threshold) match_graphspasses the singlerel_threshold(currentlyrel_cluster_threshold) to all three functions
Replace the inconsistent mix of continuous rel_sim multiplier (positive propagation) and binary threshold (functionality pooling, negative evidence) with a single rel_threshold that defines relation equivalence classes throughout: above threshold = same relation, below = different. - Add rel_threshold param to propagate_similarity, gate relation pairs (skip if below threshold, no continuous multiplier) - Remove hardcoded default from compute_negative_factor/_one_sided_negative, receive threshold from caller via keyword-only param - Wire rel_cluster_threshold from match_graphs to all three consumers - Strengthen three-source name variation test: add 2 more shared neighbors to compensate for reduced structural signal from gating (3 name variants reduce functionality weights to 1/3, need 5 neighbors for sufficient evidence) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Reviewed the full diff (both commits) against match.py, the design doc, and the prior review. The threshold unification from monneyboi's feedback is fully addressed. No issues found.
What I checked:
- Threshold unification complete:
rel_cluster_thresholdflows frommatch_graphs→propagate_similarity→compute_negative_factor/_one_sided_negative. Positive propagation gates withrs < rel_thresholdand drops the continuousrs *multiplier. One concept of relation equivalence throughout — functionality pooling, positive gating, negative evidence all use the same threshold. - Convergence is guaranteed despite
abs()change:positive_baseis monotone non-decreasing, and the negative factor is computed fromname_seed(fixed), not propagated confidence. Socombined = base * negconverges oncebasedoes — no oscillation possible. Theabs(combined - old) > epsiloncorrectly detects the initial drop when negative evidence first activates. - Forward vs inverse functionality swap between
_build_weighted_adjacencyand_build_forward_adjacencyis correct per PARIS semantics: positive evidence uses inverse functionality (target match → source match), negative evidence uses forward functionality (source should have specific target). max(neg_a, neg_b)bilateral strategy is well-motivated — asymmetric news coverage means missing neighbors on one side shouldn't compound.- NovaTech assertion relaxation is correct: with one neighbor (founder) that doesn't match and forward functionality ~1.0,
score = 1.0 × 0.7 = 0.7 < 0.8is the right behavior — identical names with contradicting functional neighbors should be suppressed. The test's primary assertion (sharma ≠ vasquez) still holds. - No dead code, no shims, no half-finished pieces. Clean integration.
LGTM — no issues to flag.
Summary
test_identical_names_different_contexts_no_mergeandtest_similar_names_disjoint_neighborhoods_no_match; all 54 tests passCloses #23
Test plan
uv run pytest)test_identical_names_different_contexts_no_merge— two different "Dr. James Chen" in unrelated labs no longer mergetest_similar_names_disjoint_neighborhoods_no_match— "Elena Vasquez" / "Lena Vasquez" with disjoint neighborhoods no longer match🤖 Generated with Claude Code