Skip to content

fix(resolve): prevent crashes in _merge_entities during multi-cluster runs#1

Open
cosmic-fire-eng wants to merge 1 commit into
johncarpenter:mainfrom
cosmic-fire-eng:fix/merge-entities-stale-survivor
Open

fix(resolve): prevent crashes in _merge_entities during multi-cluster runs#1
cosmic-fire-eng wants to merge 1 commit into
johncarpenter:mainfrom
cosmic-fire-eng:fix/merge-entities-stale-survivor

Conversation

@cosmic-fire-eng

Copy link
Copy Markdown

During a single run_resolution pass the mentions list is read once up front. Later clusters can name a survivor_id an earlier cluster already merged away, producing two crashes in _merge_entities: (1) an AttributeError when the survivor-row SELECT returns None, and (2) a sqlite3.IntegrityError when re-pointing relations onto the survivor violates the unique index on (subject_id, predicate, object_id, evidence_chunk_id) (the canonical-name UPDATE can likewise hit UNIQUE(canonical_name, entity_type)).

Fix, all inside _merge_entities: return early if the survivor row is gone (it was already resolved into another survivor); use UPDATE OR IGNORE on the relation re-point and the canonical-name update so a pre-existing duplicate doesn't abort the transaction; then delete relations still pointing at dropped ids so nothing dangles.

Verified on a real corpus: 12,895 mentions resolved into 6,520 entities with 2,360 merges; before the fix the run crashed at the first stale-id collision.

… runs

During a single run_resolution pass the mentions list is read once up
front. Later clusters can name a survivor_id an earlier cluster already
merged away, producing two crashes in _merge_entities: an AttributeError
when the survivor-row SELECT returns None, and a sqlite3.IntegrityError
when re-pointing relations onto the survivor violates the unique index on
(subject_id, predicate, object_id, evidence_chunk_id).

Return early if the survivor row is gone; use UPDATE OR IGNORE on the
relation re-point and the canonical-name update; then delete relations
still pointing at dropped ids so nothing dangles. Adds regression tests
for both the stale-survivor and UNIQUE-collision paths.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@greptile-apps

greptile-apps Bot commented Jun 19, 2026

Copy link
Copy Markdown

Greptile Summary

This PR fixes two crash-level bugs in _merge_entities that surface during multi-cluster run_resolution passes: an AttributeError when a named survivor was already deleted by an earlier cluster, and a sqlite3.IntegrityError from a UNIQUE-constraint violation when re-pointing relations onto a survivor that already shares the same tuple.

  • Stale-survivor guard: adds a SELECT before any mutation; returns early if the survivor row is missing, preventing the None["attributes"] crash.
  • OR IGNORE on relation re-point and canonical-name update: gracefully skips already-duplicate rows; a follow-up DELETE then removes any relations still referencing dropped IDs, preventing dangling FK references.
  • Residual issue: total_merges in run_resolution is incremented unconditionally after _merge_entities returns, so stale-survivor early-exits inflate the reported merge count stored in resolution_runs.

Confidence Score: 4/5

The crash fixes are correct and well-tested; the two new tests cover both failure paths described in the PR. The merge-count overcounting is cosmetic and does not affect graph correctness.

The core changes — early-exit guard, OR IGNORE on relations and canonical name, orphan DELETE — correctly eliminate the crashes described in the PR and are backed by targeted unit tests. One unaddressed bug exists: run_resolution increments total_merges after every _merge_entities call regardless of whether the early-return path was taken, so the count stored in resolution_runs will overstate actual merges whenever a stale-survivor collision occurs. There is also a known data-quality trade-off where drop_ids entities are silently left unresolved when the early return fires, though subsequent runs will catch them.

kgmd/resolve.py — specifically the total_merges increment at line 116 and the early-return behavior in _merge_entities.

Comments Outside Diff (1)

  1. kgmd/resolve.py, line 115-116 (link)

    P2 Merge count inflated when survivor is stale

    total_merges += len(merge_ids) - 1 is unconditional, so every time _merge_entities returns early (stale survivor), the reported merge count is inflated by the number of drop_ids that were silently skipped. With many stale-survivor collisions in a large corpus the resolution_runs record and the returned dict will significantly overstate how many entities were actually merged. The fix is to return the number of merges performed (0 on early-return, len(drop_ids) otherwise) and add that to total_merges instead of always adding len(merge_ids) - 1.

    Fix in Claude Code

Fix All in Claude Code

Reviews (1): Last reviewed commit: "fix(resolve): prevent crashes in _merge_..." | Re-trigger Greptile

Comment thread kgmd/resolve.py
Comment on lines +247 to +251
survivor_row = conn.execute(
"SELECT attributes FROM entities WHERE id = ?", (survivor_id,)
).fetchone()
if survivor_row is None:
return

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Drop-ids left unresolved on stale-survivor early return

When the early return fires, the drop_ids entities remain in the graph as unmerged duplicates — their entity rows, mentions, and relations are untouched. The PR description explicitly acknowledges this trade-off ("entity was already resolved into another survivor"), but in practice the dropped entities should have been folded into whatever entity ultimately absorbed survivor_id. Over multiple run_resolution calls the same drop candidates will keep surfacing as a cluster, so subsequent runs will likely catch them, but within a single pass the result is silent data-quality degradation rather than a true merge.

Fix in Claude Code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant