Skip to content

fix(agent): data sync no longer unions a reconnecting member's shared memory#1191

Closed
branarakic wants to merge 2 commits into
mainfrom
fix/swm-data-sync-union
Closed

fix(agent): data sync no longer unions a reconnecting member's shared memory#1191
branarakic wants to merge 2 commits into
mainfrom
fix/swm-data-sync-union

Conversation

@branarakic

Copy link
Copy Markdown
Contributor

What

Fixes a shared-memory corruption on reconnect: the general data sync over-accepted _shared_memory quads and blind-inserted them, so a reconnecting member that already held a value v1 and re-synced a graph now at v3 ended up holding the union {v1, v3} instead of v3.

Root cause

parseAndFilterNQuads (sync-verify-worker-impl.ts) accepted any quad under the CG URI prefix. For a data request that swept in the CG's _shared_memory graphs and the data path storeInsert-ed them additively (no per-subject REPLACE), unioning stale and fresh state.

Fix

Graph-gate the filter: a data request excludes _shared_memory graphs; a _shared_memory request is a no-op (reduces to the original filter), so per-KA SWM subgraph sync is unchanged. SWM convergence is owned by the SWM sync path (per-subject REPLACE), not the data path.

Scope / risk

One-file, ~19 lines. No-op for SWM requests (verified: a KA-published value survives reconnect); only data requests change, and only to stop sweeping SWM. Standalone — independent of the recovery/convergence work that consumes it.

🤖 Generated with Claude Code

… memory

parseAndFilterNQuads over-accepted _shared_memory quads under the CG prefix and
the data path blind-inserted them (no per-subject REPLACE), so a member holding
v1 that re-synced a graph now at v3 ended up with the union {v1,v3}. Graph-gate
the filter: data requests exclude _shared_memory; a _shared_memory request is a
no-op (per-KA SWM subgraph sync unchanged). SWM convergence stays owned by the
SWM sync path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
// cannot truncate a root spanning a page boundary. The literal "/_shared_memory"
// matches every SWM URI form and only those (validateSubGraphName forbids a
// leading "_" on subgraph names, so no data/meta/context graph can collide).
const swmRequest = graphUri.includes('/_shared_memory');

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: includes('/_shared_memory') is broader than the comment implies. The repo already has non-live SWM graph shapes like .../_shared_memory_snapshots/.../_shared_memory, and this substring check will classify those as SWM too. That makes the new filter depend on an incidental substring instead of the actual root/meta/per-KA SWM URI shapes. Consider a shape-aware predicate here (exact /_shared_memory, exact /_shared_memory_meta, or the existing descendant helper for .../_shared_memory/{addr}/{n}) so future reserved graphs under the same prefix do not get pulled into this branch accidentally.

const swmRequest = graphUri.includes('/_shared_memory');
return {
quads: quads.filter((q) => q.graph === graphUri || q.graph.startsWith(cgUriPrefix)),
quads: quads.filter(

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Issue: this changes requester-side sync admission in a correctness-sensitive path, but the PR doesn't add regression coverage for the new boundary. A focused SyncVerifyWorker.parseAndFilter test with a mixed page (durable graphs plus _shared_memory graphs, including descendant .../_shared_memory/{addr}/{n} forms) would make it much safer to maintain and would catch accidental broadening/narrowing of this filter.

Export parseAndFilterNQuads and pin the union-corruption guard: a durable DATA/meta
sync request drops _shared_memory quads (so a streaming apply can't union a
reconnecting private-CG member into {old,new}), while a shared-memory request is a
no-op that keeps the base bucket + per-KA subgraphs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
// cannot truncate a root spanning a page boundary. The literal "/_shared_memory"
// matches every SWM URI form and only those (validateSubGraphName forbids a
// leading "_" on subgraph names, so no data/meta/context graph can collide).
const swmRequest = graphUri.includes('/_shared_memory');

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Bug: graphUri.includes('/_shared_memory') (and the matching check on line 254) also matches the context-graph ID prefix itself. validateContextGraphId currently allows IDs like 0xabc/_shared_memory-demo, so a durable sync for that CG would be misclassified as SWM and either keep SWM quads or drop legitimate sibling graphs incorrectly. Please detect SWM from the graph suffix/path after did:dkg:context-graph:${contextGraphId} instead of a raw substring match, and add a regression test for a CG id containing /_shared_memory.

@branarakic

Copy link
Copy Markdown
Contributor Author

Closing as superseded — the durable-sync-union bug this PR fixed is covered in main: the responder-side fix landed as d5e0b11b2 ("durable data sync must not serve _shared_memory (Gate A)"), and the requester-side parseAndFilter in sync-verify-worker-impl.ts was reworked in #1203 with shape-aware _shared_memory classification (anchored on the CG prefix + isSharedMemoryBucketDescendantDataGraph + meta-graph handling) — more developed than this branch's version. No unique unmerged work. Reopen if I've missed something.

@branarakic branarakic closed this Jun 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant