feat(agent,cli): gate private-CG host-mode custody on participation (non-member cores hold zero ciphertext) #1190
branarakic wants to merge 5 commits into
…o private ciphertext

A core that learns of a private CG via the chain-event / discovery-beacon auto-host path no longer custodies its SWM ciphertext: host-mode subscription is gated on isNodeParticipantOfCg (curator OR local-meta member OR on-chain participant; id-shape-robust, positive-memoized). Members backfill from the curator (REPLACE-recovery), so a third-party core holds nothing.

- dkg-agent-cg-resolve: isNodeParticipantOfCg helper; dkg-agent-base: participantCgIds memo.
- dkg-agent-swm-host: decline host-mode subscribe for non-participants on private CGs.
- swmHostMode.stripNonParticipants flag (default on): rollout kill-switch + A/B baseline.
- cli: plumb the swmHostMode config block through to the agent (was inert before, so the kill-switch had no effect via config.json).
- unit test (7/7) + devnet harness: side-by-side baseline (strip-off core hosts, Δ2) vs strip (strip-on core holds zero, Δ0), member backfill, and convergence with both bystander cores absent.

Scope: SWM half only. VM-payload private ciphertext still reaches cores (M5/M6).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…nk/public checks

Gate ingestSwmHostModeEnvelope (host-mode .meta surface) and ingestSwmCiphertextChunkEnvelope (LU-11 chunk surface) on isNodeParticipantOfCg so a non-participant core holds zero private ciphertext even if a stray re-flood or a restored persisted subscription wires the handler (the transitional / rolling-upgrade case the Path-2 subscribe decline doesn't cover on its own). Both are curated-only paths, so public CGs are untouched.

Harness: assert the chunk-store surface is also empty on the strip node (SWM writes go via agent.share, which never chunks; the chunk store is the VM-payload M6 surface, so this corroborates zero leak on both surfaces), a public-CG no-op gate (the strip gate never fires for accessPolicy=0), and a wire-hash DECLINE log match.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
 * isn't wrongly kept out of its own custody. Keyed by whatever id form the
 * caller passes (cleartext or wire-hash) — both can map to the same true.
 */
protected readonly participantCgIds = new Set<string>();
🔴 Bug: This positive-only cache is not actually safe for the lifetime of the process. The repo already supports removing/revoking participants (remove-participant, revokedAgents), so once a node resolves true here it will keep passing isNodeParticipantOfCg() for the rest of the session and continue ingesting/serving private SWM after access was revoked. Either invalidate this set on CG ACL/curator updates or avoid memoizing positive answers until that invalidation path exists.
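One way the suggested invalidation could look, as a minimal sketch rather than the PR's code. Only `participantCgIds` and `isNodeParticipantOfCg` are names from the diff; the `ParticipantMemo` class, the injected `resolve` callback, and the `invalidate` hook are hypothetical, and the sketch assumes the ACL / revocation path can call that hook with every id form it knows for the CG.

```typescript
// Hypothetical sketch: a positive-only memo with an explicit invalidation hook.
type ParticipationResolver = (cgId: string) => Promise<boolean>;

class ParticipantMemo {
  // Mirrors the PR's set: only `true` answers are ever cached.
  private readonly participantCgIds = new Set<string>();
  private readonly resolve: ParticipationResolver;

  constructor(resolve: ParticipationResolver) {
    this.resolve = resolve;
  }

  async isNodeParticipantOfCg(cgId: string): Promise<boolean> {
    if (this.participantCgIds.has(cgId)) return true;
    const participant = await this.resolve(cgId);
    // Negatives are not cached, so a freshly joined member is re-evaluated.
    if (participant) this.participantCgIds.add(cgId);
    return participant;
  }

  // Assumed hook: called whenever membership, curator, or revocation state
  // changes. Accepts several ids because the set may hold both the cleartext
  // and the wire-hash form of the same CG.
  invalidate(...cgIdForms: string[]): void {
    for (const id of cgIdForms) this.participantCgIds.delete(id);
  }
}
```

After `invalidate`, the next call falls through to the resolver again, so a revoked node stops passing the check without a restart.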
// ON; `stripNonParticipants:false` restores legacy auto-host (kill-switch /
// A/B baseline).
const stripNonParticipants = this.config.swmHostMode?.stripNonParticipants ?? true;
if (stripNonParticipants && !(await this.isNodeParticipantOfCg(contextGraphId))) {
🔴 Bug: Declining here only prevents a new auto-subscribe; it does not unwind an existing one. On nodes that already hosted this CG before upgrade / before stripNonParticipants was enabled, initializeSwmHostModeStore() restores the persisted host-mode subscription first, and this early return leaves that handler plus any stored ciphertext in place. The node can still serve those envelopes via host catchup, so the strip does not actually enforce zero custody. Consider explicitly unwireSwmHostModeHandler() and clearing/purging stored host-mode state when a reconciled CG is now non-participant.
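The unwind this comment asks for could be sketched as follows. This is an illustration under assumptions, not the repository's implementation: `unwireSwmHostModeHandler` is named in the review but its signature is guessed here, and `isHandlerWired`, `purgeHostModeState`, `wireHandler`, and `reconcileWithUnwind` are hypothetical names.

```typescript
// Hypothetical sketch: decline AND unwind when a CG is now non-participant.
interface HostModeOps {
  isNodeParticipantOfCg(cgId: string): Promise<boolean>;
  isHandlerWired(cgId: string): boolean;            // hypothetical
  unwireSwmHostModeHandler(cgId: string): void;     // named in the review; signature assumed
  purgeHostModeState(cgId: string): Promise<void>;  // hypothetical
  wireHandler(cgId: string): void;                  // hypothetical
}

type ReconcileOutcome = "hosting" | "declined" | "unwound";

async function reconcileWithUnwind(
  ops: HostModeOps,
  cgId: string,
  stripNonParticipants: boolean = true,
): Promise<ReconcileOutcome> {
  if (stripNonParticipants && !(await ops.isNodeParticipantOfCg(cgId))) {
    const wasWired = ops.isHandlerWired(cgId);
    // A subscription restored from persistence may already be live: drop it.
    if (wasWired) ops.unwireSwmHostModeHandler(cgId);
    // Purge unconditionally so ciphertext stored before the upgrade cannot
    // be served later via host catchup.
    await ops.purgeHostModeState(cgId);
    return wasWired ? "unwound" : "declined";
  }
  if (!ops.isHandlerWired(cgId)) ops.wireHandler(cgId);
  return "hosting";
}
```

Purging even when no handler is wired is a deliberate choice in this sketch: stored envelopes can outlive the handler that ingested them.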
// direct call wired this handler. The Path-2 subscribe decline is the
// primary gate; this closes the residual ingest surface.
const stripNonParticipants = this.config.swmHostMode?.stripNonParticipants ?? true;
if (stripNonParticipants && !(await this.isNodeParticipantOfCg(storageCgId))) {
🔴 Bug: This defense-in-depth drop ignores how the handler was wired, so it also fires for enableSwmHostModeFor() subscriptions. That breaks the documented manual override: the API can report host-mode enabled, but every incoming envelope (and the matching LU-11 chunk path below) is discarded. If manual host-mode is meant to remain an explicit override, preserve the subscription source and bypass this gate for SUBSCRIPTION_SOURCES.MANUAL.
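A small sketch of the source-aware bypass suggested here. `SUBSCRIPTION_SOURCES.MANUAL` is named in the review; its value, the `AUTO` member, and the `shouldDropEnvelope` helper are assumptions made for illustration only.

```typescript
// Hypothetical sketch: the ingest gate consults how the handler was wired.
const SUBSCRIPTION_SOURCES = {
  AUTO: "auto",     // assumed: chain-event / discovery-beacon auto-host
  MANUAL: "manual", // assumed value; the constant name comes from the review
} as const;

type SubscriptionSource =
  (typeof SUBSCRIPTION_SOURCES)[keyof typeof SUBSCRIPTION_SOURCES];

function shouldDropEnvelope(
  source: SubscriptionSource,
  isParticipant: boolean,
  stripNonParticipants: boolean = true,
): boolean {
  // Kill-switch off: legacy behaviour, nothing is dropped.
  if (!stripNonParticipants) return false;
  // An explicit operator opt-in stays an override of the participant gate.
  if (source === SUBSCRIPTION_SOURCES.MANUAL) return false;
  // Auto-hosted subscriptions only ingest for participants.
  return !isParticipant;
}
```

With this shape a manually enabled host-mode subscription keeps ingesting, so the API's "enabled" report and the actual behaviour agree.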
The Path-2 subscribe decline is the proven core of the strip (a non-participant core never wires the host handler, so it never ingests on either surface; devnet G-strip Δ0). The extra participant checks inside ingestSwmHostModeEnvelope and ingestSwmCiphertextChunkEnvelope were defence-in-depth for the transitional / direct-call case, but they changed behaviour that existing wiring tests pin (lu11-chunk-catchup-wiring asserts the ingester persists), and they aren't needed for the SWM claim: SWM writes go via agent.share, which never chunks, so the chunk store is the VM-payload (M6) surface, not SWM. Defer those gates to the M6 follow-up where the affected tests are updated deliberately.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
 * negative is re-evaluated so a freshly-joined member self-heals.
 */
async isNodeParticipantOfCg(this: DKGAgent, contextGraphId: string): Promise<boolean> {
  if (this.participantCgIds.has(contextGraphId)) return true;
🔴 Bug: This positive-only memo never gets invalidated, but participation can shrink during a session. For example, removeAgentFromContextGraph() tombstones the agent and re-queues SWM subscription reconciliation; after this change isNodeParticipantOfCg() will still short-circuit to true, so a revoked core can keep or re-enable host-mode custody for that private CG. Please either clear this cache on membership/owner revocation changes or stop memoizing true across the whole process lifetime.
…(Codex #1190)

The persisted-subscription restore path (initializeSwmHostModeStore) re-wired the host-mode handler directly, bypassing the live subscribe gate, so on a rolling upgrade a core that hosted a private CG under the old 'any core auto-hosts' behaviour would re-engage custody on restart, defeating zero-custody for already-hosting nodes. Gate the restore on isNodeParticipantOfCg (same check as the subscribe path) and shed the stale persisted record via enqueueHostModePersistence so it stops re-arming each boot. Closes the transitional/upgrade hole.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
 * negative is re-evaluated so a freshly-joined member self-heals.
 */
async isNodeParticipantOfCg(this: DKGAgent, contextGraphId: string): Promise<boolean> {
  if (this.participantCgIds.has(contextGraphId)) return true;
🔴 Bug: This positive-only cache never gets invalidated when the local node is removed from a private CG. reconcileSharedMemoryGossipSubscription() can re-run after a participant revocation, but once a CG id lands here isNodeParticipantOfCg() will keep returning true and the core will continue host-mode custody until restart. Either avoid memoizing positives, or clear this entry whenever membership / revokedAgents state changes.
// it stops re-arming each boot. (This is the upgrade path the live
// subscribe gate can't reach, since restore wires the handler directly.)
const stripNonParticipants = this.config.swmHostMode?.stripNonParticipants ?? true;
if (stripNonParticipants && !(await this.isNodeParticipantOfCg(cgId))) {
🔴 Bug: This restore-time participant gate runs before startup rehydrates subscribedContextGraphs/wireIdToLocalCgId (initializeSwmHostModeStore() happens before rehydrateContextGraphSubscriptions()). Persisted hash-form host-mode entries therefore have no way to resolve back to their on-chain id yet, so isNodeParticipantOfCg(cgId) can false-negative and enqueueHostModePersistence(cgId, false) deletes a valid subscription on every restart. Defer the strip until after rehydration, or restore against a cleartext/canonical id that can be resolved at this point.
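The "defer the strip until after rehydration" option might be ordered like this. It is a sketch under stated assumptions: `rehydrateContextGraphSubscriptions`, `isNodeParticipantOfCg`, and `enqueueHostModePersistence(cgId, false)` are names from the thread with signatures guessed, while `restorePersistedHostModeIds` and `startupWithDeferredStrip` are hypothetical.

```typescript
// Hypothetical sketch: run the participant check only once ids can resolve.
interface StartupOps {
  restorePersistedHostModeIds(): string[];                          // hypothetical
  rehydrateContextGraphSubscriptions(): Promise<void>;              // named in the review
  isNodeParticipantOfCg(cgId: string): Promise<boolean>;
  enqueueHostModePersistence(cgId: string, enabled: boolean): void; // signature assumed
}

// Returns the ids whose host-mode handlers should actually be wired.
async function startupWithDeferredStrip(
  ops: StartupOps,
  stripNonParticipants: boolean = true,
): Promise<string[]> {
  // 1. Read persisted entries, but do not wire or shed anything yet.
  const restored = ops.restorePersistedHostModeIds();
  // 2. Rehydrate first so hash-form ids can map back to their on-chain id.
  await ops.rehydrateContextGraphSubscriptions();
  if (!stripNonParticipants) return restored;
  // 3. Only now evaluate participation and shed stale records.
  const kept: string[] = [];
  for (const cgId of restored) {
    if (await ops.isNodeParticipantOfCg(cgId)) {
      kept.push(cgId);
    } else {
      ops.enqueueHostModePersistence(cgId, false);
    }
  }
  return kept;
}
```

Because nothing is wired before step 3, a valid hash-form subscription is not deleted by an early false negative, and a stale one never ingests in the gap.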
Revert the restore-path participant gate too: like the ingest gates it changes restart/init behaviour that existing tests pin (swm-sender-key-pending-by-agent 'loads persisted pending rows after restart'). The live subscribe gate (reconcileSwmHostModeSubscription) is the devnet-proven core (G-strip Δ0); the transitional/upgrade hardening (restore-path shed + ingest early-returns) moves to a dedicated follow-up PR that updates the affected restart/ingest tests deliberately.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Closing as superseded — |
What
A core node that is not a participant (curator or member) of a private context graph no longer custodies that CG's live shared-memory (SWM) ciphertext. Today host-mode (LU-6) custody is core-only and auto-engages for any curated CG a core discovers via the chain-event / discovery-beacon path — so any third-party core ends up holding ciphertext for CGs it has no role in. This gates that on participation: a non-participant core declines, holding zero ciphertext. Members back-fill from the curator (REPLACE-recovery), which is already the intended private-CG catch-up path.
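The decision described above can be sketched roughly as follows. The three lookup functions, the `HostDecision` type, and `decideAutoHost` are hypothetical stand-ins; only the curator / local-meta member / on-chain participant rule, the curated-only scope, and the default-on `stripNonParticipants` flag come from this PR.

```typescript
// Hypothetical sketch of the participation rule and the auto-host decision.
interface ParticipationSources {
  isCurator(cgId: string): Promise<boolean>;
  isLocalMetaMember(cgId: string): Promise<boolean>;
  isOnChainParticipant(cgId: string): Promise<boolean>;
}

type HostDecision = "not-curated" | "host" | "decline";

async function isParticipant(
  sources: ParticipationSources,
  cgId: string,
): Promise<boolean> {
  // Short-circuits left to right: the on-chain lookup only runs when both
  // local checks are false.
  return (
    (await sources.isCurator(cgId)) ||
    (await sources.isLocalMetaMember(cgId)) ||
    (await sources.isOnChainParticipant(cgId))
  );
}

async function decideAutoHost(
  sources: ParticipationSources,
  cgId: string,
  curated: boolean,
  stripNonParticipants: boolean = true,
): Promise<HostDecision> {
  // Public CGs never reach the gate, so they are untouched.
  if (!curated) return "not-curated";
  // Non-participant cores decline and therefore hold zero ciphertext.
  if (stripNonParticipants && !(await isParticipant(sources, cgId))) {
    return "decline";
  }
  return "host";
}
```

Passing `stripNonParticipants = false` reproduces the legacy behaviour, which is what makes the flag usable as a baseline in a side-by-side comparison.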
How
- isNodeParticipantOfCg(cgId) (new, dkg-agent-cg-resolve.ts): true iff this node is the curator OR a local-_meta member OR an on-chain participant. ID-shape-robust (cleartext vs wire-hash) and positive-memoized (participantCgIds); a genuine participant resolves locally, so it never over-blocks its own custody.
- dkg-agent-swm-host.ts, all curated-only (they sit after the existing if (!curated) return, so public CGs are untouched):
  - reconcileSwmHostModeSubscription: the primary gate. Declines the auto-host subscribe (this alone starves both ingest paths, which the same handler wires).
  - ingestSwmHostModeEnvelope: defensive; the host-mode .meta surface.
  - ingestSwmCiphertextChunkEnvelope: defensive; the LU-11 chunk surface.
- swmHostMode.stripNonParticipants (default on): rollout kill-switch + A/B test control.
- The swmHostMode config block was never forwarded config.json → agent (DKGAgent.create omitted it), so the block was inert. Now plumbed (lifecycle.ts + DkgConfig type).

Validation
- host-mode-participant-gate.test.ts: 7/7 (curator/member ⇒ host; bystander ⇒ decline; reconcile wires vs declines).
- Devnet harness (scripts/devnet-test-swm-strip.sh, 4 cores): side-by-side baseline node (strip-off) hosts the CG (Δ2 .meta) vs strip node (strip-on) holds zero (Δ0), the discriminator that makes "zero" meaningful; the member converges from the curator; and it still converges with both bystander cores stopped (curator is the sole holder).

Scope
This is the live SWM ciphertext. The VM-payload chunked ciphertext (published KA chunks) is a separate surface (agent.share never chunks; confirmed) and is follow-up work; the defensive Path-3 ingest gate already covers it. Operationally this depends on the curator-recovery back-fill (separate change) being present.

🤖 Generated with Claude Code