nodetask: add MarkReady SeiNodeTask kind #387
Adds a thin SeiNodeTask kind that dispatches the already-registered seictl
sidecar `mark-ready` task (sidecar.MarkReadyTask{}, fire-and-forget). Empty
payload, no target-derived params — mirrors the DiscoverPeers kind's spec shape.
Motivation: in sidecar mode the sidecar's readiness flag is in-process and lost
on every pod restart; seid's start-gate and the proxy readiness probe both sit
on /v0/healthz, which serves 200 only after mark-ready. Today the node controller
re-marks readiness on its ~30s reapproval poll. MarkReady lets a Chaos-Mesh
Workflow re-mark explicitly after a readiness-blind restart
(UpdateNodeImage -> replace-pod -> MarkReady -> AwaitNodesAtHeight).
Completion is the submit ack (fire-and-forget): a submit failure Fails rather
than false-completing, and the empty no-op handler makes the ready flip
guaranteed. Gate "node is serving" with a following AwaitCondition/
AwaitNodesAtHeight step.
requirePhase=Running is deadlock-safe: SeiNode.status.phase is only written by
setTargetPhase (always Running for started nodes), so MarkReady dispatches during
the parked/unready window.
Also sweeps the SeiNodeTask doc comments to state what the code does rather than
what it doesn't (RestartPod completion, FailureReason, DiscoverPeers sequencing,
populateOutputs deferral).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
PR Summary: Medium Risk. Also includes doc-comment clarifications. Reviewed by Cursor Bugbot for commit 03da60f.
What
Adds a `MarkReady` SeiNodeTask kind: a thin wrapper dispatching the already-registered seictl sidecar `mark-ready` task (`sidecar.MarkReadyTask{}`, fire-and-forget). Empty payload, no target-derived params; mirrors the DiscoverPeers kind's spec shape.
Why
In sidecar mode the sidecar's readiness flag is an in-process `atomic.Bool`, lost on every pod restart. Both seid's start-gate and the RBAC-proxy readiness probe sit on `/v0/healthz`, which serves 200 only after `mark-ready`. Today only the node controller re-marks readiness, and it notices on its ~30s reapproval poll, so a restarted node sits parked (seid not started, not signing) until that poll fires. `MarkReady` lets a Chaos-Mesh Workflow re-mark explicitly after a readiness-blind restart, collapsing that gap:
UpdateNodeImage -> replace-pod -> MarkReady -> AwaitNodesAtHeight
(`replace-pod` completes without waiting for Ready, so a following `MarkReady` runs while seid is parked and unblocks it.)
Semantics
- Completion is the submit ack (fire-and-forget): a submit failure Fails rather than false-completing; the empty no-op sidecar handler makes the ready flip guaranteed. Gate "node is serving" with a following `AwaitCondition`/`AwaitNodesAtHeight` step: `Complete` means "request accepted," a beat before `/v0/healthz` serves 200.
- `requirePhase=Running` is deadlock-safe: `SeiNode.status.phase` is only written by `setTargetPhase` (always `Running` for started nodes), never downgraded on readiness loss, so `MarkReady` dispatches during the parked/unready window.
Also
Sweeps the SeiNodeTask doc comments to state what the code does rather than what it doesn't (RestartPod completion, `FailureReason`, DiscoverPeers sequencing, `populateOutputs` deferral).
Test
- `SeiNodeTaskParamsFor` maps `MarkReady` → `(TaskTypeMarkReady, MarkReadyTask{})`; nil payload → reasoned `ParamsBuildFailed`.
- `MarkReady` requires `markReady`; the exactly-one-payload union rejects `MarkReady` + second payload.
- `TestReconcile_MarkReady_EndToEnd` pins fire-and-forget: Complete at the submit reconcile, `getCalls == 0` (no poll).
- `make manifests generate`, `make test`, `make test-integration`, `golangci-lint --new-from-rev=origin/main` → 0.
🤖 Generated with Claude Code