nodetask: add MarkReady SeiNodeTask kind#387

Merged
bdchatham merged 1 commit into main from feat/seinodetask-mark-ready on Jun 7, 2026

@bdchatham

Collaborator

What

Adds a MarkReady SeiNodeTask kind — a thin wrapper dispatching the already-registered seictl sidecar mark-ready task (sidecar.MarkReadyTask{}, fire-and-forget). Empty payload, no target-derived params; mirrors the DiscoverPeers kind's spec shape.

Why

In sidecar mode the sidecar's readiness flag is an in-process atomic.Bool, lost on every pod restart. Both seid's start-gate and the RBAC-proxy readiness probe sit on /v0/healthz, which serves 200 only after mark-ready. Today only the node controller re-marks readiness, and it notices on its ~30s reapproval poll — so a restarted node sits parked (seid not started, not signing) until that poll fires.
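
A minimal sketch of the readiness gate described above, to make the restart problem concrete. Only the in-process atomic.Bool and the /v0/healthz path come from this PR; the `sidecar` type, the `markReady` and `probe` helpers, and the 503 returned before mark-ready are assumptions for illustration, not the seictl implementation.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// sidecar is a hypothetical stand-in for the seictl sidecar process.
type sidecar struct {
	ready atomic.Bool // in-process only: every new process starts at false
}

// markReady models the assumed effect of the mark-ready task: flip the flag.
func (s *sidecar) markReady() { s.ready.Store(true) }

// handler serves /v0/healthz: 200 once ready, 503 (assumed) before that.
func (s *sidecar) handler() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/v0/healthz", func(w http.ResponseWriter, _ *http.Request) {
		if !s.ready.Load() {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	return mux
}

// probe issues one in-memory GET against /v0/healthz and returns the status.
func probe(s *sidecar) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/v0/healthz", nil)
	s.handler().ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	s := &sidecar{}
	fmt.Println(probe(s)) // 503: not yet marked ready
	s.markReady()
	fmt.Println(probe(s)) // 200: start-gate and readiness probe both pass
	s = &sidecar{}        // simulated pod restart: new process, flag is gone
	fmt.Println(probe(s)) // 503 again until something re-marks readiness
}
```

The last probe is the parked window: nothing in the restarted process remembers the earlier mark-ready, so something external has to send it again.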

MarkReady lets a Chaos-Mesh Workflow re-mark explicitly after a readiness-blind restart, collapsing that gap:

UpdateNodeImage → replace-pod → MarkReady → AwaitNodesAtHeight

(replace-pod completes without waiting for Ready, so a following MarkReady runs while seid is parked and unblocks it.)

Semantics

  • Completion = submit ack (fire-and-forget). A submit failure marks the task Failed rather than false-completing, and because the sidecar handler is an empty no-op, an accepted submit guarantees the ready flip. Gate "node is serving" with a following AwaitCondition/AwaitNodesAtHeight step: Complete means "request accepted," which lands a beat before /v0/healthz serves 200.
  • requirePhase=Running is deadlock-safe: SeiNode.status.phase is only written by setTargetPhase (always Running for started nodes), never downgraded on readiness loss, so MarkReady dispatches during the parked/unready window.
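
The completion rule above can be sketched roughly as follows. This is an illustration under stated assumptions, not the controller's code: `submitter`, `fakeSidecar`, `reconcileFireAndForget`, and the `Phase` constants are hypothetical names, and the real reconcile path carries more state than shown here.

```go
package main

import (
	"errors"
	"fmt"
)

// Phase is a hypothetical task phase; the values follow the PR's wording.
type Phase string

const (
	PhaseComplete Phase = "Complete"
	PhaseFailed   Phase = "Failed"
)

// submitter is a hypothetical slice of the sidecar client interface.
type submitter interface {
	Submit(taskType string) error
	GetTask(id string) (string, error)
}

// fakeSidecar counts GetTask calls so the sketch can show that none happen.
type fakeSidecar struct {
	submitErr error
	getCalls  int
}

func (f *fakeSidecar) Submit(string) error { return f.submitErr }

func (f *fakeSidecar) GetTask(string) (string, error) {
	f.getCalls++
	return "done", nil
}

// errRefused is an example submit error used by main.
var errRefused = errors.New("connection refused")

// reconcileFireAndForget completes on the submit ack, fails on a submit
// error, and never polls GetTask.
func reconcileFireAndForget(c submitter, taskType string) (Phase, string) {
	if err := c.Submit(taskType); err != nil {
		return PhaseFailed, fmt.Sprintf("submit %s: %v", taskType, err)
	}
	return PhaseComplete, ""
}

func main() {
	ok := &fakeSidecar{}
	phase, _ := reconcileFireAndForget(ok, "mark-ready")
	fmt.Println(phase, ok.getCalls) // Complete 0

	bad := &fakeSidecar{submitErr: errRefused}
	phase, reason := reconcileFireAndForget(bad, "mark-ready")
	fmt.Println(phase, reason) // Failed submit mark-ready: connection refused
}
```

Because Complete only means the request was accepted, a workflow that needs "node is serving" still has to follow with an await step.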

Also

Sweeps the SeiNodeTask doc comments to state what the code does rather than what it doesn't (RestartPod completion, FailureReason, DiscoverPeers sequencing, populateOutputs deferral).

Test

  • SeiNodeTaskParamsFor maps MarkReady → (TaskTypeMarkReady, MarkReadyTask{}); a nil payload yields a reasoned ParamsBuildFailed.
  • envtest CEL: MarkReady requires markReady; the exactly-one-payload union rejects MarkReady+second payload.
  • TestReconcile_MarkReady_EndToEnd pins fire-and-forget: Complete at the submit reconcile, getCalls == 0 (no poll).
  • make manifests generate, make test, make test-integration, golangci-lint --new-from-rev=origin/main → 0.
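
For readers without the diff open, the first two cases above amount to something like the sketch below. Every name here is a simplified, hypothetical stand-in (`Spec`, `paramsFor`, `payloadCount`, the `"mark-ready"` string value, the error wording); the real SeiNodeTaskParamsFor signature and the CEL rule text are not reproduced, and the union check is a plain-Go analogue of what CEL enforces at admission.

```go
package main

import (
	"errors"
	"fmt"
)

// Kind is a hypothetical stand-in for the SeiNodeTask kind enum.
type Kind string

const KindMarkReady Kind = "MarkReady"

// TaskTypeMarkReady's string value is an assumed wire name.
const TaskTypeMarkReady = "mark-ready"

type MarkReadyPayload struct{}     // empty payload, no target-derived params
type DiscoverPeersPayload struct{} // second union member, for the reject case
type MarkReadyTask struct{}

// Spec is a reduced, hypothetical view of the task spec union.
type Spec struct {
	Kind          Kind
	MarkReady     *MarkReadyPayload
	DiscoverPeers *DiscoverPeersPayload
}

var ErrParamsBuildFailed = errors.New("ParamsBuildFailed")

// payloadCount counts set payloads; the union rule wants exactly one.
func payloadCount(s Spec) int {
	n := 0
	if s.MarkReady != nil {
		n++
	}
	if s.DiscoverPeers != nil {
		n++
	}
	return n
}

// paramsFor maps a MarkReady spec to (task type, params) or a reasoned error.
func paramsFor(s Spec) (string, any, error) {
	if s.Kind != KindMarkReady {
		return "", nil, fmt.Errorf("%w: unsupported kind %q", ErrParamsBuildFailed, s.Kind)
	}
	if s.MarkReady == nil {
		return "", nil, fmt.Errorf("%w: kind MarkReady requires markReady", ErrParamsBuildFailed)
	}
	if payloadCount(s) != 1 {
		return "", nil, fmt.Errorf("%w: exactly one payload must be set", ErrParamsBuildFailed)
	}
	return TaskTypeMarkReady, MarkReadyTask{}, nil
}

// isParamsBuildFailed reports whether err wraps ErrParamsBuildFailed.
func isParamsBuildFailed(err error) bool { return errors.Is(err, ErrParamsBuildFailed) }

func main() {
	tt, _, err := paramsFor(Spec{Kind: KindMarkReady, MarkReady: &MarkReadyPayload{}})
	fmt.Println(tt, err)

	_, _, err = paramsFor(Spec{Kind: KindMarkReady})
	fmt.Println(isParamsBuildFailed(err), err)
}
```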

🤖 Generated with Claude Code

Adds a thin SeiNodeTask kind that dispatches the already-registered seictl
sidecar `mark-ready` task (sidecar.MarkReadyTask{}, fire-and-forget). Empty
payload, no target-derived params — mirrors the DiscoverPeers kind's spec shape.

Motivation: in sidecar mode the sidecar's readiness flag is in-process and lost
on every pod restart; seid's start-gate and the proxy readiness probe both sit
on /v0/healthz, which serves 200 only after mark-ready. Today the node controller
re-marks readiness on its ~30s reapproval poll. MarkReady lets a Chaos-Mesh
Workflow re-mark explicitly after a readiness-blind restart
(UpdateNodeImage -> replace-pod -> MarkReady -> AwaitNodesAtHeight).

Completion is the submit ack (fire-and-forget): a submit failure Fails rather
than false-completing, and the empty no-op handler makes the ready flip
guaranteed. Gate "node is serving" with a following AwaitCondition/
AwaitNodesAtHeight step.

requirePhase=Running is deadlock-safe: SeiNode.status.phase is only written by
setTargetPhase (always Running for started nodes), so MarkReady dispatches during
the parked/unready window.

Also sweeps the SeiNodeTask doc comments to state what the code does rather than
what it doesn't (RestartPod completion, FailureReason, DiscoverPeers sequencing,
populateOutputs deferral).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@cursor

cursor Bot commented Jun 7, 2026


PR Summary

Medium Risk
New v1alpha1 CRD enum and controller path affect node readiness after restarts; mis-timed MarkReady in a workflow could unblock seid before follow-up gates, though semantics are documented and tested.

Overview
Adds a MarkReady SeiNodeTask kind so workflows can explicitly re-mark sidecar readiness after restarts or rollouts, instead of waiting on the node controller’s reapproval poll. The spec uses an empty MarkReadyPayload, CEL union rules include markReady, and generated CRD/manifests/deepcopy are updated accordingly.

The nodetask controller wires the kind through SeiNodeTaskParamsFor → sidecar mark-ready (TaskTypeMarkReady, empty payload), with a 2-minute default execution timeout when timeoutSeconds is unset. Completion follows the existing fire-and-forget sidecar contract: Complete on submit ack, not after GetTask polling (covered by reconcile tests).
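
The timeout default mentioned above might look roughly like this. It is a sketch only: `executionTimeout` and `defaultMarkReadyTimeout` are hypothetical names, and modelling timeoutSeconds as a `*int32` with nil meaning "unset" is an assumption about the field's Go type.

```go
package main

import (
	"fmt"
	"time"
)

// defaultMarkReadyTimeout is the 2-minute default named in the summary.
const defaultMarkReadyTimeout = 2 * time.Minute

// executionTimeout returns the default when timeoutSeconds is unset (nil),
// otherwise the configured number of seconds.
func executionTimeout(timeoutSeconds *int32) time.Duration {
	if timeoutSeconds == nil {
		return defaultMarkReadyTimeout
	}
	return time.Duration(*timeoutSeconds) * time.Second
}

func main() {
	fmt.Println(executionTimeout(nil)) // 2m0s

	override := int32(45)
	fmt.Println(executionTimeout(&override)) // 45s
}
```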

Also includes doc-comment clarifications on RestartPod, DiscoverPeers, and related param/error handling; no behavioral change beyond wording for those paths.

Reviewed by Cursor Bugbot for commit 03da60f.

@bdchatham bdchatham merged commit a4f2368 into main Jun 7, 2026
5 checks passed