Conversation
There was a problem hiding this comment.
Pull request overview
Adds a new Rust-based Kubernetes controller (“etcdetcetc”) that manages etcd multi-tenancy via CRDs, provisioning per-tenant etcd users/roles and prefix-scoped permissions, plus local developer tooling/manifests to run it against a kind cluster.
Changes:
- Introduces
EtcdClusterandEtcdTenantCRDs and controllers (status via Kubernetes-style conditions, tenant finalizer cleanup, etcd client caching). - Adds dev workflows:
xtask dev-up/dev-down, Tilt integration, and a dev deployment manifest. - Adds packaging/docs: Dockerfile for a distroless image, README + design doc.
Reviewed changes
Copilot reviewed 16 out of 17 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| apps/etcdetcetc/xtask/src/main.rs | Adds xtask CLI to create/delete a kind cluster, export kubeconfig, and seed secrets/CRs for local dev. |
| apps/etcdetcetc/xtask/Cargo.toml | Defines the xtask crate and dependencies. |
| apps/etcdetcetc/Tiltfile | Adds Tilt build + YAML wiring for local iteration. |
| apps/etcdetcetc/src/tenant.rs | Implements the EtcdTenant controller: finalizer, password secret, etcd RBAC provisioning, output Secret/ConfigMap, Ready condition. |
| apps/etcdetcetc/src/main.rs | Wires CLI (crds) and runs both controllers concurrently with signal-based shutdown. |
| apps/etcdetcetc/src/etcd.rs | Adds helper to build an authenticated etcd client from Secret data (TLS or basic auth). |
| apps/etcdetcetc/src/crd.rs | Defines CRDs + shared status/conditions types and Ready condition helper. |
| apps/etcdetcetc/src/cluster.rs | Implements the EtcdCluster controller: secret watch mapping, client caching, health/status reporting, ConfigMap emission. |
| apps/etcdetcetc/README.md | Adds user-facing overview and CRD usage examples. |
| apps/etcdetcetc/Dockerfile | Builds and packages the controller into a distroless runtime image. |
| apps/etcdetcetc/doc/design.md | Adds design/behavior documentation for CRDs and reconciliation flow. |
| apps/etcdetcetc/dev/deploy.yaml | Adds a (dev) ServiceAccount/ClusterRole/Binding and Deployment manifest. |
| apps/etcdetcetc/Cargo.toml | Introduces the workspace + controller crate dependencies. |
| apps/etcdetcetc/Cargo.lock | Locks Rust dependency graph for the new workspace. |
| apps/etcdetcetc/.gitignore | Ignores build artifacts and local dev state. |
| apps/etcdetcetc/.dockerignore | Excludes dev/build directories from Docker context. |
| apps/etcdetcetc/.cargo/config.toml | Adds cargo xtask alias. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Sam Day <me@samcday.com>
EtcdCluster controller: persistent cached connections with 15s health check requeue, status vitals (version, db size, leader, members, alarms), config hash change detection, allowedNamespaces filtering. EtcdTenant controller: etcd user/role/permission provisioning with prefix-scoped RBAC, random password generation with GitOps-friendly pre-creation support, output Secret with connection details, finalizer cleanup on deletion. Both controllers use standard Kubernetes conditions for kubectl wait support. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Sam Day <me@samcday.com>
Alpine/musl Dockerfile with BuildKit cache mounts for cargo registry and target directory. Dev environment with kind cluster xtask, Tiltfile, and Kubernetes deployment manifests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Sam Day <me@samcday.com>
EtcdCluster now produces a shared ConfigMap with comma-separated endpoints and ca.crt. EtcdTenant output Secret is credentials-only. Tenant controller watches owned Secrets and ConfigMaps for drift. New allowedNamespaces field on EtcdClusterSpec switches between cluster-wide and per-namespace K8s RBAC. Each tenant mirrors the cluster ConfigMap into its own namespace. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Sam Day <me@samcday.com>
Replace hand-rolled Gregorian date arithmetic in rfc3339_now() with chrono::Utc. Add TODO on user_change_password noting the best-effort semantics and intent to improve when non-mTLS support is solidified. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Sam Day <me@samcday.com>
- ensure_config_mirror: propagate not-ready when source ConfigMap is missing instead of silently marking tenant ready - update_ready_status: skip no-op status patches when condition is unchanged, matching the cluster controller's behavior - ensure_password_secret: use server-side apply instead of bare create to avoid 409 race on rapid reconciles - Add comment explaining cluster-wide RBAC subject intent Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Sam Day <me@samcday.com>
- Replace cluster-wide ClusterRole+ClusterRoleBinding with namespace- scoped Role+RoleBinding to prevent cross-namespace secret exposure - Fix namespaced RBAC naming to include tenant namespace, preventing collisions between same-named tenants in different namespaces - Update design doc and README to reflect current implementation (password auth, ConfigMap outputs) - Fix is_not_found_error comment accuracy Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Sam Day <me@samcday.com>
The RBAC approach was fundamentally wrong — creating Roles in consumer namespaces grants access to secrets in those namespaces, not in the tenant namespace where the secret actually lives. For the primary use case (pod volume mounts), no cross-namespace RBAC is needed since pods mount secrets from their own namespace. Consumer Helm charts should handle their own RBAC as cheap YAML rather than controller machinery. Removes ensure_k8s_rbac, ensure_namespaced_k8s_rbac, cleanup_namespaced_k8s_rbac, cleanup_k8s_rbac, tenant_rbac_labels, is_tenant_rbac_object, and the allowedNamespaces CRD field. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Sam Day <me@samcday.com>
Fix stale secret_refs index entries on EtcdCluster deletion, switch finalizer add/remove to JSON patch to avoid clobbering concurrent finalizers, restrict tls.key permissions in xtask dev-up, remove unused RBAC rules from dev deploy, and update design doc status references. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Sam Day <me@samcday.com>
…horization
Etcd user/role names and key prefixes are now deterministically computed
as {namespace}-{name} to prevent collisions across namespaces and hijacking
of reserved names like "root". The user-configurable prefix field is removed.
Cross-namespace cluster references are now authorized against the
EtcdCluster's allowedNamespaces field. Empty means same-namespace only;
["*"] allows all namespaces.
The mirrored ConfigMap now includes the computed prefix so consumers know
their assigned keyspace.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sam Day <me@samcday.com>
Automates the dev environment setup so opening a Codespace drops into a ready-to-iterate environment with kind, tilt, and a warm cargo cache. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Sam Day <me@samcday.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Sam Day <me@samcday.com>
Replace docker_build with custom_build using native cargo compilation and crane for OCI image packaging. This avoids Docker for builds entirely, lets cargo incremental compilation work across rebuilds, and means the Codespaces prebuild cache is directly reused by Tilt. - Add local registry container + kind config for registry mirror - Build debug musl-static binaries on Alpine base for dev - Install crane + musl-tools in devcontainer prebuild Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Sam Day <me@samcday.com>
Tilt runs `cargo run -p etcdetcetc -- crds` on the host target to generate CRDs at startup. Build the full workspace natively (not just xtask) so this is a cache hit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Sam Day <me@samcday.com>
Run cargo xtask dev-up + tilt ci during onCreateCommand to validate the full dev pipeline and bake Docker images (kindest/node, registry:2) into the prebuild template. Codespace starts skip image pulls entirely. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Sam Day <me@samcday.com>
Starlark doesn't support implicit string concatenation, so add explicit + operators in Tiltfile. Set CC=gcc for musl target builds outside the devcontainer where musl-tools aren't installed. Update kind-config.yaml and xtask to use containerd v2's certs.d host-based registry config instead of the deprecated v1 mirror format. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Sam Day <me@samcday.com>
After Docker-in-Docker restarts (prebuild -> codespace start), container IPs can shuffle. The etcd server cert was minted for the original IP but dev-up discovers the new one, causing TLS verification to fail. Fix by using a headless Service + Endpoints named after the control-plane container, so the etcd endpoint uses a DNS name that matches the stable DnsName SAN in the etcd server cert. Also pin k9s v0.50.18 in on-create.sh and add the sshd devcontainer feature so gh cs ssh works. Signed-off-by: Sam Day <me@samcday.com>
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 23ebc752f0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| #[derive(Clone, Debug, Deserialize, Serialize, JsonSchema, PartialEq)] | ||
| #[serde(rename_all = "camelCase")] | ||
| pub struct Condition { | ||
| pub type_: String, |
There was a problem hiding this comment.
Use Kubernetes-compatible
type key for conditions
The Condition struct stores the field as type_ but does not rename it to type, so status JSON will use type_ while the CRD print columns in this same file filter on .status.conditions[?(@.type=="Ready")]. That mismatch means the Ready condition is not discoverable via the configured JSONPath (and kubectl wait --for=condition=Ready semantics tied to type), so both EtcdCluster and EtcdTenant can appear not-ready even after successful reconciliation.
Useful? React with 👍 / 👎.
| if let Err(err) = auth.user_change_password(name, password).await { | ||
| warn!(name, error = %err, "failed to sync etcd user password"); | ||
| } |
There was a problem hiding this comment.
Fail reconcile when etcd password sync fails
This path only logs user_change_password failures and continues reconciliation, which later writes credentials and marks the tenant Ready. In basic-auth deployments (explicitly supported by this controller), that creates a split-brain credential state where Kubernetes Secrets contain a password that etcd never accepted, so tenant clients receive authentication failures until an operator manually repairs the user password.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Acceptable for now. Username/password support is incomplete as we're focused on mTLS path.
Also known as E-T-C. It's an etcd tenant controller. Today, it handles the last-mile part of carving up an etcd keyspace into tenants. e.g tenant-a will get a corresponding user+role, which will be allowed to r/w
/tenant-a/**.Tomorrow, it will handle migration of tenants between clusters. Not fully defined yet but imagine some kind of co-ordinating (cluster-level)
EtcdTenantMigrationwith source/targetEtcdClusterrefs.etcdetcetcwould handle replicating the keyspace (and reporting on that via.statuson the ETM).This is a continuation of #39, wanted to reset the PR conversation history as there was a lot of rounds there xD
I want to land this in a moment, then move it to its own repo and give it a sane devcontainer.json