Skip to content

etcdetcetc#40

Merged
samcday merged 17 commits into
mainfrom
e-t-c
Apr 6, 2026
Merged

etcdetcetc#40
samcday merged 17 commits into
mainfrom
e-t-c

Conversation

@samcday
Copy link
Copy Markdown
Owner

@samcday samcday commented Apr 4, 2026

Also known as E-T-C. It's an etcd tenant controller. Today, it handles the last-mile part of carving up an etcd keyspace into tenants. e.g tenant-a will get a corresponding user+role, which will be allowed to r/w /tenant-a/**.

Tomorrow, it will handle migration of tenants between clusters. Not fully defined yet but imagine some kind of co-ordinating (cluster-level) EtcdTenantMigration with source/target EtcdCluster refs. etcdetcetc would handle replicating the keyspace (and reporting on that via .status on the ETM).

This is a continuation of #39, wanted to reset the PR conversation history as there was a lot of rounds there xD

I want to land this in a moment, then move it to its own repo and give it a sane devcontainer.json

Copilot AI review requested due to automatic review settings April 4, 2026 02:23
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new Rust-based Kubernetes controller (“etcdetcetc”) that manages etcd multi-tenancy via CRDs, provisioning per-tenant etcd users/roles and prefix-scoped permissions, plus local developer tooling/manifests to run it against a kind cluster.

Changes:

  • Introduces EtcdCluster and EtcdTenant CRDs and controllers (status via Kubernetes-style conditions, tenant finalizer cleanup, etcd client caching).
  • Adds dev workflows: xtask dev-up/dev-down, Tilt integration, and a dev deployment manifest.
  • Adds packaging/docs: Dockerfile for a distroless image, README + design doc.

Reviewed changes

Copilot reviewed 16 out of 17 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
apps/etcdetcetc/xtask/src/main.rs Adds xtask CLI to create/delete a kind cluster, export kubeconfig, and seed secrets/CRs for local dev.
apps/etcdetcetc/xtask/Cargo.toml Defines the xtask crate and dependencies.
apps/etcdetcetc/Tiltfile Adds Tilt build + YAML wiring for local iteration.
apps/etcdetcetc/src/tenant.rs Implements the EtcdTenant controller: finalizer, password secret, etcd RBAC provisioning, output Secret/ConfigMap, Ready condition.
apps/etcdetcetc/src/main.rs Wires CLI (crds) and runs both controllers concurrently with signal-based shutdown.
apps/etcdetcetc/src/etcd.rs Adds helper to build an authenticated etcd client from Secret data (TLS or basic auth).
apps/etcdetcetc/src/crd.rs Defines CRDs + shared status/conditions types and Ready condition helper.
apps/etcdetcetc/src/cluster.rs Implements the EtcdCluster controller: secret watch mapping, client caching, health/status reporting, ConfigMap emission.
apps/etcdetcetc/README.md Adds user-facing overview and CRD usage examples.
apps/etcdetcetc/Dockerfile Builds and packages the controller into a distroless runtime image.
apps/etcdetcetc/doc/design.md Adds design/behavior documentation for CRDs and reconciliation flow.
apps/etcdetcetc/dev/deploy.yaml Adds a (dev) ServiceAccount/ClusterRole/Binding and Deployment manifest.
apps/etcdetcetc/Cargo.toml Introduces the workspace + controller crate dependencies.
apps/etcdetcetc/Cargo.lock Locks Rust dependency graph for the new workspace.
apps/etcdetcetc/.gitignore Ignores build artifacts and local dev state.
apps/etcdetcetc/.dockerignore Excludes dev/build directories from Docker context.
apps/etcdetcetc/.cargo/config.toml Adds cargo xtask alias.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread apps/etcdetcetc/src/cluster.rs
Comment thread apps/etcdetcetc/src/tenant.rs Outdated
Comment thread apps/etcdetcetc/src/tenant.rs Outdated
Comment thread apps/etcdetcetc/xtask/src/main.rs
Comment thread apps/etcdetcetc/dev/deploy.yaml Outdated
Comment thread apps/etcdetcetc/doc/design.md
@samcday samcday changed the title ectcdetcetc etcdetcetc Apr 4, 2026
samcday and others added 13 commits April 5, 2026 15:04
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sam Day <me@samcday.com>
EtcdCluster controller: persistent cached connections with 15s health
check requeue, status vitals (version, db size, leader, members,
alarms), config hash change detection, allowedNamespaces filtering.

EtcdTenant controller: etcd user/role/permission provisioning with
prefix-scoped RBAC, random password generation with GitOps-friendly
pre-creation support, output Secret with connection details, finalizer
cleanup on deletion.

Both controllers use standard Kubernetes conditions for kubectl wait
support.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sam Day <me@samcday.com>
Alpine/musl Dockerfile with BuildKit cache mounts for cargo registry
and target directory. Dev environment with kind cluster xtask, Tiltfile,
and Kubernetes deployment manifests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sam Day <me@samcday.com>
EtcdCluster now produces a shared ConfigMap with comma-separated
endpoints and ca.crt. EtcdTenant output Secret is credentials-only.
Tenant controller watches owned Secrets and ConfigMaps for drift.
New allowedNamespaces field on EtcdClusterSpec switches between
cluster-wide and per-namespace K8s RBAC. Each tenant mirrors the
cluster ConfigMap into its own namespace.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sam Day <me@samcday.com>
Replace hand-rolled Gregorian date arithmetic in rfc3339_now() with
chrono::Utc. Add TODO on user_change_password noting the best-effort
semantics and intent to improve when non-mTLS support is solidified.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sam Day <me@samcday.com>
- ensure_config_mirror: propagate not-ready when source ConfigMap is
  missing instead of silently marking tenant ready
- update_ready_status: skip no-op status patches when condition is
  unchanged, matching the cluster controller's behavior
- ensure_password_secret: use server-side apply instead of bare create
  to avoid 409 race on rapid reconciles
- Add comment explaining cluster-wide RBAC subject intent

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sam Day <me@samcday.com>
- Replace cluster-wide ClusterRole+ClusterRoleBinding with namespace-
  scoped Role+RoleBinding to prevent cross-namespace secret exposure
- Fix namespaced RBAC naming to include tenant namespace, preventing
  collisions between same-named tenants in different namespaces
- Update design doc and README to reflect current implementation
  (password auth, ConfigMap outputs)
- Fix is_not_found_error comment accuracy

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sam Day <me@samcday.com>
The RBAC approach was fundamentally wrong — creating Roles in consumer
namespaces grants access to secrets in those namespaces, not in the
tenant namespace where the secret actually lives. For the primary use
case (pod volume mounts), no cross-namespace RBAC is needed since
pods mount secrets from their own namespace. Consumer Helm charts
should handle their own RBAC as cheap YAML rather than controller
machinery.

Removes ensure_k8s_rbac, ensure_namespaced_k8s_rbac,
cleanup_namespaced_k8s_rbac, cleanup_k8s_rbac, tenant_rbac_labels,
is_tenant_rbac_object, and the allowedNamespaces CRD field.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sam Day <me@samcday.com>
Fix stale secret_refs index entries on EtcdCluster deletion, switch
finalizer add/remove to JSON patch to avoid clobbering concurrent
finalizers, restrict tls.key permissions in xtask dev-up, remove unused
RBAC rules from dev deploy, and update design doc status references.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sam Day <me@samcday.com>
…horization

Etcd user/role names and key prefixes are now deterministically computed
as {namespace}-{name} to prevent collisions across namespaces and hijacking
of reserved names like "root". The user-configurable prefix field is removed.

Cross-namespace cluster references are now authorized against the
EtcdCluster's allowedNamespaces field. Empty means same-namespace only;
["*"] allows all namespaces.

The mirrored ConfigMap now includes the computed prefix so consumers know
their assigned keyspace.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sam Day <me@samcday.com>
Automates the dev environment setup so opening a Codespace drops into
a ready-to-iterate environment with kind, tilt, and a warm cargo cache.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sam Day <me@samcday.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sam Day <me@samcday.com>
Replace docker_build with custom_build using native cargo compilation
and crane for OCI image packaging. This avoids Docker for builds
entirely, lets cargo incremental compilation work across rebuilds, and
means the Codespaces prebuild cache is directly reused by Tilt.

- Add local registry container + kind config for registry mirror
- Build debug musl-static binaries on Alpine base for dev
- Install crane + musl-tools in devcontainer prebuild

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sam Day <me@samcday.com>
samcday and others added 4 commits April 5, 2026 15:09
Tilt runs `cargo run -p etcdetcetc -- crds` on the host target to
generate CRDs at startup. Build the full workspace natively (not just
xtask) so this is a cache hit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sam Day <me@samcday.com>
Run cargo xtask dev-up + tilt ci during onCreateCommand to validate the
full dev pipeline and bake Docker images (kindest/node, registry:2) into
the prebuild template. Codespace starts skip image pulls entirely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sam Day <me@samcday.com>
Starlark doesn't support implicit string concatenation, so add explicit
+ operators in Tiltfile. Set CC=gcc for musl target builds outside the
devcontainer where musl-tools aren't installed. Update kind-config.yaml
and xtask to use containerd v2's certs.d host-based registry config
instead of the deprecated v1 mirror format.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sam Day <me@samcday.com>
After Docker-in-Docker restarts (prebuild -> codespace start), container
IPs can shuffle. The etcd server cert was minted for the original IP but
dev-up discovers the new one, causing TLS verification to fail. Fix by
using a headless Service + Endpoints named after the control-plane
container, so the etcd endpoint uses a DNS name that matches the stable
DnsName SAN in the etcd server cert.

Also pin k9s v0.50.18 in on-create.sh and add the sshd devcontainer
feature so gh cs ssh works.

Signed-off-by: Sam Day <me@samcday.com>
@samcday
Copy link
Copy Markdown
Owner Author

samcday commented Apr 6, 2026

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 23ebc752f0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

#[derive(Clone, Debug, Deserialize, Serialize, JsonSchema, PartialEq)]
#[serde(rename_all = "camelCase")]
pub struct Condition {
pub type_: String,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Use Kubernetes-compatible type key for conditions

The Condition struct stores the field as type_ but does not rename it to type, so status JSON will use type_ while the CRD print columns in this same file filter on .status.conditions[?(@.type=="Ready")]. That mismatch means the Ready condition is not discoverable via the configured JSONPath (and kubectl wait --for=condition=Ready semantics tied to type), so both EtcdCluster and EtcdTenant can appear not-ready even after successful reconciliation.

Useful? React with 👍 / 👎.

Comment on lines +311 to +313
if let Err(err) = auth.user_change_password(name, password).await {
warn!(name, error = %err, "failed to sync etcd user password");
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Fail reconcile when etcd password sync fails

This path only logs user_change_password failures and continues reconciliation, which later writes credentials and marks the tenant Ready. In basic-auth deployments (explicitly supported by this controller), that creates a split-brain credential state where Kubernetes Secrets contain a password that etcd never accepted, so tenant clients receive authentication failures until an operator manually repairs the user password.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acceptable for now. Username/password support is incomplete as we're focused on mTLS path.

@samcday samcday merged commit 1f824ea into main Apr 6, 2026
16 of 17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants