feat: add client caching to reduce OAuth token requests#64
feat: add client caching to reduce OAuth token requests#64irby wants to merge 7 commits intorelease-2.5from
Conversation
2.5.0: CA Bundle with ConfigMap + GKE Ambient Credentials Documentation Co-authored-by: Matthew H. Irby <irby@users.noreply.github.com>
Previously, every certificate request reconciliation created a new Command API client, which meant a new OAuth token was fetched for each request. For customers with OAuth provider quotas, this caused rate limiting issues. This change introduces a ClientCache that: - Caches Command API clients by configuration hash - Reuses cached clients across reconciliations for the same issuer - Allows the underlying oauth2 library's token caching to work as intended - Is thread-safe for concurrent reconciliations The cache key is a SHA-256 hash of all configuration fields that affect the client connection (hostname, API path, credentials, scopes, etc.), ensuring different issuers get different clients while the same issuer reuses its client. Fixes: OAuth token re-authentication on every request Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Matthew H. Irby <matt.irby@keyfactor.com>
Signed-off-by: Matthew H. Irby <matt.irby@keyfactor.com>
There was a problem hiding this comment.
Pull request overview
This PR adds OAuth client caching to avoid repeated re-authentication, extends CA bundle support to allow sourcing trust bundles from either Secrets or ConfigMaps (optionally selecting a specific key), and expands operational/docs support for e2e workflows and ambient identity providers.
Changes:
- Add a thread-safe in-memory Command client cache keyed by configuration hash and wire it into controllers.
- Add Issuer/ClusterIssuer fields for CA bundle ConfigMap support (
caBundleConfigMapName) and key selection (caBundleKey), plus RBAC/Helm/CRD updates. - Improve e2e scripts/docs and add new documentation for CA bundle management and GKE ambient credentials.
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/controller/issuer_controller_test.go | Adds unit tests for commandConfigFromIssuer CA bundle/auth parsing. |
| internal/controller/issuer_controller.go | Adds ConfigMap CA bundle retrieval + caBundleKey selection logic. |
| internal/command/client_cache_test.go | Adds tests for config hashing determinism and basic cache operations. |
| internal/command/client_cache.go | Introduces ClientCache and config hashing for cached signers/health checkers. |
| internal/command/client.go | Adds a timeout when requesting Azure tokens to fall back faster to other ambient sources. |
| go.sum | Bumps keyfactor-auth-client-go checksum entries to v1.3.1. |
| go.mod | Updates keyfactor-auth-client-go to v1.3.1 and adjusts Go version/toolchain directives. |
| e2e/run_tests.sh | Adds CA Secret/ConfigMap generation, new CA tests, and makes execution more configurable (context, image/chart overrides). |
| e2e/certs/.gitkeep | Adds placeholder directory for CA cert material used by e2e tests. |
| e2e/README.md | Updates e2e requirements/config/docs and documents new env vars and workflows. |
| e2e/.gitignore | Ignores local .env and generated cert contents while keeping .gitkeep. |
| e2e/.env.example | Adds DISABLE_CA_CHECK and normalizes env example formatting. |
| docsource/content.md | Updates installation docs and links to new CA bundle docs. |
| docs/ca-bundle/README.md | New CA bundle documentation (Secret/ConfigMap + trust-manager guidance). |
| docs/ambient-providers/google.md | New documentation for GKE workload identity ambient credentials. |
| deploy/charts/command-cert-manager-issuer/values.yaml | Adds Helm value for ConfigMap RBAC mode and env passthrough. |
| deploy/charts/command-cert-manager-issuer/templates/rolebinding.yaml | Adds RoleBinding/ClusterRoleBinding for ConfigMap read access. |
| deploy/charts/command-cert-manager-issuer/templates/role.yaml | Adds Role/ClusterRole for ConfigMap read access. |
| deploy/charts/command-cert-manager-issuer/templates/deployment.yaml | Adds CLI flag for configmap access scope and supports custom env vars. |
| deploy/charts/command-cert-manager-issuer/templates/crds/issuers.yaml | Adds CRD schema fields/descriptions for caBundleConfigMapName/caBundleKey. |
| deploy/charts/command-cert-manager-issuer/templates/crds/clusterissuers.yaml | Adds CRD schema fields/descriptions for caBundleConfigMapName/caBundleKey. |
| deploy/charts/command-cert-manager-issuer/README.md | Documents Helm values (updated table). |
| config/crd/bases/command-issuer.keyfactor.com_issuers.yaml | Updates generated CRD base with new CA bundle fields. |
| config/crd/bases/command-issuer.keyfactor.com_clusterissuers.yaml | Updates generated CRD base with new CA bundle fields. |
| cmd/main.go | Adds configmap access flag, cache scoping by object, and wires shared ClientCache into reconcilers. |
| api/v1alpha1/issuer_types.go | Adds IssuerSpec fields for caBundleConfigMapName and caBundleKey and updates comments. |
| README.md | Updates install docs, adds GKE doc link, and points CA bundle section to new docs. |
| Makefile | Changes test-e2e target to run the bash-based e2e script. |
| Dockerfile | Minor formatting change. |
| CHANGELOG.md | Adds v2.5.0 notes for new CA bundle features, caching, timeouts, docs updates. |
Comments suppressed due to low confidence (1)
e2e/run_tests.sh:649
- The
kubectl waitfailure check uses command substitution inside[[ ... ]], which checks the output string rather than the command exit code. Ifkubectl waitfails but prints an error message, this condition may not trigger and the script could continue as if the wait succeeded. Prefer checking the exit status directly (e.g.,if ! kubectl wait ...; then ...).
if [[ ! $(kubectl wait --for=condition=Ready certificaterequest/$CR_CR_NAME -n $ISSUER_NAMESPACE --timeout=70s) ]]; then
echo "⚠️ Certificate request did not become ready within the timeout period."
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
| // Include all fields that affect the client connection | ||
| h.Write([]byte(config.Hostname)) | ||
| h.Write([]byte(config.APIPath)) | ||
| h.Write(config.CaCertsBytes) |
| | `IMAGE_TAG` | Docker image version to test | `2.5.0` | | ||
| | `HELM_CHART_VERSION` | Helm chart version | `2.5.0` | |
| // Include all fields that affect the client connection | ||
| h.Write([]byte(config.Hostname)) | ||
| h.Write([]byte(config.APIPath)) | ||
| h.Write(config.CaCertsBytes) |
| # Configure e2e/.env with Command instance credentials before running | ||
| .PHONY: test-e2e | ||
| test-e2e: ## Run e2e tests against a Kubernetes cluster | ||
| cd e2e && source .env && ./run_tests.sh |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Adds a shared in-memory Command client cache so controllers can reuse OAuth token sources across reconciliations, reducing token endpoint traffic and avoiding re-authentication per request.
Changes:
- Introduces
internal/command/ClientCachekeyed by a configuration hash, and wires it into controller builders fromcmd/main.go. - Adds unit tests for cache/hash basics.
- Updates e2e tooling/docs to support running against the current kubeconfig context and allow env overrides.
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
internal/command/client_cache.go |
Implements a thread-safe cache for reusing signer/health-checker clients keyed by config hash. |
internal/command/client_cache_test.go |
Adds unit tests around config hashing and basic cache operations. |
cmd/main.go |
Creates a shared cache instance and injects cached builder funcs into reconcilers. |
go.mod |
Updates Go/module directives and bumps keyfactor-auth-client-go version. |
go.sum |
Records checksum updates for the dependency bump. |
e2e/run_tests.sh |
Allows environment overrides; supports running against current kubeconfig and optional minikube usage. |
e2e/README.md |
Updates e2e documentation to match new script behavior and configuration knobs. |
e2e/.gitignore |
Ignores .env used for e2e configuration. |
Makefile |
Updates test-e2e target to run the e2e script with .env. |
cluster-issuer.yaml |
Adds an example ClusterIssuer manifest. |
Dockerfile |
Minor whitespace-only change. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| certificateAuthorityLogicalName: Sub-CA | ||
| certificateTemplate: Server_tlsServerAuth-1y | ||
| commandSecretName: auth-secret | ||
| hostname: matthew-irby-1.kftestlab.com |
There was a problem hiding this comment.
This manifest appears to contain environment-specific values (e.g., a personal/testlab hostname and specific template/CA names). To keep the repo distributable and avoid leaking internal details, consider replacing these with clear placeholders (example.com, YOUR_CA_NAME, YOUR_TEMPLATE, etc.) and/or moving it under docs/examples with guidance.
| certificateAuthorityLogicalName: Sub-CA | |
| certificateTemplate: Server_tlsServerAuth-1y | |
| commandSecretName: auth-secret | |
| hostname: matthew-irby-1.kftestlab.com | |
| certificateAuthorityLogicalName: YOUR_CA_LOGICAL_NAME | |
| certificateTemplate: YOUR_CERTIFICATE_TEMPLATE_NAME | |
| commandSecretName: auth-secret | |
| hostname: your-keyfactor-hostname.example.com |
| - An enrollment pattern (default: "Default Pattern") with CSR Enrollment enabled | ||
| - A security role (default: "InstanceOwner") with Enrollment permissions | ||
|
|
||
| On the Command side: | ||
| - An enrollment pattern is created called "Test Enrollment Pattern" that is has CSR Enrollment, CSR Generation, and PFX Enrollment enabled |
There was a problem hiding this comment.
The Requirements section says the enrollment pattern default is "Default Pattern" (line 31), but a few lines later it states a pattern named "Test Enrollment Pattern" must exist (line 35). These conflict; please align the wording (and the script defaults) so readers know which pattern name is expected by default.
| - An enrollment pattern (default: "Default Pattern") with CSR Enrollment enabled | |
| - A security role (default: "InstanceOwner") with Enrollment permissions | |
| On the Command side: | |
| - An enrollment pattern is created called "Test Enrollment Pattern" that is has CSR Enrollment, CSR Generation, and PFX Enrollment enabled | |
| - An enrollment pattern (default: "Test Enrollment Pattern") with CSR Enrollment, CSR Generation, and PFX Enrollment enabled | |
| - A security role (default: "InstanceOwner") with Enrollment permissions | |
| On the Command side: | |
| - An enrollment pattern named "Test Enrollment Pattern" exists with CSR Enrollment, CSR Generation, and PFX Enrollment enabled |
| | `IMAGE_TAG` | Docker image version to test | `2.5.0` | | ||
| | `HELM_CHART_VERSION` | Helm chart version | `2.5.0` | |
There was a problem hiding this comment.
The Optional variables table documents default IMAGE_TAG/HELM_CHART_VERSION as 2.5.0, but run_tests.sh now defaults both to local via ${VAR:-local}. Update the README defaults (or change the script defaults) so the documented behavior matches what the e2e script actually does.
| | `IMAGE_TAG` | Docker image version to test | `2.5.0` | | |
| | `HELM_CHART_VERSION` | Helm chart version | `2.5.0` | | |
| | `IMAGE_TAG` | Docker image version to test | `local` | | |
| | `HELM_CHART_VERSION` | Helm chart version | `local` | |
| // Include all fields that affect the client connection | ||
| h.Write([]byte(config.Hostname)) | ||
| h.Write([]byte(config.APIPath)) | ||
| h.Write(config.CaCertsBytes) | ||
|
|
||
| if config.BasicAuth != nil { | ||
| h.Write([]byte("basic")) | ||
| h.Write([]byte(config.BasicAuth.Username)) | ||
| h.Write([]byte(config.BasicAuth.Password)) | ||
| } | ||
|
|
||
| if config.OAuth != nil { | ||
| h.Write([]byte("oauth")) | ||
| h.Write([]byte(config.OAuth.TokenURL)) | ||
| h.Write([]byte(config.OAuth.ClientID)) | ||
| h.Write([]byte(config.OAuth.ClientSecret)) | ||
| h.Write([]byte(config.OAuth.Audience)) | ||
| for _, scope := range config.OAuth.Scopes { | ||
| h.Write([]byte(scope)) | ||
| } |
There was a problem hiding this comment.
configHash builds the hash by concatenating raw field bytes without any delimiters/length-prefixing. This can produce real cache-key collisions (e.g., Hostname/APIPath boundary ambiguity, or scope list concatenation), which could cause the wrong cached client to be reused for a different config. Consider hashing a canonical, unambiguous encoding (e.g., length-prefixed writes, JSON with stable field order, or gob) so different tuples cannot collide.
| // Invalidate removes a cached client for the given config. | ||
| // This should be called when an issuer's credentials are updated. | ||
| func (c *ClientCache) Invalidate(config *Config) { | ||
| key := configHash(config) | ||
| c.mu.Lock() | ||
| defer c.mu.Unlock() | ||
| delete(c.clients, key) | ||
| } | ||
|
|
||
| // InvalidateAll removes all cached clients. | ||
| // This can be used during shutdown or when a global credential refresh is needed. | ||
| func (c *ClientCache) InvalidateAll() { | ||
| c.mu.Lock() | ||
| defer c.mu.Unlock() | ||
| c.clients = make(map[string]*cachedClient) | ||
| } |
There was a problem hiding this comment.
The cache has no eviction strategy and Invalidate/InvalidateAll are not called anywhere in the codebase. With credential rotation (new secret contents => new hash), the cache will grow unbounded over time and retain old clients indefinitely. Either wire invalidation to the issuer/secret update path, or add a bounded cache (size/TTL) to prevent memory growth in long-running controllers.
| func TestClientCache_BasicOperations(t *testing.T) { | ||
| cache := NewClientCache() | ||
|
|
||
| // Initial size should be 0 | ||
| if cache.Size() != 0 { | ||
| t.Errorf("expected empty cache, got size %d", cache.Size()) | ||
| } | ||
|
|
||
| // After invalidating a non-existent config, size should still be 0 | ||
| cache.Invalidate(&Config{Hostname: "test.example.com"}) | ||
| if cache.Size() != 0 { | ||
| t.Errorf("expected empty cache after invalidating non-existent, got size %d", cache.Size()) | ||
| } | ||
|
|
||
| // InvalidateAll on empty cache should work | ||
| cache.InvalidateAll() | ||
| if cache.Size() != 0 { | ||
| t.Errorf("expected empty cache after InvalidateAll, got size %d", cache.Size()) | ||
| } | ||
| } |
There was a problem hiding this comment.
The ClientCache tests only exercise Size, Invalidate, and InvalidateAll on an empty cache, but don’t verify the core behavior this PR relies on (same config returns the same cached signer/health checker; different configs create different entries; concurrent calls only create once). Adding unit tests that cover GetOrCreateSigner/GetOrCreateHealthChecker would better protect against regressions in the caching logic.
| module github.com/Keyfactor/command-cert-manager-issuer | ||
|
|
||
| go 1.24 | ||
|
|
||
| toolchain go1.24.0 | ||
| go 1.24.0 | ||
|
|
||
| require ( | ||
| github.com/Keyfactor/keyfactor-auth-client-go v1.3.0 | ||
| github.com/Keyfactor/keyfactor-auth-client-go v1.3.1 | ||
| github.com/Keyfactor/keyfactor-go-client-sdk/v25 v25.0.2 |
There was a problem hiding this comment.
go 1.24.0 is not a valid go directive format in go.mod (the directive is the language version and is typically major.minor, e.g. go 1.24). As written, this can cause go mod/build to fail. If the intent was to pin the toolchain patch version, use the toolchain go1.24.0 directive instead and keep go 1.24.
See #63