Implemented an automated cleanup of old and unused images#5
Implemented an automated cleanup of old and unused images#5valeryia-hurynovich merged 15 commits intomasterfrom
Conversation
6edd645 to
3aad482
Compare
3aad482 to
4c020fa
Compare
|
I see the "old" part, but where do you filter out that currently referenced images are not being deleted? |
Indeed, I missed this part, will add this part of functionality |
|
Thanks for the PR! I think we need to revisit the referenced image protection logic. This controller is responsible for creating Since we have access to Garden API where Wdyt? |
0c8585e to
0ce9d65
Compare
e6c127a to
9e9a1ee
Compare
9e9a1ee to
4f21398
Compare
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughBumps Go versions in CI and Docker; adds per-image GarbageCollection config and GC implementation with a pluggable OCIFactory; records image creation timestamps; adds Shoot CRD; updates deepcopy, CRD schemas, tests, and workflows. Changes
Sequence Diagram(s)sequenceDiagram
participant Reconciler as ManagedCloudProfile<br/>Reconciler
participant OCI as OCI Source
participant K8s as Kubernetes API
participant CP as CloudProfile Storage
Reconciler->>OCI: OCIFactory(params, insecure) → initialize (if GC enabled)
Reconciler->>OCI: List available image versions (include CreatedAt)
OCI-->>Reconciler: Return versions list
Reconciler->>K8s: Query referenced versions from CloudProfile & Shoots
K8s-->>Reconciler: Return referenced versions
Reconciler->>Reconciler: Compute deletable = versions NOT referenced AND older than MaxAge
loop For each deletable version
Reconciler->>CP: Remove version from CloudProfile and providerConfig
CP-->>Reconciler: Confirm / report errors
end
alt All deletions successful
Reconciler->>CP: Patch status/conditions (success)
else Any error
Reconciler->>CP: Patch status/conditions (failure reason)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Comment Tip Migrating from UI to YAML configuration.Use the |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
controllers/managedcloudprofile_controller_test.go (1)
311-394: Misleading test name: test verifies ProviderConfig, not Shoot worker pools.The test name states "preserves old machine image versions referenced by Shoot worker pools," but the test actually verifies preservation based on
ProviderConfig.MachineImagesin the CloudProfile—no Shoot resource is created. This is different from the subsequent test at lines 396-487 which correctly creates a Shoot.Consider renaming to clarify the actual behavior being tested:
📝 Suggested rename
- It("preserves old machine image versions referenced by Shoot worker pools", func(ctx SpecContext) { + It("preserves old machine image versions referenced in ProviderConfig", func(ctx SpecContext) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controllers/managedcloudprofile_controller_test.go` around lines 311 - 394, The test's It description ("preserves old machine image versions referenced by Shoot worker pools") is misleading because the test verifies preservation via CloudProfile.Spec.ProviderConfig.MachineImages (and not any Shoot or worker pools); update the It string to accurately describe the behavior (e.g., "preserves old machine image versions referenced by CloudProfile ProviderConfig/MachineImages") so readers can find the test intent next to the code paths exercised (look for the Ginkgo It block with the current description and references to CloudProfile.Spec.ProviderConfig, ProviderConfig.MachineImages, and ManagedCloudProfile/MachineImageUpdates).controllers/managedcloudprofile_controller.go (1)
269-285: Consider filtering Shoots server-side for efficiency.This lists all Shoots across all namespaces and filters client-side. For clusters with many Shoots, this could be inefficient during each reconciliation.
If a field index exists for
spec.cloudProfile.name, consider usingclient.MatchingFields. Alternatively, since GC runs periodically (5-minute requeue), this may be acceptable overhead.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controllers/managedcloudprofile_controller.go` around lines 269 - 285, The current code lists all Shoots into shootList via r.List(ctx, shootList, client.InNamespace(metav1.NamespaceAll)) and then filters client-side by shoot.Spec.CloudProfile.Name (cloudProfileName), which is inefficient; change the List call to perform server-side filtering using client.MatchingFields (or client.MatchingField) for the field key used by the field index for spec.cloudProfile.name so only matching Shoots are returned, and ensure a field index exists for that key (or add one where indexes are registered) before using it; keep the subsequent loop that checks worker.Machine.Image.Name (imageName) and populates referenced unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@controllers/managedcloudprofile_controller.go`:
- Around line 171-180: The status patch message is a copy-paste from a different
error: in the block that handles failures from deleteVersion you must change the
condition Message (and align the returned error text) to reference the deletion
failure instead of "failed to list source versions for garbage collection";
update the fmt.Sprintf message passed to patchStatusAndCondition (for
CloudProfileAppliedConditionType on the ManagedCloudProfile/mcp) to something
like "failed to delete image versions: %s" and ensure the final returned
fmt.Errorf also uses the same deletion-oriented wording so messages from
patchStatusAndCondition and the returned error consistently reflect
deleteVersion failures.
- Around line 214-241: The code currently ignores errors from json.Unmarshal and
json.Marshal which can leave cp.Spec.MachineImages and
cp.Spec.ProviderConfig.Raw inconsistent; update the block around
json.Unmarshal(json.Unmarshal(...)) and json.Marshal(json.Marshal(cfg)) to
handle errors: if json.Unmarshal fails, log the error (including context:
imageName and cp.Name) and return the error from the reconciler (or otherwise
abort the update) so no partial changes occur; after building cfg and
marshaling, if json.Marshal fails, log and return the error instead of silently
skipping the update, and only assign cp.Spec.ProviderConfig.Raw = raw when
marshal succeeded; reference the symbols cp.Spec.ProviderConfig, json.Unmarshal,
json.Marshal, cfg, cfg.MachineImages, and cp.Spec.ProviderConfig.Raw when making
these changes.
- Around line 246-288: The function getReferencedVersions currently swallows
errors from r.Get and r.List which can lead to accidental deletion of in-use
images; change its signature to return (map[string]bool, error), propagate and
return any errors from r.Get and r.List (including json.Unmarshal failures)
instead of ignoring them, and update the caller that invokes
getReferencedVersions (the reconciler code that previously did
referencedVersions := r.getReferencedVersions(...)) to handle the error (e.g.,
log and skip GC on error). Ensure to update all call sites to check the returned
error and retain existing behavior only on nil error.
---
Nitpick comments:
In `@controllers/managedcloudprofile_controller_test.go`:
- Around line 311-394: The test's It description ("preserves old machine image
versions referenced by Shoot worker pools") is misleading because the test
verifies preservation via CloudProfile.Spec.ProviderConfig.MachineImages (and
not any Shoot or worker pools); update the It string to accurately describe the
behavior (e.g., "preserves old machine image versions referenced by CloudProfile
ProviderConfig/MachineImages") so readers can find the test intent next to the
code paths exercised (look for the Ginkgo It block with the current description
and references to CloudProfile.Spec.ProviderConfig,
ProviderConfig.MachineImages, and ManagedCloudProfile/MachineImageUpdates).
In `@controllers/managedcloudprofile_controller.go`:
- Around line 269-285: The current code lists all Shoots into shootList via
r.List(ctx, shootList, client.InNamespace(metav1.NamespaceAll)) and then filters
client-side by shoot.Spec.CloudProfile.Name (cloudProfileName), which is
inefficient; change the List call to perform server-side filtering using
client.MatchingFields (or client.MatchingField) for the field key used by the
field index for spec.cloudProfile.name so only matching Shoots are returned, and
ensure a field index exists for that key (or add one where indexes are
registered) before using it; keep the subsequent loop that checks
worker.Machine.Image.Name (imageName) and populates referenced unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 79d0a92a-6377-4ba1-866d-0c75104a070a
📒 Files selected for processing (12)
.github/workflows/checks.yaml.github/workflows/ci.yaml.github/workflows/codeql.yamlDockerfileapi/v1alpha1/managedcloudprofile.goapi/v1alpha1/zz_generated.deepcopy.gocloudprofilesync/source.gocontrollers/managedcloudprofile_controller.gocontrollers/managedcloudprofile_controller_test.gocontrollers/suite_test.gocrd/cloudprofilesync.cobaltcore.dev_managedcloudprofiles.yamlcrd/core.gardener.cloud_shoots.yaml
6f5275a to
e5daea6
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
crd/core.gardener.cloud_shoots.yaml (2)
68-70: Consider enabling the status subresource.The status is defined but no subresource is enabled. If the controller needs to update Shoot status independently of spec changes (standard Kubernetes pattern), add a
subresourcessection.🔧 Suggested addition after schema block
status: description: Status contains the current status of the Shoot. type: object + subresources: + status: {}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crd/core.gardener.cloud_shoots.yaml` around lines 68 - 70, The CRD defines the status object for the Shoot but does not enable the status subresource; add a top-level subresources block to the CRD enabling status so controllers can update Shoot.status independently of spec (i.e., add a subresources: { status: {} } entry adjacent to the schema definition). Ensure the block is added at the same CRD root level where the schema and metadata/kind for Shoot are defined so the Kubernetes API server exposes the status subresource for updates.
33-67: Consider addingrequiredfield validations for critical properties.The spec schema lacks
requiredconstraints. Fields likecloudProfile.nameandprovider.workers[].machine.image.name/versionare likely mandatory for GC logic to function correctly. Without validation, malformed Shoot resources could cause nil pointer issues or silent failures during garbage collection.🔧 Suggested schema enhancement
spec: description: Spec defines the desired state of the Shoot. type: object + required: + - cloudProfile properties: cloudProfile: description: Reference to the CloudProfile used by the Shoot. type: object + required: + - name properties: name: description: Name of the CloudProfile. type: string🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crd/core.gardener.cloud_shoots.yaml` around lines 33 - 67, Add JSON Schema "required" constraints to the Shoot CRD spec to enforce mandatory fields; specifically mark spec, spec.cloudProfile, spec.cloudProfile.name, spec.provider, spec.provider.workers (the array), and within each worker item require machine and machine.image and then require image.name and image.version (use items.required on the workers array and required arrays inside the machine/image object schemas). Ensure the required arrays are added alongside the existing properties for spec, cloudProfile, provider, workers, machine, and image so the API server rejects resources missing these critical fields (refer to the symbols spec, cloudProfile.name, provider.workers[].machine.image.name and provider.workers[].machine.image.version).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@controllers/managedcloudprofile_controller_test.go`:
- Around line 311-394: The test "preserves old machine image versions referenced
by Shoot worker pools" (in managedcloudprofile_controller_test.go) does not
create a Shoot and thus only preserves 1.0.0 because it exists in
ProviderConfig; either rename the It description to reflect ProviderConfig-based
preservation (change the test name string) or add a Shoot resource that
references image version "1.0.0" (create a gardenerv1beta1.Shoot with a worker
pool using machineImage name "preserve-image" and version "1.0.0") before
creating the ManagedCloudProfile (mcp) so the reconciliation logic that inspects
Shoot worker pools (similar to the working test "preserves machine image
versions referenced by Shoot workers") will actually be exercised; update
cleanup to delete the Shoot as well.
- Around line 36-80: The AfterEach cleanup is missing deletion of created
Shoots, which can leak between tests; update the AfterEach (where k8sClient is
used) to list gardenv1beta1.ShootList, iterate over items and call
k8sClient.Delete(ctx, &shoot) for each Shoot, and add an Eventually block that
lists gardenv1beta1.ShootList and asserts its length is 0 to ensure asynchronous
deletion has completed.
---
Nitpick comments:
In `@crd/core.gardener.cloud_shoots.yaml`:
- Around line 68-70: The CRD defines the status object for the Shoot but does
not enable the status subresource; add a top-level subresources block to the CRD
enabling status so controllers can update Shoot.status independently of spec
(i.e., add a subresources: { status: {} } entry adjacent to the schema
definition). Ensure the block is added at the same CRD root level where the
schema and metadata/kind for Shoot are defined so the Kubernetes API server
exposes the status subresource for updates.
- Around line 33-67: Add JSON Schema "required" constraints to the Shoot CRD
spec to enforce mandatory fields; specifically mark spec, spec.cloudProfile,
spec.cloudProfile.name, spec.provider, spec.provider.workers (the array), and
within each worker item require machine and machine.image and then require
image.name and image.version (use items.required on the workers array and
required arrays inside the machine/image object schemas). Ensure the required
arrays are added alongside the existing properties for spec, cloudProfile,
provider, workers, machine, and image so the API server rejects resources
missing these critical fields (refer to the symbols spec, cloudProfile.name,
provider.workers[].machine.image.name and
provider.workers[].machine.image.version).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b2d20438-0c82-4c54-a759-0315b1fada47
📒 Files selected for processing (3)
controllers/managedcloudprofile_controller_test.gocrd/cloudprofilesync.cobaltcore.dev_managedcloudprofiles.yamlcrd/core.gardener.cloud_shoots.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- crd/cloudprofilesync.cobaltcore.dev_managedcloudprofiles.yaml
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@controllers/managedcloudprofile_controller.go`:
- Around line 267-283: getReferencedVersions currently treats every image listed
in cp.Spec.ProviderConfig (unmarshalled into providercfg.CloudProfileConfig) as
a referenced version, creating a circular reference with updateMachineImages and
preventing GC; remove or disable the ProviderConfig-based loop in
getReferencedVersions (the block starting with "if cp.Spec.ProviderConfig != nil
{ ... }") so only Shoot worker pool checks (the code that iterates Shoots and
worker pools) mark versions as referenced, ensuring newly-synced ProviderConfig
entries don't permanently block garbage collection.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2b80b9ad-b29e-4648-a4ec-c0eb20d091a4
📒 Files selected for processing (1)
controllers/managedcloudprofile_controller.go
8cb8455 to
1e38ff4
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
controllers/managedcloudprofile_controller_test.go (1)
36-80:⚠️ Potential issue | 🟡 MinorMissing Shoot cleanup in AfterEach causes potential test pollution.
Tests at lines 357, 446 create Shoot resources but the
AfterEachblock doesn't clean them up. If a test fails before reaching its manual cleanup, Shoots will persist and could affect subsequent tests.,
Add Shoot cleanup to AfterEach
Eventually(func(g Gomega) int { var updated corev1.SecretList g.Expect(k8sClient.List(ctx, &updated)).To(Succeed()) count := 0 for _, sec := range updated.Items { if sec.Namespace == metav1.NamespaceDefault && sec.Name == "oci" { count++ } } return count }).Should(Equal(0)) + + var shootList gardenerv1beta1.ShootList + Expect(k8sClient.List(ctx, &shootList)).To(Succeed()) + for _, shoot := range shootList.Items { + Expect(k8sClient.Delete(ctx, &shoot)).To(Succeed()) + } + + Eventually(func(g Gomega) int { + var updated gardenerv1beta1.ShootList + g.Expect(k8sClient.List(ctx, &updated)).To(Succeed()) + return len(updated.Items) + }).Should(Equal(0)) })🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controllers/managedcloudprofile_controller_test.go` around lines 36 - 80, The AfterEach cleanup is missing deletion of Shoot resources; update the AfterEach block to list and delete all Shoot objects created by tests using the k8sClient (e.g., declare a gardenv1beta1.ShootList named shootList, call Expect(k8sClient.List(ctx, &shootList)).To(Succeed()), iterate and Expect(k8sClient.Delete(ctx, &shoot)) for each item), and add an Eventually assertion (similar to the ones for ManagedCloudProfile, CloudProfile and Secret) that waits until len(updatedShootList.Items) == 0 to avoid test pollution; reference the AfterEach function, k8sClient.List/Delete calls, and the new shootList/updatedShootList variables when implementing.
🧹 Nitpick comments (1)
controllers/managedcloudprofile_controller.go (1)
165-168: Consider logging when CreatedAt is missing.Versions with a zero
CreatedAtare silently skipped and will never be garbage collected. While this is a safe default, it could lead to accumulated orphan versions without operators knowing why. A debug-level log would aid troubleshooting.Suggested improvement
for _, v := range versions { if v.CreatedAt.IsZero() { + log.V(1).Info("skipping version without creation timestamp during GC", "image", updates.ImageName, "version", v.Version) continue }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controllers/managedcloudprofile_controller.go` around lines 165 - 168, The loop that skips versions when v.CreatedAt.IsZero() (the for _, v := range versions loop) silently discards those entries; add a debug-level log inside that branch to record which version was skipped (use identifiable fields on v such as v.Name or v.ID) and why (e.g., "skipping version with zero CreatedAt"), using the controller's logger (e.g., r.Log.Debug or the existing logger variable) so operators can discover orphaned versions during troubleshooting.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@controllers/managedcloudprofile_controller_test.go`:
- Around line 590-634: Rename or adjust the test so its name and setup match
what it actually asserts: the current test overrides controllers.OCIFactory
causing updateMachineImages (controller.updateMachineImages) to fail, yielding
Condition Type controllers.CloudProfileAppliedConditionType with Reason
"ApplyFailed" and message "failed to retrieve images version"; either rename the
It(...) description to reflect OCI source/listing failures (e.g., "reports
failure when OCI source listing errors occur") or change the setup so
updateMachineImages does not use the fakeSource (remove/configure
MachineImageUpdates or OCIFactory) and instead inject a failure that triggers
the garbage collection path to assert Reason "GarbageCollectionFailed" for
GC-specific behavior.
In `@controllers/managedcloudprofile_controller.go`:
- Around line 107-110: The getCredential failure path in the GC flow
(r.getCredential called with updates.Source.OCI.Password) returns the error
without updating the ManagedCloudProfile status; mirror the other GC error paths
by setting a descriptive condition on the MCP status (e.g.,
"CredentialRetrievalFailed" with the error message) and patching the status
before returning. Locate the GC handler where r.getCredential is invoked, create
the same style of status condition used elsewhere in the file, call the
controller's status patch method (the same helper used for OCI factory/version
errors), include the error text in the condition, then return the error as
before.
---
Duplicate comments:
In `@controllers/managedcloudprofile_controller_test.go`:
- Around line 36-80: The AfterEach cleanup is missing deletion of Shoot
resources; update the AfterEach block to list and delete all Shoot objects
created by tests using the k8sClient (e.g., declare a gardenv1beta1.ShootList
named shootList, call Expect(k8sClient.List(ctx, &shootList)).To(Succeed()),
iterate and Expect(k8sClient.Delete(ctx, &shoot)) for each item), and add an
Eventually assertion (similar to the ones for ManagedCloudProfile, CloudProfile
and Secret) that waits until len(updatedShootList.Items) == 0 to avoid test
pollution; reference the AfterEach function, k8sClient.List/Delete calls, and
the new shootList/updatedShootList variables when implementing.
---
Nitpick comments:
In `@controllers/managedcloudprofile_controller.go`:
- Around line 165-168: The loop that skips versions when v.CreatedAt.IsZero()
(the for _, v := range versions loop) silently discards those entries; add a
debug-level log inside that branch to record which version was skipped (use
identifiable fields on v such as v.Name or v.ID) and why (e.g., "skipping
version with zero CreatedAt"), using the controller's logger (e.g., r.Log.Debug
or the existing logger variable) so operators can discover orphaned versions
during troubleshooting.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 47fde647-7fe4-423a-bb9c-2f37ff400dcf
📒 Files selected for processing (2)
controllers/managedcloudprofile_controller.gocontrollers/managedcloudprofile_controller_test.go
1e38ff4 to
1c41e88
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
controllers/managedcloudprofile_controller.go (1)
158-159: Avoid listing all Shoots once per image update.
getReferencedVersionsdoes a full Shoot list each invocation; called inside the machine-image loop this scales as O(imageUpdates × shoots). Consider listing once per reconcile and reusing an index keyed by(cloudProfileName, imageName).Also applies to: 272-275
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controllers/managedcloudprofile_controller.go` around lines 158 - 159, getReferencedVersions is called repeatedly inside the machine-image update loop and triggers a full Shoot list each time, causing O(images × shoots) work; instead, list Shoots once per reconcile and build an in-memory index keyed by (cloudProfileName,imageName) that maps to referenced versions. Modify the reconcile so you perform a single Shoot list at the start (or at the top of the reconcile function), construct a map (e.g., referencedVersionsIndex[cloudProfileName+"|"+imageName] = []versions) by iterating Shoots and extracting image references, and then replace direct calls to getReferencedVersions(ctx, mcp.Name, updates.ImageName) (and the similar call at lines ~272-275) with lookups into that index; keep getReferencedVersions logic for on-demand use or refactor it to accept a precomputed Shoot list to preserve behavior if needed.controllers/managedcloudprofile_controller_test.go (1)
511-553: Clarify apply-vs-GC intent in these GC-labeled failure tests.Both tests are named as GC failures, but setup can fail in the apply phase first. Consider either renaming to apply-path failure behavior or asserting explicit
Reasonto lock intent.Also applies to: 555-593
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@controllers/managedcloudprofile_controller_test.go` around lines 511 - 553, Test name and assertions are ambiguous about whether the failure is in apply vs garbage-collection; either rename the It block (e.g., change "handles missing credential for GC OCI source" to "handles missing credential during apply for GC OCI source") or add an explicit assertion on the condition Reason to lock the intent (locate the ManagedCloudProfile test case using the It block and assert on mcp.Status.Conditions entries that the CloudProfileAppliedConditionType has a specific Reason string emitted by the controller, e.g., add Expect(...).To(HaveField("Reason", Equal("<controller-reason-for-apply-failure>"))) so the test documents whether it is an apply-path failure or a GC-path failure).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@controllers/managedcloudprofile_controller_test.go`:
- Around line 49-53: Replace the current immediate deletions of Shoot objects
(using k8sClient.Delete and iterating over shootList.Items) with a deletion loop
followed by an assertion that cleanup completed: after calling k8sClient.Delete
for each shoot, use Eventually with a short polling interval to re-list into
gardenerv1beta1.ShootList via k8sClient.List(ctx, &shootList) and
Expect(len(shootList.Items)).To(BeZero()) (or To(Equal(0))); reference the
existing k8sClient, ctx, gardenerv1beta1.ShootList, Eventually and Expect
symbols when adding this check so the test waits until all Shoots are Gone and
prevents cross-test pollution.
In `@controllers/managedcloudprofile_controller.go`:
- Around line 172-183: Before computing cutoff, validate that
updates.GarbageCollection.MaxAge.Duration is non-negative; if it's negative,
avoid treating it as a valid retention period (either clamp it to zero or skip
garbage collection) so you don't compute a future cutoff and delete unreferenced
versions incorrectly. Update the block around cutoff :=
time.Now().Add(-updates.GarbageCollection.MaxAge.Duration) (referencing cutoff,
updates.GarbageCollection.MaxAge.Duration, versions, versionsToDelete,
referencedVersions, and v.CreatedAt) to check Duration >= 0, handle/log the
invalid value, and only build versionsToDelete when the MaxAge is valid.
---
Nitpick comments:
In `@controllers/managedcloudprofile_controller_test.go`:
- Around line 511-553: Test name and assertions are ambiguous about whether the
failure is in apply vs garbage-collection; either rename the It block (e.g.,
change "handles missing credential for GC OCI source" to "handles missing
credential during apply for GC OCI source") or add an explicit assertion on the
condition Reason to lock the intent (locate the ManagedCloudProfile test case
using the It block and assert on mcp.Status.Conditions entries that the
CloudProfileAppliedConditionType has a specific Reason string emitted by the
controller, e.g., add Expect(...).To(HaveField("Reason",
Equal("<controller-reason-for-apply-failure>"))) so the test documents whether
it is an apply-path failure or a GC-path failure).
In `@controllers/managedcloudprofile_controller.go`:
- Around line 158-159: getReferencedVersions is called repeatedly inside the
machine-image update loop and triggers a full Shoot list each time, causing
O(images × shoots) work; instead, list Shoots once per reconcile and build an
in-memory index keyed by (cloudProfileName,imageName) that maps to referenced
versions. Modify the reconcile so you perform a single Shoot list at the start
(or at the top of the reconcile function), construct a map (e.g.,
referencedVersionsIndex[cloudProfileName+"|"+imageName] = []versions) by
iterating Shoots and extracting image references, and then replace direct calls
to getReferencedVersions(ctx, mcp.Name, updates.ImageName) (and the similar call
at lines ~272-275) with lookups into that index; keep getReferencedVersions
logic for on-demand use or refactor it to accept a precomputed Shoot list to
preserve behavior if needed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: bf245499-da6b-4980-8252-408e60a300de
📒 Files selected for processing (2)
controllers/managedcloudprofile_controller.gocontrollers/managedcloudprofile_controller_test.go
1c41e88 to
132d34e
Compare
| errs := make([]error, 0) | ||
| for _, updates := range mcp.Spec.MachineImageUpdates { | ||
| // only call updater when an explicit source is provided | ||
| if updates.Source.OCI == nil { |
There was a problem hiding this comment.
previously on not having an updates.Source.OCI we were adding errors, now this will differ the behaviour, do we want it this way ?
There was a problem hiding this comment.
I think in this case we don't need a error handling since it means that particular MachineImageUpdate doesn’t have an OCI source configured, so there’s nothing for the garbage-collector to do for it and we move on to the next update
There was a problem hiding this comment.
I was referring not only to the garbage collection functionality, but I see thats not part of the revised code, so all good now 👍🏻
|
overall looks very good, good job 👌🏻 |
| errs := make([]error, 0) | ||
| for _, updates := range mcp.Spec.MachineImageUpdates { | ||
| // only call updater when an explicit source is provided | ||
| if updates.Source.OCI == nil { |
There was a problem hiding this comment.
I was referring not only to the garbage collection functionality, but I see thats not part of the revised code, so all good now 👍🏻
| Parallel: 1, | ||
| }, updates.Source.OCI.Insecure) | ||
| if err != nil { | ||
| return ctrl.Result{}, r.failWithStatusUpdate(ctx, mcp, fmt.Sprintf("failed to initialize OCI source for garbage collection: %s", err), fmt.Errorf("failed to initialize OCI source for garbage collection: %w", err)) |
There was a problem hiding this comment.
looks very good, only thing that is bothering me is this message/error string duplication. if you figure out a way to not have that would be great. but is also fine if you want to keep it, your decision 👍🏻
There was a problem hiding this comment.
also small nitpick returning Reconcile's ctrl.Result{} should be a job for the main Reconcile func.
|
|
||
| if len(versionsToDelete) > 0 { | ||
| if err := r.deleteVersion(ctx, mcp.Name, updates.ImageName, versionsToDelete); err != nil { | ||
| if apierrors.IsInvalid(err) { |
There was a problem hiding this comment.
ah yes, you are right. Was confused by the name of the function, though we are deleting some API, but we are really filtering out stuff and update/patch 👍🏻 perhaps name could be something more precise but don't have an exact example.
Merging this branch changes the coverage (2 decrease, 1 increase)
Coverage by fileChanged files (no unit tests)
Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code. Changed unit test files
|
Related to this ticket: https://github.wdf.sap.corp/cc/unified-kubernetes/issues/1063
Here is a logic of following points:
Summary by CodeRabbit
New Features
Tests
Chores