Skip to content

Propagate more errors#267

Open
fwiesel wants to merge 1 commit intomainfrom
more-no-swallow-fixes
Open

Propagate more errors#267
fwiesel wants to merge 1 commit intomainfrom
more-no-swallow-fixes

Conversation

@fwiesel
Copy link
Contributor

@fwiesel fwiesel commented Mar 13, 2026

Both in the traits controller as well as in the placement-api calls, some errors were swallowed.

Summary by CodeRabbit

  • Bug Fixes

    • Preserve and surface multiple errors together during status updates.
    • Ensure cleanup halts and reports failures when deletion of consumer allocations fails.
  • Tests

    • Added tests for error handling when trait retrieval fails.
    • Added tests for handling failures during consumer allocation deletion.

@fwiesel fwiesel requested a review from Copilot March 13, 2026 15:04
@coderabbitai
Copy link

coderabbitai bot commented Mar 13, 2026

📝 Walkthrough

Walkthrough

Propagate and preserve existing errors during status patching in TraitsController; surface and return errors from consumer-allocation deletion in CleanupResourceProvider. Tests added to assert both failure paths.

Changes

Cohort / File(s) Summary
TraitsController
internal/controller/traits_controller.go, internal/controller/traits_controller_test.go
Preserve original reconcile error when status patching fails by joining errors (errors.Join(err, tc.Status().Patch(...))). Added test that simulates GetTraits returning 500 and asserts reconcile returns an error and no PUT is invoked.
Placement / Cleanup
internal/openstack/placement.go, internal/openstack/placement_test.go
CleanupResourceProvider now checks and returns the error from deleteConsumerAllocations (via .Err) instead of discarding it. Added test that simulates consumer-allocation DELETE failure and asserts cleanup returns an error.

Sequence Diagram(s)

(Skipped — changes are focused error propagation and tests, not a new multi-component control flow that requires diagramming.)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • Aggregates: Do not swallow error #266: Related change to error composition in internal/controller/traits_controller.go, preserving the original reconcile error and joining with status patch errors.

Suggested reviewers

  • mchristianl

Poem

🐰 I nudge the logs at dawn,

I stitch two errors, not one withdrawn.
When cleanup trips on a thistle thorn,
I shout it out so bugs are shorn.
Hops and fixes — code reborn.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title "Propagate more errors" accurately summarizes the main objective of the changeset, which focuses on fixing error handling to propagate errors instead of silently swallowing them in multiple locations.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch more-no-swallow-fixes
📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

CodeRabbit can enforce grammar and style rules using `languagetool`.

Configure the reviews.tools.languagetool setting to enable/disable rules and categories. Refer to the LanguageTool Community to learn more.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes two cases where real errors were being dropped during cleanup/reconcile flows, ensuring callers get the underlying failure instead of a nil error.

Changes:

  • Propagate consumer allocation deletion errors in CleanupResourceProvider instead of discarding them.
  • Preserve the original GetTraits error when also patching status conditions in TraitsController.Reconcile.
  • Add regression tests covering both previously-swallowed error paths.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
internal/openstack/placement_test.go Adds a regression test ensuring allocation-delete failures are returned.
internal/openstack/placement.go Returns early when consumer allocation deletion fails (no longer swallows error).
internal/controller/traits_controller_test.go Adds a regression test ensuring GetTraits errors are not lost after status patching.
internal/controller/traits_controller.go Joins the original error with the status patch error instead of overwriting it.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
internal/openstack/placement_test.go (1)

323-327: Strengthen this test by asserting provider deletion is never reached.

Using a failing handler here would lock in the early-return behavior on consumer deletion errors.

🧪 Proposed test hardening
-				// The provider delete succeeds, so that it cannot accidentally surface
-				// the error for us — only the consumer allocation delete error should be returned.
-				fakeServer.Mux.HandleFunc(deleteProviderURL, func(w http.ResponseWriter, r *http.Request) {
-					w.WriteHeader(http.StatusNoContent)
-				})
+				// Provider delete should not be called when consumer allocation deletion fails.
+				fakeServer.Mux.HandleFunc(deleteProviderURL, func(w http.ResponseWriter, r *http.Request) {
+					defer GinkgoRecover()
+					Fail("should not be called")
+				})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/openstack/placement_test.go` around lines 323 - 327, The
provider-delete handler in the test currently returns 204 which lets the
provider path succeed and doesn't assert it was never called; change the
fakeServer.Mux.HandleFunc for deleteProviderURL so that if that handler is
invoked it fails the test (e.g., call t.Fatalf or t.Error with a clear message)
to assert provider deletion is never reached, and ensure the closure captures
the test *t* properly so the failure is reported.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/controller/traits_controller_test.go`:
- Around line 297-299: The comment describing a past bug (which says "err is
reassigned to errors.Join(patchErr) instead of errors.Join(originalErr,
patchErr)...") is now outdated and misleading; remove or replace that comment in
the test so it no longer documents pre-fix behavior — locate the comment
referencing errors.Join, patchErr, and GetTraits in the tests (in
traits_controller_test.go) and delete the bug-note or update it to a short,
accurate explanation of the current behavior.

In `@internal/openstack/placement_test.go`:
- Around line 337-339: Remove the outdated bug comment that mentions
deleteConsumerAllocations' return value being discarded in
internal/openstack/placement_test.go; the test and code have been updated so
that error handling is correct, so simply delete the three-line comment
referencing the bug to avoid misleading future readers and keep comments in sync
with the current behavior of deleteConsumerAllocations.

---

Nitpick comments:
In `@internal/openstack/placement_test.go`:
- Around line 323-327: The provider-delete handler in the test currently returns
204 which lets the provider path succeed and doesn't assert it was never called;
change the fakeServer.Mux.HandleFunc for deleteProviderURL so that if that
handler is invoked it fails the test (e.g., call t.Fatalf or t.Error with a
clear message) to assert provider deletion is never reached, and ensure the
closure captures the test *t* properly so the failure is reported.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f1006a62-4195-4f20-b6bc-8b60a85a42a9

📥 Commits

Reviewing files that changed from the base of the PR and between 05f22f6 and 0368bf3.

📒 Files selected for processing (4)
  • internal/controller/traits_controller.go
  • internal/controller/traits_controller_test.go
  • internal/openstack/placement.go
  • internal/openstack/placement_test.go

@fwiesel fwiesel force-pushed the more-no-swallow-fixes branch from 0368bf3 to 7a206a3 Compare March 13, 2026 15:19
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
internal/openstack/placement.go (1)

154-156: Include consumerID in propagated delete errors for diagnosability.

At Line 155, returning the raw error makes it harder to identify which consumer failed when multiple allocations are processed. Consider wrapping with context.

Suggested patch
-		if err := deleteConsumerAllocations(ctx, client, consumerID).Err; err != nil {
-			return err
+		if err := deleteConsumerAllocations(ctx, client, consumerID).Err; err != nil {
+			return fmt.Errorf("failed to delete allocations for consumer %s: %w", consumerID, err)
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/openstack/placement.go` around lines 154 - 156, The delete error
returned from deleteConsumerAllocations is missing the consumer identifier;
update the error return in the block that calls deleteConsumerAllocations(ctx,
client, consumerID) so it wraps or annotates the underlying error with the
consumerID (e.g., use fmt.Errorf or errors.Wrapf) to produce a message like
"deleteConsumerAllocations failed for consumer %s: %w" referencing the
consumerID variable and the original err; modify the return at the call site
(the code that currently returns err) to return the wrapped/annotated error
instead.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@internal/openstack/placement.go`:
- Around line 154-156: The delete error returned from deleteConsumerAllocations
is missing the consumer identifier; update the error return in the block that
calls deleteConsumerAllocations(ctx, client, consumerID) so it wraps or
annotates the underlying error with the consumerID (e.g., use fmt.Errorf or
errors.Wrapf) to produce a message like "deleteConsumerAllocations failed for
consumer %s: %w" referencing the consumerID variable and the original err;
modify the return at the call site (the code that currently returns err) to
return the wrapped/annotated error instead.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fe8990c4-61ed-413e-b93d-fe02b2389bf0

📥 Commits

Reviewing files that changed from the base of the PR and between 0368bf3 and 7a206a3.

📒 Files selected for processing (4)
  • internal/controller/traits_controller.go
  • internal/controller/traits_controller_test.go
  • internal/openstack/placement.go
  • internal/openstack/placement_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/openstack/placement_test.go

@fwiesel fwiesel force-pushed the more-no-swallow-fixes branch from 7a206a3 to 0a4b33b Compare March 13, 2026 15:28
Both in the traits controller as well as in the placement-api
calls, some errors were swallowed.
@fwiesel fwiesel force-pushed the more-no-swallow-fixes branch from 0a4b33b to b872bd8 Compare March 17, 2026 12:56
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
internal/controller/traits_controller_test.go (1)

294-298: Assert the error source, not only error presence.

This currently passes on any reconcile failure. Please also assert the returned error matches the GetTraits failure path (e.g., contains 500 and/or /resource_providers/1234/traits) so the test guards the intended behavior.

💡 Suggested tightening
 		It("should return the GetTraits error", func(ctx SpecContext) {
 			req := ctrl.Request{NamespacedName: hypervisorName}
 			_, err := traitsController.Reconcile(ctx, req)
 			Expect(err).To(HaveOccurred())
+			Expect(err.Error()).To(ContainSubstring("500"))
+			Expect(err.Error()).To(ContainSubstring("/resource_providers/1234/traits"))
 		})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/controller/traits_controller_test.go` around lines 294 - 298, The
test currently only checks that traitsController.Reconcile returns an error;
tighten it to assert the error comes from the GetTraits failure by checking the
error message contains the expected indicator (e.g., "500" or the GetTraits
endpoint path like "/resource_providers/1234/traits"). Update the spec that
calls traitsController.Reconcile(ctx, req) to capture err and then assert
Expect(err.Error()).To(ContainSubstring("500")) or
Expect(err.Error()).To(ContainSubstring("/resource_providers/1234/traits")) (or
use MatchError with ContainSubstring) so the failure is proven to be from the
GetTraits path.
internal/openstack/placement_test.go (1)

323-338: Make this failure-path test stricter on cause and side effect.

Right now it accepts any error and still permits provider deletion. For this scenario, assert the wrapped consumer-deletion error and fail the test if delete provider is called.

💡 Suggested tightening
-				// The provider delete succeeds, so that it cannot accidentally surface
-				// the error for us — only the consumer allocation delete error should be returned.
 				fakeServer.Mux.HandleFunc(deleteProviderURL, func(w http.ResponseWriter, r *http.Request) {
-					w.WriteHeader(http.StatusNoContent)
+					defer GinkgoRecover()
+					Fail("should not be called")
 				})
 			})
@@
 				err := CleanupResourceProvider(ctx, client.ServiceClient(fakeServer), provider)
 				Expect(err).To(HaveOccurred())
+				Expect(err.Error()).To(ContainSubstring("deleteConsumerAllocations failed for consumer consumer-1"))
 			})
 		})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/openstack/placement_test.go` around lines 323 - 338, The test for
CleanupResourceProvider should be tightened to (1) simulate a concrete
consumer-allocation deletion error in the fake consumer-delete handler and
assert that the returned err from CleanupResourceProvider wraps/equals that
specific error (use errors.Is or Expect(err).To(MatchError(...)) against the
expectedErr), and (2) make the fake provider-delete handler fail the test if it
is ever invoked (e.g., call t.Fatalf or set a flag and Expect it to remain
false) so the test ensures provider deletion does not occur when consumer
deletion fails; update the handlers referenced in the test (the consumer delete
handler and the fakeServer.Mux.HandleFunc for deleteProviderURL) and the
assertion on the returned err from CleanupResourceProvider accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@internal/controller/traits_controller_test.go`:
- Around line 294-298: The test currently only checks that
traitsController.Reconcile returns an error; tighten it to assert the error
comes from the GetTraits failure by checking the error message contains the
expected indicator (e.g., "500" or the GetTraits endpoint path like
"/resource_providers/1234/traits"). Update the spec that calls
traitsController.Reconcile(ctx, req) to capture err and then assert
Expect(err.Error()).To(ContainSubstring("500")) or
Expect(err.Error()).To(ContainSubstring("/resource_providers/1234/traits")) (or
use MatchError with ContainSubstring) so the failure is proven to be from the
GetTraits path.

In `@internal/openstack/placement_test.go`:
- Around line 323-338: The test for CleanupResourceProvider should be tightened
to (1) simulate a concrete consumer-allocation deletion error in the fake
consumer-delete handler and assert that the returned err from
CleanupResourceProvider wraps/equals that specific error (use errors.Is or
Expect(err).To(MatchError(...)) against the expectedErr), and (2) make the fake
provider-delete handler fail the test if it is ever invoked (e.g., call t.Fatalf
or set a flag and Expect it to remain false) so the test ensures provider
deletion does not occur when consumer deletion fails; update the handlers
referenced in the test (the consumer delete handler and the
fakeServer.Mux.HandleFunc for deleteProviderURL) and the assertion on the
returned err from CleanupResourceProvider accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ffad9a31-51e8-44af-8e3b-5a6bca92ae8b

📥 Commits

Reviewing files that changed from the base of the PR and between 7a206a3 and b872bd8.

📒 Files selected for processing (4)
  • internal/controller/traits_controller.go
  • internal/controller/traits_controller_test.go
  • internal/openstack/placement.go
  • internal/openstack/placement_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • internal/controller/traits_controller.go
  • internal/openstack/placement.go

@github-actions
Copy link

Merging this branch will increase overall coverage

Impacted Packages Coverage Δ 🤖
github.com/cobaltcore-dev/openstack-hypervisor-operator/internal/controller 63.95% (+0.86%) 👍
github.com/cobaltcore-dev/openstack-hypervisor-operator/internal/openstack 79.10% (+1.66%) 👍

Coverage by file

Changed files (no unit tests)

Changed File Coverage Δ Total Covered Missed 🤖
github.com/cobaltcore-dev/openstack-hypervisor-operator/internal/controller/traits_controller.go 73.44% (+12.50%) 64 47 (+8) 17 (-8) 🎉
github.com/cobaltcore-dev/openstack-hypervisor-operator/internal/openstack/placement.go 93.48% (+4.59%) 46 (+1) 43 (+3) 3 (-2) 👍

Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code.

Changed unit test files

  • github.com/cobaltcore-dev/openstack-hypervisor-operator/internal/controller/traits_controller_test.go
  • github.com/cobaltcore-dev/openstack-hypervisor-operator/internal/openstack/placement_test.go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants