Skip to content

ACM-35499 avoid blocking incoming requests#6337

Merged
openshift-merge-bot[bot] merged 2 commits into
stolostron:mainfrom
KevinFCormier:ACM-35499-avoid-blocking-incoming-requests
Jun 19, 2026
Merged

ACM-35499 avoid blocking incoming requests#6337
openshift-merge-bot[bot] merged 2 commits into
stolostron:mainfrom
KevinFCormier:ACM-35499-avoid-blocking-incoming-requests

Conversation

@KevinFCormier

@KevinFCormier KevinFCormier commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

📝 Summary

Ticket Summary (Title):
console-mce pod CrashLoopBackOff due to probe failure with large resource sets

Ticket Link:
https://redhat.atlassian.net/browse/ACM-35499

Type of Change:

  • 🐞 Bug Fix
  • ✨ Feature
  • 🔧 Refactor
  • 💸 Tech Debt
  • 🧪 Test-related
  • 📄 Docs

✅ Checklist

General

  • PR title follows the convention (e.g. ACM-12340 Fix bug with...)
  • Code builds and runs locally without errors
  • No console logs, commented-out code, or unnecessary files
  • All commits are meaningful and well-labeled
  • All new display strings are externalized for localization (English only)
  • (Nice to have) JSDoc comments added for new functions and interfaces

If Bugfix

  • Root cause and fix summary are documented in the ticket (for future reference / errata)
  • Fix tested thoroughly and resolves the issue
  • Test(s) added to prevent regression

🗒️ Notes for Reviewers

Customer was observing CrashLoopBackoff on console-mce pods and had over 20,000 Group resources present. Initially processing these resources may have blocked us from responding to the readiness and liveness probes on time, causing Kubernetes to restart the pod.

This PR batches Promise.all calls so that we call setImmediate after each batch of 100 to yield the event queue so that new requests can be handled. This seems to smooth out memory usage of the pods. Before this change, I observed a spike at startup, then a retreat to stable size.

Summary by CodeRabbit

  • Improvements
    • Enhanced backend processing to handle large collections in controlled batches instead of launching all work concurrently.
    • Improved server-side event handling and resource reconciliation by batching decompression/inflation and list reconciliation steps, helping maintain consistent performance and ordered results.
  • Tests
    • Added coverage for batch processing behavior, including result ordering, empty input handling, batch sizing, event-loop yielding, and error propagation from mappers.

@coderabbitai

coderabbitai Bot commented Jun 15, 2026

Copy link
Copy Markdown

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d87b2238-5a73-44fc-9f91-7dfad00b56b0

📥 Commits

Reviewing files that changed from the base of the PR and between 3a8870d and 60c651e.

📒 Files selected for processing (4)
  • backend/src/lib/batch-promise-all.ts
  • backend/src/lib/server-side-events.ts
  • backend/src/routes/events.ts
  • backend/test/lib/batch-promise-all.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • backend/src/routes/events.ts
  • backend/src/lib/server-side-events.ts

📝 Walkthrough

Walkthrough

Adds a generic batchPromiseAll<T, R> utility that executes an async mapper over an input array in fixed-size batches (default 100), inserting setImmediate yields between batches. Three existing Promise.all call sites in server-side-events.ts and events.ts are replaced with this utility. A Jest suite covers ordering, batching, yielding, default size, and error propagation.

Changes

Batch Promise Utility and Adoption

Layer / File(s) Summary
batchPromiseAll implementation
backend/src/lib/batch-promise-all.ts
Exports constant BATCH_SIZE = 100 and function batchPromiseAll<T, R>(items, mapper) that slices items into 100-item batches, runs Promise.all per batch, calls setImmediate between batches when additional work remains, and returns ordered aggregated results.
Test suite
backend/test/lib/batch-promise-all.test.ts
Jest tests verify result ordering, empty input handling, three-batch partitioning against BATCH_SIZE, setImmediate called when batches exist (multi-batch case), setImmediate not called for single-batch input, and mapper errors propagate as promise rejection.
Adoption in server-side-events.ts
backend/src/lib/server-side-events.ts
Imports batchPromiseAll and replaces parallel event inflation (Promise.all(values.map(...))) with batched inflation in handleRequest, controlling concurrency during SSE packet decompression.
Adoption in events.ts route
backend/src/routes/events.ts
Imports batchPromiseAll and replaces three Promise.all call sites: cached resource decompression in getKubeResources, item caching in listKubernetesObjects, and stale resource deletion in listKubernetesObjects.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'ACM-35499 avoid blocking incoming requests' directly addresses the core issue: preventing event loop blocking that was causing probe timeouts.
Description check ✅ Passed The description follows the template structure, clearly identifies the bug fix type, includes the ticket link, completes the general checklist, and provides context about the customer issue and solution approach.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
backend/test/lib/batch-promise-all.test.ts (1)

67-82: ⚡ Quick win

Strengthen default-batch-size assertion to cover the final partial batch.

Current assertion only verifies two full 100-item groups. It does not fail if the last 50-item batch is accidentally dropped. Add a total-processed assertion (or assert result length) to close that gap.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/test/lib/batch-promise-all.test.ts` around lines 67 - 82, The test
`should use default batch size of 100` only asserts that batchCount equals 2,
which verifies two full batches but does not catch if the final 50-item partial
batch is accidentally dropped. Strengthen the test by adding an assertion that
verifies all 250 input items were actually processed. Either capture the result
returned by batchPromiseAll and assert its length equals 250, or track the total
number of items processed through the callback and assert that total equals 250.
This ensures the partial final batch is properly handled and processed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/src/lib/batch-promise-all.ts`:
- Around line 11-15: The batchPromiseAll function lacks validation for the
batchSize parameter, which can cause an infinite loop when batchSize is 0 (since
i += batchSize would not advance) or move backwards when negative. Add an
upfront guard at the beginning of the function to validate that batchSize is a
positive integer, throwing an appropriate error if the validation fails. This
check should occur before the for loop that uses the i += batchSize increment.

In `@backend/test/lib/batch-promise-all.test.ts`:
- Around line 39-48: The global.setImmediate restoration is not protected by
try/finally, which means if the await batchPromiseAll call throws an exception,
the restoration code never executes and leaves patched globals in place for
subsequent tests. Wrap the batchPromiseAll invocation and associated test logic
in a try/finally block, moving the global.setImmediate restoration statement
into the finally block to ensure it always executes regardless of whether
batchPromiseAll succeeds or throws. This fix applies to all occurrences where
global.setImmediate is temporarily patched in your test file.

---

Nitpick comments:
In `@backend/test/lib/batch-promise-all.test.ts`:
- Around line 67-82: The test `should use default batch size of 100` only
asserts that batchCount equals 2, which verifies two full batches but does not
catch if the final 50-item partial batch is accidentally dropped. Strengthen the
test by adding an assertion that verifies all 250 input items were actually
processed. Either capture the result returned by batchPromiseAll and assert its
length equals 250, or track the total number of items processed through the
callback and assert that total equals 250. This ensures the partial final batch
is properly handled and processed.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 34a7344e-4847-45d3-a33d-9ddbcc10ce5f

📥 Commits

Reviewing files that changed from the base of the PR and between dec9ba6 and 3a8870d.

📒 Files selected for processing (4)
  • backend/src/lib/batch-promise-all.ts
  • backend/src/lib/server-side-events.ts
  • backend/src/routes/events.ts
  • backend/test/lib/batch-promise-all.test.ts

Comment thread backend/src/lib/batch-promise-all.ts Outdated
Comment thread backend/test/lib/batch-promise-all.test.ts
…oming network requests

Assisted-by: Cursor (Claude Opus 4.6 High)
Signed-off-by: Kevin Cormier <kcormier@redhat.com>
Generated-by: Cursor (Claude Opus 4.6 High)
Signed-off-by: Kevin Cormier <kcormier@redhat.com>
@KevinFCormier KevinFCormier force-pushed the ACM-35499-avoid-blocking-incoming-requests branch from 3a8870d to 60c651e Compare June 15, 2026 19:07
@KevinFCormier

Copy link
Copy Markdown
Contributor Author

/cc @fxiang1

@openshift-ci openshift-ci Bot requested a review from fxiang1 June 19, 2026 15:53
@KevinFCormier

Copy link
Copy Markdown
Contributor Author

/cherry-pick release-2.17

@openshift-cherrypick-robot

Copy link
Copy Markdown
Contributor

@KevinFCormier: once the present PR merges, I will cherry-pick it on top of release-2.17 in a new PR and assign it to you.

Details

In response to this:

/cherry-pick release-2.17

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@fxiang1

fxiang1 commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

/lgtm

@openshift-ci

openshift-ci Bot commented Jun 19, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fxiang1, KevinFCormier

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [KevinFCormier,fxiang1]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@KevinFCormier

Copy link
Copy Markdown
Contributor Author

/test unit-tests-sonarcloud

@sonarqubecloud

Copy link
Copy Markdown

@openshift-merge-bot openshift-merge-bot Bot merged commit 6cca438 into stolostron:main Jun 19, 2026
16 checks passed
@openshift-cherrypick-robot

Copy link
Copy Markdown
Contributor

@KevinFCormier: new pull request created: #6357

Details

In response to this:

/cherry-pick release-2.17

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@KevinFCormier KevinFCormier deleted the ACM-35499-avoid-blocking-incoming-requests branch June 19, 2026 18:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants