ACM-35499 avoid blocking incoming requests by KevinFCormier · Pull Request #6337 · stolostron/console

KevinFCormier · 2026-06-15T17:52:36Z

📝 Summary

Ticket Summary (Title):
console-mce pod CrashLoopBackOff due to probe failure with large resource sets

Ticket Link:
https://redhat.atlassian.net/browse/ACM-35499

Type of Change:

✅ Checklist

General

PR title follows the convention (e.g. ACM-12340 Fix bug with...)
Code builds and runs locally without errors
No console logs, commented-out code, or unnecessary files
All commits are meaningful and well-labeled
All new display strings are externalized for localization (English only)
(Nice to have) JSDoc comments added for new functions and interfaces

If Bugfix

Root cause and fix summary are documented in the ticket (for future reference / errata)
Fix tested thoroughly and resolves the issue
Test(s) added to prevent regression

🗒️ Notes for Reviewers

Customer was observing CrashLoopBackoff on console-mce pods and had over 20,000 Group resources present. Initially processing these resources may have blocked us from responding to the readiness and liveness probes on time, causing Kubernetes to restart the pod.

This PR batches Promise.all calls so that we call setImmediate after each batch of 100 to yield the event queue so that new requests can be handled. This seems to smooth out memory usage of the pods. Before this change, I observed a spike at startup, then a retreat to stable size.

Summary by CodeRabbit

Improvements
- Enhanced backend processing to handle large collections in controlled batches instead of launching all work concurrently.
- Improved server-side event handling and resource reconciliation by batching decompression/inflation and list reconciliation steps, helping maintain consistent performance and ordered results.
Tests
- Added coverage for batch processing behavior, including result ordering, empty input handling, batch sizing, event-loop yielding, and error propagation from mappers.

coderabbitai · 2026-06-15T17:52:51Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d87b2238-5a73-44fc-9f91-7dfad00b56b0

📥 Commits

Reviewing files that changed from the base of the PR and between 3a8870d and 60c651e.

📒 Files selected for processing (4)

backend/src/lib/batch-promise-all.ts
backend/src/lib/server-side-events.ts
backend/src/routes/events.ts
backend/test/lib/batch-promise-all.test.ts

🚧 Files skipped from review as they are similar to previous changes (2)

backend/src/routes/events.ts
backend/src/lib/server-side-events.ts

📝 Walkthrough

Walkthrough

Adds a generic batchPromiseAll<T, R> utility that executes an async mapper over an input array in fixed-size batches (default 100), inserting setImmediate yields between batches. Three existing Promise.all call sites in server-side-events.ts and events.ts are replaced with this utility. A Jest suite covers ordering, batching, yielding, default size, and error propagation.

Changes

Batch Promise Utility and Adoption

Layer / File(s)	Summary
`batchPromiseAll` implementation `backend/src/lib/batch-promise-all.ts`	Exports constant `BATCH_SIZE = 100` and function `batchPromiseAll<T, R>(items, mapper)` that slices items into 100-item batches, runs `Promise.all` per batch, calls `setImmediate` between batches when additional work remains, and returns ordered aggregated results.
Test suite `backend/test/lib/batch-promise-all.test.ts`	Jest tests verify result ordering, empty input handling, three-batch partitioning against `BATCH_SIZE`, `setImmediate` called when batches exist (multi-batch case), `setImmediate` not called for single-batch input, and mapper errors propagate as promise rejection.
Adoption in `server-side-events.ts` `backend/src/lib/server-side-events.ts`	Imports `batchPromiseAll` and replaces parallel event inflation (`Promise.all(values.map(...))`) with batched inflation in `handleRequest`, controlling concurrency during SSE packet decompression.
Adoption in `events.ts` route `backend/src/routes/events.ts`	Imports `batchPromiseAll` and replaces three `Promise.all` call sites: cached resource decompression in `getKubeResources`, item caching in `listKubernetesObjects`, and stale resource deletion in `listKubernetesObjects`.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'ACM-35499 avoid blocking incoming requests' directly addresses the core issue: preventing event loop blocking that was causing probe timeouts.
Description check	✅ Passed	The description follows the template structure, clearly identifies the bug fix type, includes the ticket link, completes the general checklist, and provides context about the customer issue and solution approach.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

backend/test/lib/batch-promise-all.test.ts (1)
67-82: ⚡ Quick win

Strengthen default-batch-size assertion to cover the final partial batch.

Current assertion only verifies two full 100-item groups. It does not fail if the last 50-item batch is accidentally dropped. Add a total-processed assertion (or assert result length) to close that gap.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/test/lib/batch-promise-all.test.ts` around lines 67 - 82, The test
`should use default batch size of 100` only asserts that batchCount equals 2,
which verifies two full batches but does not catch if the final 50-item partial
batch is accidentally dropped. Strengthen the test by adding an assertion that
verifies all 250 input items were actually processed. Either capture the result
returned by batchPromiseAll and assert its length equals 250, or track the total
number of items processed through the callback and assert that total equals 250.
This ensures the partial final batch is properly handled and processed.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/src/lib/batch-promise-all.ts`:
- Around line 11-15: The batchPromiseAll function lacks validation for the
batchSize parameter, which can cause an infinite loop when batchSize is 0 (since
i += batchSize would not advance) or move backwards when negative. Add an
upfront guard at the beginning of the function to validate that batchSize is a
positive integer, throwing an appropriate error if the validation fails. This
check should occur before the for loop that uses the i += batchSize increment.

In `@backend/test/lib/batch-promise-all.test.ts`:
- Around line 39-48: The global.setImmediate restoration is not protected by
try/finally, which means if the await batchPromiseAll call throws an exception,
the restoration code never executes and leaves patched globals in place for
subsequent tests. Wrap the batchPromiseAll invocation and associated test logic
in a try/finally block, moving the global.setImmediate restoration statement
into the finally block to ensure it always executes regardless of whether
batchPromiseAll succeeds or throws. This fix applies to all occurrences where
global.setImmediate is temporarily patched in your test file.

---

Nitpick comments:
In `@backend/test/lib/batch-promise-all.test.ts`:
- Around line 67-82: The test `should use default batch size of 100` only
asserts that batchCount equals 2, which verifies two full batches but does not
catch if the final 50-item partial batch is accidentally dropped. Strengthen the
test by adding an assertion that verifies all 250 input items were actually
processed. Either capture the result returned by batchPromiseAll and assert its
length equals 250, or track the total number of items processed through the
callback and assert that total equals 250. This ensures the partial final batch
is properly handled and processed.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 34a7344e-4847-45d3-a33d-9ddbcc10ce5f

📥 Commits

Reviewing files that changed from the base of the PR and between dec9ba6 and 3a8870d.

📒 Files selected for processing (4)

backend/src/lib/batch-promise-all.ts
backend/src/lib/server-side-events.ts
backend/src/routes/events.ts
backend/test/lib/batch-promise-all.test.ts

…oming network requests Assisted-by: Cursor (Claude Opus 4.6 High) Signed-off-by: Kevin Cormier <kcormier@redhat.com>

Generated-by: Cursor (Claude Opus 4.6 High) Signed-off-by: Kevin Cormier <kcormier@redhat.com>

KevinFCormier · 2026-06-19T15:53:21Z

/cc @fxiang1

KevinFCormier · 2026-06-19T15:54:36Z

/cherry-pick release-2.17

openshift-cherrypick-robot · 2026-06-19T15:54:40Z

@KevinFCormier: once the present PR merges, I will cherry-pick it on top of release-2.17 in a new PR and assign it to you.

Details

In response to this:

/cherry-pick release-2.17

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

fxiang1 · 2026-06-19T16:26:01Z

/lgtm

openshift-ci · 2026-06-19T16:26:12Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fxiang1, KevinFCormier

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [KevinFCormier,fxiang1]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

KevinFCormier · 2026-06-19T17:31:04Z

/test unit-tests-sonarcloud

sonarqubecloud · 2026-06-19T18:12:47Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
95.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

openshift-cherrypick-robot · 2026-06-19T18:25:38Z

@KevinFCormier: new pull request created: #6357

Details

In response to this:

/cherry-pick release-2.17

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-ci Bot added dco-signoff: yes approved labels Jun 15, 2026

coderabbitai Bot reviewed Jun 15, 2026

View reviewed changes

Comment thread backend/src/lib/batch-promise-all.ts Outdated

Comment thread backend/test/lib/batch-promise-all.test.ts

KevinFCormier added 2 commits June 15, 2026 15:07

Break up large loops to avoid saturating event queue and blocking inc…

e728851

…oming network requests Assisted-by: Cursor (Claude Opus 4.6 High) Signed-off-by: Kevin Cormier <kcormier@redhat.com>

Add test coverage

60c651e

Generated-by: Cursor (Claude Opus 4.6 High) Signed-off-by: Kevin Cormier <kcormier@redhat.com>

KevinFCormier force-pushed the ACM-35499-avoid-blocking-incoming-requests branch from 3a8870d to 60c651e Compare June 15, 2026 19:07

openshift-ci Bot requested a review from fxiang1 June 19, 2026 15:53

openshift-ci Bot assigned fxiang1 Jun 19, 2026

openshift-ci Bot added the lgtm label Jun 19, 2026

openshift-merge-bot Bot merged commit 6cca438 into stolostron:main Jun 19, 2026
16 checks passed

openshift-cherrypick-robot mentioned this pull request Jun 19, 2026

[release-2.17] ACM-35498 avoid blocking incoming requests #6357

Open

KevinFCormier deleted the ACM-35499-avoid-blocking-incoming-requests branch June 19, 2026 18:56

Conversation

KevinFCormier commented Jun 15, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

📝 Summary

✅ Checklist

General

If Bugfix

🗒️ Notes for Reviewers

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

KevinFCormier commented Jun 19, 2026

Uh oh!

KevinFCormier commented Jun 19, 2026

Uh oh!

openshift-cherrypick-robot commented Jun 19, 2026

Uh oh!

fxiang1 commented Jun 19, 2026

Uh oh!

openshift-ci Bot commented Jun 19, 2026

Uh oh!

KevinFCormier commented Jun 19, 2026

Uh oh!

sonarqubecloud Bot commented Jun 19, 2026

Quality Gate passed

Uh oh!

Uh oh!

openshift-cherrypick-robot commented Jun 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

KevinFCormier commented Jun 15, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 15, 2026 •

edited

Loading