ACM-35158 avoid blocking incoming requests [release-2.16] by KevinFCormier · Pull Request #6339 · stolostron/console

KevinFCormier · 2026-06-15T20:14:13Z

📝 Summary

Ticket Summary (Title):
console-mce pod CrashLoopBackOff due to probe failure with large resource sets

Ticket Link:
https://redhat.atlassian.net/browse/ACM-35499

Type of Change:

✅ Checklist

General

PR title follows the convention (e.g. ACM-12340 Fix bug with...)
Code builds and runs locally without errors
No console logs, commented-out code, or unnecessary files
All commits are meaningful and well-labeled
All new display strings are externalized for localization (English only)
(Nice to have) JSDoc comments added for new functions and interfaces

If Bugfix

Root cause and fix summary are documented in the ticket (for future reference / errata)
Fix tested thoroughly and resolves the issue
Test(s) added to prevent regression

🗒️ Notes for Reviewers

A customer was observing CrashLoopBackOff on console-mce pods and had over 20,000 Group resources present. Initial processing of these resources may have kept the backend from responding to the readiness and liveness probes in time, causing Kubernetes to restart the pod.
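To illustrate the suspected mechanism, here is a minimal sketch (not code from this PR) of how a large Promise.all over promise-only work can delay anything that needs a turn of the Node.js event loop. The function name, the item count, and the use of setImmediate as a stand-in for an incoming probe request are all assumptions made for the illustration.

```typescript
// Hypothetical demo: promise-only work is drained from the microtask queue
// before Node.js runs any setImmediate or I/O callback.
async function demoStarvation(): Promise<string[]> {
  const order: string[] = [];

  // Stand-in for an incoming probe request: it needs a turn of the event loop.
  const probe = new Promise<void>((resolve) =>
    setImmediate(() => {
      order.push("probe handled");
      resolve();
    })
  );

  // Stand-in for processing many resources without ever leaving the microtask queue.
  await Promise.all(
    Array.from({ length: 20000 }, async (_, i) => {
      await null;
      return i;
    })
  );
  order.push("processing finished");

  await probe;
  return order;
}
```

Calling `demoStarvation()` yields "processing finished" ahead of "probe handled", even though the probe callback was scheduled first. Real resource processing does more than this, so treat the sketch as a simplified model of the problem rather than a reproduction of it.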

This PR splits the Promise.all calls into batches of 100 and calls setImmediate after each batch, yielding to the event loop so that new requests can be handled. This also seems to smooth out the pods' memory usage: before this change, I observed a spike at startup, followed by a retreat to a stable size.
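The batching approach can be sketched roughly as follows. This is a minimal illustration under assumptions, not the code in this PR: the helper names `mapInBatches` and `yieldToEventLoop` are hypothetical, and only the batch size of 100 and the use of setImmediate come from the description above.

```typescript
// Hypothetical helper: resolves on the next setImmediate turn, which gives
// pending I/O callbacks (such as probe requests) a chance to run.
const yieldToEventLoop = (): Promise<void> =>
  new Promise<void>((resolve) => setImmediate(resolve));

// Hypothetical helper: like Promise.all(items.map(worker)), but processed in
// batches with a yield to the event loop after each batch.
async function mapInBatches<T, R>(
  items: readonly T[],
  worker: (item: T) => Promise<R>,
  batchSize = 100
): Promise<R[]> {
  const results: R[] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    // Only the promises of the current batch are created and awaited together.
    results.push(...(await Promise.all(batch.map(worker))));
    await yieldToEventLoop();
  }
  return results;
}
```

Results keep the input order, as with a single Promise.all. Because each batch's promises are created only when that batch starts, fewer are alive at once, which is one plausible reason for the smoother memory usage, though the PR does not confirm the cause.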

Memory Usage

In the following videos, I deleted pods and then observed the memory usage of their replacements. Before the patch, the pods sometimes show a memory spike during startup; in this case, one of the pods spikes to almost 12 GB before falling back to a more typical value. After the patch, there is no initial spike.

Before

Before.Patch.mov

After

After.Patch.mov

Liveness Probe Response Time

I also checked the response time by turning on garbage collection tracing and using a simple script to repeatedly check the liveness probe endpoint. I tested against the cluster kevin-probe-test, which has 50,000 Group resources on it.

Here is the process I followed:

Run ./check-response-time.sh or ./check-response-time.sh -t 1.0 in one terminal. (The latter command shows only when response time is greater than 1.0 seconds.)
In a second terminal, run npm run plugins.
Drive load to the backend by loading https://localhost:9000/multicloud/credentials in several tabs. Repeatedly refresh the tabs to drive new full loads of the SSE stream.

Using a baseline test branch vs. a patched test branch, I found:

Without the patch, I can easily provoke response times of 1-2 seconds. I saw a particularly long pause of 53 s on one occasion.
With the patch, I used ./check-response-time.sh -t 0.1 and only saw response times up to just under 0.5 s.
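The check-response-time.sh script is not included in this description. As a rough idea of what such a check involves, here is a hypothetical TypeScript analogue; the function names, the polling interval, and the injected `request` callback are assumptions, and the only behavior taken from the text above is reporting response times that exceed a threshold given in seconds.

```typescript
// Hypothetical analogue of check-response-time.sh; not the author's script.

// Times a single request and returns the elapsed time in seconds.
async function timeOnce(request: () => Promise<unknown>): Promise<number> {
  const start = Date.now();
  await request();
  return (Date.now() - start) / 1000;
}

// Polls repeatedly and reports only response times above the threshold,
// similar in spirit to the -t option described above.
async function pollResponseTimes(
  request: () => Promise<unknown>,
  iterations: number,
  thresholdSeconds = 0,
  intervalMs = 100
): Promise<number[]> {
  const reported: number[] = [];
  for (let i = 0; i < iterations; i++) {
    const seconds = await timeOnce(request);
    if (seconds > thresholdSeconds) {
      reported.push(seconds);
      console.log(`${new Date().toISOString()} ${seconds.toFixed(3)} s`);
    }
    // Wait before the next check so the poller itself adds little load.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  return reported;
}
```

The `request` callback would wrap whatever HTTP call reaches the liveness probe endpoint; its URL is not stated here, so it is left to the caller.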

coderabbitai · 2026-06-15T20:14:23Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository: stolostron/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 22cb4710-14e7-4a7e-b81a-14f069d95529

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

KevinFCormier · 2026-06-19T15:51:27Z

/cc @fxiang1

KevinFCormier · 2026-06-19T16:26:38Z

/test unit-tests-sonarcloud

fxiang1 · 2026-06-19T16:26:56Z

/lgtm

I put this on hold because I'm not sure whether the 2.16 stream is open.

openshift-ci · 2026-06-19T16:27:05Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fxiang1, KevinFCormier

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [KevinFCormier,fxiang1]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

sonarqubecloud · 2026-06-19T16:59:15Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
95.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

openshift-ci Bot added do-not-merge/work-in-progress dco-signoff: yes labels Jun 15, 2026

openshift-ci Bot added the approved label Jun 15, 2026

Break up large loops to avoid saturating event queue and blocking incoming network requests

2be6c88

Assisted-by: Cursor (Claude Opus 4.6 High) Signed-off-by: Kevin Cormier <kcormier@redhat.com>

KevinFCormier changed the title ~~WIP ACM-35158 avoid blocking incoming requests [release-2.16]~~ ACM-35158 avoid blocking incoming requests [release-2.16] Jun 19, 2026

openshift-ci Bot removed the do-not-merge/work-in-progress label Jun 19, 2026

openshift-ci Bot requested a review from fxiang1 June 19, 2026 15:51

Add test coverage

d1b5b32

Generated-by: Cursor (Claude Opus 4.6 High) Signed-off-by: Kevin Cormier <kcormier@redhat.com>

KevinFCormier force-pushed the ACM-35158-avoid-blocking-incoming-requests branch from b975c63 to d1b5b32 on June 19, 2026 15:52

fxiang1 added the do-not-merge/hold label Jun 19, 2026

openshift-ci Bot assigned fxiang1 Jun 19, 2026

openshift-ci Bot added the lgtm label Jun 19, 2026

ACM-35158 avoid blocking incoming requests [release-2.16] #6339
KevinFCormier wants to merge 2 commits into stolostron:release-2.16 from KevinFCormier:ACM-35158-avoid-blocking-incoming-requests
