WIP ACM-35158 Improve readiness/liveness probe behaviour [release-2.16]#6333
…orm (stolostron#6314) * Add Azure and generic hcp destroy commands Generated-by: Cursor (Claude Opus 4.6 High) Signed-off-by: Kevin Cormier <kcormier@redhat.com> * Add unit testing Generated-by: Cursor (Claude Opus 4.6 High) Signed-off-by: Kevin Cormier <kcormier@redhat.com> * Update AWS command to clarify creds are from a file Signed-off-by: Kevin Cormier <kcormier@redhat.com> --------- Signed-off-by: Kevin Cormier <kcormier@redhat.com> Co-authored-by: Kevin Cormier <kcormier@redhat.com>
…) grows unbounded causing memory leak (stolostron#6319) * Add cleanupAccessCache function Signed-off-by: Oksana Bazylieva <obazylie@redhat.com> * Add tests Signed-off-by: Oksana Bazylieva <obazylie@redhat.com> * coderabbitai fix Signed-off-by: Oksana Bazylieva <obazylie@redhat.com> --------- Signed-off-by: Oksana Bazylieva <obazylie@redhat.com> Co-authored-by: Oksana Bazylieva <obazylie@redhat.com>
Signed-off-by: John Swanke <jswanke@redhat.com>
…oming network requests Assisted-by: Cursor (Claude Opus 4.6 High) Signed-off-by: Kevin Cormier <kcormier@redhat.com>
Assisted-by: Cursor (Claude Opus 4.6 High) Signed-off-by: Kevin Cormier <kcormier@redhat.com>
Generated-by: Cursor (Claude Opus 4.6 High) Signed-off-by: Kevin Cormier <kcormier@redhat.com>


📝 Summary
Ticket Summary (Title):
console-mce pod CrashLoopBackOff due to probe failure with large resource sets
Ticket Link:
https://redhat.atlassian.net/browse/ACM-35158
🗒️ Notes for Reviewers
The customer observed CrashLoopBackOff on console-mce pods in a cluster with over 20,000 Group resources. Initial processing of these resources may have kept the backend too busy to respond to the readiness and liveness probes in time, causing Kubernetes to restart the pod.
This PR batches the Promise.all calls and calls setImmediate after each batch of 100 to yield to the event loop, so that new requests can be handled. This also seems to smooth out memory usage of the pods: before this change, I observed a spike at startup, followed by a retreat to a stable size.
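For reviewers who want a picture of the batching pattern, here is a minimal sketch. It is not the code from this PR: the helper name `processInBatches`, its signature, and the generic `worker` callback are all hypothetical. Only the batch size of 100 and the use of `Promise.all` plus `setImmediate` come from the description above.

```typescript
// Hypothetical helper (name and signature are illustrative, not from the PR).
// Runs `worker` over `items` in fixed-size batches and yields to the event
// loop between batches so queued requests (such as probes) can be served.
async function processInBatches<T, R>(
  items: readonly T[],
  worker: (item: T) => Promise<R>,
  batchSize = 100
): Promise<R[]> {
  const results: R[] = []
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize)
    // Items within one batch still run concurrently.
    results.push(...(await Promise.all(batch.map(worker))))
    // setImmediate callbacks run after pending I/O callbacks have been
    // processed, so awaiting one gives incoming requests a turn.
    await new Promise<void>((resolve) => setImmediate(resolve))
  }
  return results
}
```

Results come back in input order, as they would from a single `Promise.all`. The trade-off is some extra latency per batch in exchange for bounded stretches of uninterrupted work.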
This PR also separates the liveness and readiness probes, so that the pod is not marked as ready for traffic until the initial loading of the watched resources and aggregated application resources has completed.
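The split can be illustrated roughly as follows. This is a sketch under assumptions, not the handlers in this PR: the paths `/livenessProbe` and `/readinessProbe`, the `state.initialLoadComplete` flag, and the `probeStatus` helper are hypothetical names chosen for the example.

```typescript
import { createServer } from 'node:http'

// Hypothetical flag: set to true once the initial load of watched and
// aggregated resources has finished.
const state = { initialLoadComplete: false }

// Hypothetical probe paths. Liveness only reports that the process can
// answer; readiness additionally requires the initial load to be done.
function probeStatus(path: string, ready: boolean): number | undefined {
  if (path === '/livenessProbe') return 200
  if (path === '/readinessProbe') return ready ? 200 : 503
  return undefined // not a probe path
}

const server = createServer((req, res) => {
  const status = probeStatus(req.url ?? '', state.initialLoadComplete)
  res.writeHead(status ?? 404).end()
})

// Usage (not started here): server.listen(<port>), then set
// state.initialLoadComplete = true after startup loading completes.
```

With this shape, a slow startup makes the readiness probe return 503 (no traffic is routed to the pod) while the liveness probe keeps returning 200, so Kubernetes does not restart the pod.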