release-0.48: apps: bump gatekeeper to v3.18.3#2782

Open
kristiangronas wants to merge 1 commit into elastisys:release-0.48 from railcp:release-0.48

Conversation

@kristiangronas
Contributor

Warning

This is a public repository, ensure not to disclose:

  • personal data beyond what is necessary for interacting with this pull request, nor
  • business confidential information, such as customer names.

What kind of PR is this?

Required: Mark one of the following that is applicable:

  • kind/feature
  • kind/improvement
  • kind/deprecation
  • kind/documentation
  • kind/clean-up
  • kind/bug
  • kind/other

Optional: Mark one or more of the following that are applicable:

Important

Breaking changes should be marked kind/admin-change or kind/dev-change depending on type
Critical security fixes should be marked with kind/security

  • kind/admin-change
  • kind/dev-change
  • kind/security
  • [kind/adr](set-me)

What does this PR do / why do we need this PR?

open-policy-agent/gatekeeper#3743 would sometimes cause gatekeeper to crash when running helmfile

  • Fixes #

Information to reviewers

Checklist

  • Proper commit message prefix on all commits
  • Change checks:
    • The change is transparent
    • The change is disruptive
    • The change requires no migration steps
    • The change requires migration steps
    • The change updates CRDs
    • The change updates the config and the schema
  • Documentation checks:
  • Metrics checks:
    • The metrics are still exposed and present in Grafana after the change
    • The metrics names didn't change (Grafana dashboards and Prometheus alerts required no updates)
    • The metrics names did change (Grafana dashboards and Prometheus alerts required an update)
  • Logs checks:
    • The logs do not show any errors after the change
  • PodSecurityPolicy checks:
    • Any changed Pod is covered by Kubernetes Pod Security Standards
    • Any changed Pod is covered by Gatekeeper Pod Security Policies
    • The change does not cause any Pods to be blocked by Pod Security Standards or Policies
  • NetworkPolicy checks:
    • Any changed Pod is covered by Network Policies
    • The change does not cause any dropped packets in the NetworkPolicy Dashboard
  • Audit checks:
    • The change does not cause any unnecessary Kubernetes audit events
    • The change requires changes to Kubernetes audit policy
  • Falco checks:
    • The change does not cause any alerts to be generated by Falco
  • Bug checks:
    • The bug fix is covered by regression tests

@kristiangronas kristiangronas requested a review from a team as a code owner October 6, 2025 11:52
@davidumea davidumea requested a review from aarnq October 6, 2025 12:00
@aarnq
Contributor

aarnq commented Oct 6, 2025

Hi @kristiangronas, thank you for your contribution!
It is unfortunate that we didn't see this before v0.49.0 was created.

Could you move this to fix it on main instead? I'll be sure to add tasks for us to create patch releases with the fix for our supported versions.

Also, as a workaround: since the introduction of the image list in v0.48, you can override the Gatekeeper image version through config until the fix is in a release:

# common-config.yaml

images:
  gatekeeper:
    image: docker.io/openpolicyagent/gatekeeper:v3.18.3
    preInstallCRDs: docker.io/openpolicyagent/gatekeeper-crds:v3.18.3
    postInstallLabelNamespace: docker.io/openpolicyagent/gatekeeper-crds:v3.18.3

@kristiangronas
Contributor Author

For main/0.50 I wanted to upgrade to 3.20 or 3.19. Does that make sense?

@aarnq
Contributor

aarnq commented Oct 6, 2025

For main/0.50 I wanted to upgrade to 3.20 or 3.19. Does that make sense?

Yes, but please do the minor bump first; we'll reference the commit.

@kristiangronas
Contributor Author

Looking closer, they never backported open-policy-agent/gatekeeper@266f7b0, which is what I'm actually hitting (Helm is deleting the secret containing the oldest release). Apparently it's not directly crashing the pod, though; the pod was running out of memory when it happened.

So bumping to 3.18.3 is probably not needed after all. The issue is fixed in 3.19.3 at least, but that should probably go in 0.50. Otherwise, maybe we should increase the memory limit or play with GOMEMLIMIT? Have you been seeing Gatekeeper run out of memory? (I was running helmfile sync a lot.)

@aarnq
Contributor

aarnq commented Oct 7, 2025

Looking closer, they never backported open-policy-agent/gatekeeper@266f7b0, which is what I'm actually hitting (Helm is deleting the secret containing the oldest release). Apparently it's not directly crashing the pod, though; the pod was running out of memory when it happened.

So bumping to 3.18.3 is probably not needed after all. The issue is fixed in 3.19.3 at least, but that should probably go in 0.50. Otherwise, maybe we should increase the memory limit or play with GOMEMLIMIT? Have you been seeing Gatekeeper run out of memory? (I was running helmfile sync a lot.)

Yes, we have seen the controller hit its memory limit, in larger environments and when a lot of actions revolve around Gatekeeper's CRDs and the resources that are subject to them.
Since Apps contains a fair amount of both, frequent reruns of sync generate a lot of work and increase the memory usage of both the control plane and the Gatekeeper controllers that have to manage it.
As far as I know, we have not experimented with GOMEMLIMIT for Gatekeeper.
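
For reference, a minimal sketch of what experimenting with GOMEMLIMIT could look like at the Pod level. This is a generic Kubernetes manifest, not the Gatekeeper chart's values or the Apps config schema: the Pod name, container name, and both memory figures are assumptions chosen for illustration, and it assumes the binary is built with Go 1.19 or later, where the runtime reads GOMEMLIMIT. The idea is to set the Go soft memory limit somewhat below the container's hard limit so the garbage collector works harder before the kernel OOM-kills the process.

```yaml
# Illustrative only: names and sizes below are hypothetical, not taken from the chart.
apiVersion: v1
kind: Pod
metadata:
  name: gomemlimit-example        # hypothetical name
spec:
  containers:
    - name: manager               # hypothetical container name
      image: docker.io/openpolicyagent/gatekeeper:v3.18.3  # image from the workaround above
      resources:
        limits:
          memory: 512Mi           # hard limit enforced by the kernel (illustrative value)
      env:
        - name: GOMEMLIMIT
          value: 460MiB           # soft limit for the Go runtime, roughly 90% of 512Mi
```

Note the different unit spellings: Kubernetes quantities use `Mi`, while GOMEMLIMIT expects `MiB`. Because GOMEMLIMIT is a soft limit, it can reduce the chance of an OOM kill during bursts but does not guarantee the process stays under it; if the live heap genuinely needs more memory, raising the container limit is still required.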

@AlbinB97 AlbinB97 mentioned this pull request Oct 15, 2025
35 tasks

2 participants