
feat: Add a script for debugging Memgraph using ephemeral containers #203

Merged
as51340 merged 2 commits into main from feat/gdb-side-car
Feb 27, 2026

Conversation

as51340 (Collaborator) commented Feb 25, 2026

What

Add a convenience script scripts/debug-memgraph.sh that attaches a GDB debug container to a running Memgraph HA pod using kubectl debug ephemeral containers. The script
auto-detects the target container name (memgraph-data or memgraph-coordinator) from the pod name, creates a temporary custom profile to override the pod's non-root
securityContext, and attaches GDB with -ex continue so the process keeps running until a crash signal is caught. Also adds GDB debugging instructions to
charts/memgraph-high-availability/templates/NOTES.txt.
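The container auto-detection described above could look roughly like the sketch below. It is not the code from scripts/debug-memgraph.sh: the function name `detect_container` and the example pod names are hypothetical, and it assumes pod names contain either `data` or `coordinator`.

```shell
#!/usr/bin/env sh
# Hypothetical helper: infer the target container name from the pod name.
# Assumption: HA pod names contain "data" or "coordinator".
detect_container() {
  pod_name="$1"
  case "$pod_name" in
    *coordinator*) echo "memgraph-coordinator" ;;
    *data*)        echo "memgraph-data" ;;
    *)
      echo "cannot infer container from pod name: $pod_name" >&2
      return 1
      ;;
  esac
}
```

The result would then be passed to `kubectl debug --target`. Failing loudly on an unrecognized pod name avoids attaching to the wrong container.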

Why

When debugging Memgraph crashes caused by specific queries in HA deployments, developers need to attach GDB to the running process. Restarting pods is undesirable because
Memgraph recovery can be slow for large datasets. This script wraps the kubectl debug workflow and handles the non-trivial security context override needed because Memgraph
pods run as non-root (uid 101) with runAsNonRoot: true.

How

  • The HA chart's pod-level securityContext (runAsUser: 101, runAsNonRoot: true) prevents ephemeral containers from running as root, which blocks both apt-get install and
    ptrace. The script generates a temporary --custom profile JSON that sets runAsUser: 0 on the ephemeral container only, used together with --profile=sysadmin for full
    SYS_PTRACE capability.
  • GDB attaches with -ex continue so the Memgraph process is not paused — it keeps running and GDB only breaks on crash signals (SIGSEGV, SIGABRT, etc.).
  • No chart template changes are needed for the core functionality — kubectl debug --target handles PID namespace sharing without requiring shareProcessNamespace on the pod
    spec.
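A minimal sketch of the pieces above, under stated assumptions: the function names, the `DEBUG_IMAGE` variable, and the `ubuntu:24.04` default image are hypothetical, and the `runAsNonRoot: false` entry is an assumption (the description only mentions `runAsUser: 0`). The kubectl command is printed rather than executed so the sketch can be inspected without a cluster.

```shell
#!/usr/bin/env sh
# Write a partial container spec for `kubectl debug --custom`.
# It overrides the security context of the ephemeral container only.
write_custom_profile() {
  profile_path="$1"
  cat > "$profile_path" <<'EOF'
{
  "securityContext": {
    "runAsUser": 0,
    "runAsNonRoot": false
  }
}
EOF
}

# Print (do not run) the kubectl debug invocation for a pod and container.
# DEBUG_IMAGE is a hypothetical override; any apt-based image would do.
print_debug_cmd() {
  printf 'kubectl debug -it %s --image=%s --target=%s --profile=sysadmin --custom=%s\n' \
    "$1" "${DEBUG_IMAGE:-ubuntu:24.04}" "$2" "$3"
}

# Print the command to run inside the debug container once GDB is installed.
# `-ex continue` resumes the process right after attaching.
gdb_attach_cmd() {
  printf 'gdb -p "$(pgrep -x %s)" -ex continue\n' "$1"
}
```

For example, `print_debug_cmd memgraph-data-0 memgraph-data /tmp/profile.json` prints the full command line, which can be reviewed before it is run against a real cluster.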

Files changed:

  • scripts/debug-memgraph.sh: New convenience wrapper script
  • charts/memgraph-high-availability/templates/NOTES.txt: Added GDB debugging instructions

Testing

  • Deployed HA chart on AKS cluster and verified the script attaches successfully
  • Confirmed ephemeral container runs as root and can install packages / use ptrace
  • Verified pgrep -x memgraph finds the process PID through shared PID namespace
  • Verified GDB attaches and catches SIGSEGV sent via kubectl exec <pod> -c memgraph-data -- kill -SIGSEGV <PID>
  • Verified the Memgraph pod is never restarted during the debug session
  • No new helm tests added — this is a standalone script that doesn't affect chart rendering or deployment
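The crash-signal and restart checks above could be repeated with helpers along these lines. This is a sketch, not part of the PR: the function names are hypothetical, and `send_test_segv` assumes `pgrep` is available inside the Memgraph container (the description only confirms it in the debug container).

```shell
#!/usr/bin/env sh
# Hypothetical helper: send SIGSEGV to the Memgraph process inside the pod.
# Assumption: pgrep exists in the target container image.
send_test_segv() {
  pod="$1"
  container="${2:-memgraph-data}"
  kubectl exec "$pod" -c "$container" -- \
    sh -c 'kill -s SEGV "$(pgrep -x memgraph)"'
}

# Print the restart count of one container, to confirm the pod was not
# restarted during the debug session.
restart_count() {
  pod="$1"
  container="${2:-memgraph-data}"
  kubectl get pod "$pod" \
    -o jsonpath="{.status.containerStatuses[?(@.name==\"$container\")].restartCount}"
}
```

Comparing `restart_count` before and after the session gives a quick check that attaching GDB did not cause a restart.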

Notes for reviewers

  • Requires kubectl 1.32+ (the --custom flag for kubectl debug went GA in 1.32). Older kubectl versions will fail with an unknown flag error.
  • The apt-get install gdb step inside the ephemeral container adds ~30s latency before GDB attaches. A follow-up could add a pre-built debug image with GDB already installed,
    or a gdbserver sidecar approach for instant attach.
  • The script cleans up the temporary custom profile JSON via trap ... EXIT.
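Two of the notes above lend themselves to a short sketch: a version guard and trap-based cleanup. Both helpers are hypothetical and not taken from the script. How the client version is read from `kubectl version --client` is left out, and `with_temp_profile` is just one possible way to scope a `trap ... EXIT`.

```shell
#!/usr/bin/env sh
# Return 0 if version $1 ("major.minor[...]") is at least $2.
version_at_least() {
  have_major="${1%%.*}"
  need_major="${2%%.*}"
  have_minor="${1#*.}"
  need_minor="${2#*.}"
  have_minor="${have_minor%%[!0-9]*}"   # drop suffixes such as ".1" or "+"
  need_minor="${need_minor%%[!0-9]*}"
  if [ "$have_major" -gt "$need_major" ]; then return 0; fi
  if [ "$have_major" -lt "$need_major" ]; then return 1; fi
  [ "$have_minor" -ge "$need_minor" ]
}

# Run a command with a temporary file path appended as its last argument.
# The body is a subshell, so the EXIT trap fires when the function returns
# and the temporary file never outlives the call.
with_temp_profile() (
  profile="$(mktemp)"
  trap 'rm -f "$profile"' EXIT
  "$@" "$profile"
)
```

A caller could guard the debug flow with `version_at_least "$client_version" 1.32 || exit 1`, where `client_version` is a hypothetical variable holding the detected kubectl version.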

as51340 self-assigned this Feb 25, 2026
as51340 added the memgraph-ha and enhancement labels Feb 26, 2026
as51340 marked this pull request as ready for review Feb 26, 2026 07:51
as51340 changed the title from "feat: Add a script for debugging Memgraph" to "feat: Add a script for debugging Memgraph using ephemeral containers" Feb 26, 2026
gitbuda (Member) commented Feb 26, 2026

If you haven't already started, make sure to update/expand https://memgraph.com/docs/database-management/debugging

as51340 merged commit 5197971 into main Feb 27, 2026
2 checks passed
