feat(collector): implement panic recovery and crash-loop safety#157

Open
wazer24 wants to merge 5 commits into optiqor:main from wazer24:feature/panic-recovery

@wazer24 wazer24 commented May 31, 2026

What

Addresses PR review feedback: replaces brittle string-sniffing exit-code logic with a typed sentinel error, sweeps remaining interface{} back to any, adds missing doc comments for the no-restart contract and unbounded panic logs, and fixes CI failures (govet shadow, missing import close paren, linter pre-build step).

Why

Fixes #54 review comments from @btwshivam

How

  • Typed exit-code error: Introduced cli.ErrDaemonPanic sentinel; runStart assigns it via named return on panic recovery. cmd/kerno/main.go now uses errors.Is(err, cli.ErrDaemonPanic) instead of string-matching err.Error()[:6] == "daemon".
  • interface{} → any: Reverted map[string]interface{} back to map[string]any in healthzHandler; changed r interface{} to r any in HandlePanic and HandleDaemonPanic.
  • Doc comments: Added no-restart contract note on RunSafeCollectorGoroutine; added operator-responsible note for unbounded panic log files on HandlePanic.
  • CI fixes: Renamed shadowed err to rootErr in runStart (govet); restored missing ) after import block; added go build ./... pre-build step before golangci-lint in ci.yml.

Testing

  • go build ./... passes
  • go test ./... passes
  • go vet ./... passes
  • golangci-lint run ./... passes
  • Tested locally with: N/A — no behavioral changes, only error-typing and doc fixes

  • N/A — pure docs/refactor (no BPF/collector/doctor logic changes)

Checklist

  • PR title follows Conventional Commits (feat(core): implement panic recovery and crash-loop safety)
  • All commits are DCO-signed (git commit -s)
  • No unrelated changes pulled in
  • Documentation updated where user-visible behavior changed
  • Added/updated tests for new code paths
  • If a new doctor rule, paired with a chaos scenario — N/A

wazer24 added 2 commits May 31, 2026 09:25
Signed-off-by: wazer24 <24wazer@rbunagpur.in>
Signed-off-by: wazer24 <24wazer@rbunagpur.in>
@wazer24 wazer24 requested a review from btwshivam as a code owner May 31, 2026 04:01
@github-actions github-actions Bot added the level:critical (Touches BPF, security, or release surfaces; auto-applied) label May 31, 2026
@github-actions

🚀 First PR — welcome aboard!

A few things to expect:

  1. CI: every PR runs build + race tests + lint + (eventually) the kernel matrix. If something fails, the log will tell you exactly which gate.
  2. DCO: every commit needs a Signed-off-by: line; git commit -s adds it automatically.
  3. Conventional Commits: PR titles like feat(doctor): add new rule or fix(bpf): handle X. We squash-merge by default.
  4. Review: a maintainer will review within 72 hours. Suggestions are conversations, not orders — push back if something doesn't fit your context.

If you get stuck, reply here or jump to Discussions. We want this PR to land.

@github-actions github-actions Bot added the documentation (Improvements or additions to documentation), testing (Tests and test coverage), area/bpf (eBPF programs and loaders), area/doctor (Diagnostic engine and rules), area/ops (Operations, deployment, runtime ergonomics), and area/community (Community, contributors, governance) labels May 31, 2026
@wazer24 wazer24 closed this May 31, 2026
@wazer24 wazer24 deleted the feature/panic-recovery branch May 31, 2026 04:02
@wazer24 wazer24 restored the feature/panic-recovery branch May 31, 2026 04:08
@wazer24 wazer24 reopened this May 31, 2026
Member

@btwshivam btwshivam left a comment

duplicate of #159 (same 17 files), and lint plus conventional-title are red here. let's land #159 instead, please close this one. thanks.

@wazer24 wazer24 changed the title from "Feature/panic recovery" to "feat(core): implement panic recovery and crash-loop safety" Jun 7, 2026
@wazer24 wazer24 changed the title from "feat(core): implement panic recovery and crash-loop safety" to "feat(collector): implement panic recovery and crash-loop safety" Jun 7, 2026
@wazer24
Author

wazer24 commented Jun 7, 2026

@btwshivam apologies for the delay, but I have fixed the remaining issues my solution had. Please review it and tell me if anything needs to be changed. Thank you.

Labels

  • area/bpf: eBPF programs and loaders
  • area/community: Community, contributors, governance
  • area/doctor: Diagnostic engine and rules
  • area/ops: Operations, deployment, runtime ergonomics
  • documentation: Improvements or additions to documentation
  • level:critical: Touches BPF, security, or release surfaces (auto-applied)
  • testing: Tests and test coverage

2 participants