Skip to content

fix: guard nil debug request in bulk check#3174

Open
ivanauth wants to merge 2 commits into
authzed:mainfrom
ivanauth:fix/issue-3159-bulkcheck-nil-debug
Open

fix: guard nil debug request in bulk check#3174
ivanauth wants to merge 2 commits into
authzed:mainfrom
ivanauth:fix/issue-3159-bulkcheck-nil-debug

Conversation

@ivanauth

Copy link
Copy Markdown
Contributor

CheckBulkPermissions panics when a DebugInformation entry with a nil
Check.Request reaches the resource-ID matching loop. This happens when a
tracing-enabled request shares a singleflight dispatch with a non-tracing
bulk check call — the combined metadata node produced by
combineResponseMetadata never sets Check.Request.

Add a nil guard to skip such entries.

Closes #3159

A DebugInformation entry with a nil Check.Request can reach a
non-tracing CheckBulkPermissions caller when an identical
tracing-enabled request shares its dispatch, causing a nil pointer
dereference. Skip such entries when matching debug info by resource.
@ivanauth ivanauth requested a review from a team as a code owner June 11, 2026 16:48
@github-actions github-actions Bot added area/api v1 Affects the v1 API area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools) labels Jun 11, 2026
@codecov

codecov Bot commented Jun 11, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

josephschorr
josephschorr previously approved these changes Jun 11, 2026

@josephschorr josephschorr left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@miparnisari miparnisari left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this solution? The bug report said

Our CheckPermission calls have with_tracing: true but CheckBulkPermissions has with_tracing: false

how does this translate to if di.Check == nil || di.Check.Request == nil? if i understood correctly, in this particular issue, they would NOT be nil?

You also said

the combined metadata node produced by combineResponseMetadata never sets Check.Request.

is this not the root cause then? Why are we not fixing that?

@ivanauth ivanauth force-pushed the fix/issue-3159-bulkcheck-nil-debug branch from 52ea926 to 1759b8f Compare June 11, 2026 19:31
@github-actions github-actions Bot added the area/dispatch Affects dispatching of requests label Jun 11, 2026
The dispatch/singleflight key does not depend on req.Debug, so a
tracing-enabled check and an otherwise-identical non-tracing check
collapse into one flight. The shared response is cloned back to both
callers, which caused two bugs from the same root cause:

  - A tracing leader leaked its debug trace into a non-tracing follower.
    The trace contains combined nodes (Check.Request == nil, synthesized
    by combineResponseMetadata), which panicked the CheckBulkPermissions
    resource-ID matching loop. (authzed#3159)
  - A non-tracing leader denied a tracing follower its trace entirely,
    making with_tracing nondeterministic under concurrent load.

Dispatch debug-enabled checks directly instead of sharing a flight. The
dispatch cache key is unchanged, so cache sharing across debug levels is
preserved. The bulk loop keeps a nil-Request guard as defense-in-depth,
since combineResponseMetadata legitimately produces such nodes.
@ivanauth ivanauth force-pushed the fix/issue-3159-bulkcheck-nil-debug branch from 1759b8f to 541b949 Compare June 11, 2026 20:26
@ivanauth

Copy link
Copy Markdown
Contributor Author

Thank you for the feedback!

di.Check.Request actually is nil in this case. The node reaching that loop isn't the original per-resource trace node (those do have Request set). It's the wrapper node created by combineResponseMetadata, which intentionally sets fields like TraceId, SourceId, and SubProblems but leaves Request nil. The real trace entries are nested under SubProblems. The panic is consistent with that: slices.Contains(di.Check.Request.ResourceIds, ...) can only nil-deref if di.Check.Request is nil.

I don't think combineResponseMetadata is the bug. A combined node represents multiple sub-checks, so there isn't a single request to attach to it. The code also relies on Request == nil when deciding whether to flatten subproblems, so populating it would change that behavior.

The actual issue is one layer up: the singleflight key doesn't include the debug flag. That allows a with_tracing=true CheckPermission request and a with_tracing=false CheckBulkPermissions request to share the same flight, which can leak debug metadata into the bulk response.

I've reworked the fix so debug-enabled checks no longer share a singleflight flight and instead dispatch directly. That prevents debug metadata from leaking into non-debug requests and fixes the panic at the source. It also fixes the reverse case, where a tracing request could lose trace data because it shared a flight with a non-tracing request.

I kept the nil check as a defensive guard since Request == nil is still a valid shape for combined metadata nodes.

Happy to adjust the approach if you'd prefer it handled differently or close this PR completely since it's getting too complex.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/api v1 Affects the v1 API area/dispatch Affects dispatching of requests area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CheckBulkPermissions panics with nil pointer dereference in bulkcheck.go:189 under concurrent load

3 participants