fix: guard nil debug request in bulk check#3174
Conversation
A DebugInformation entry with a nil Check.Request can reach a non-tracing CheckBulkPermissions caller when an identical tracing-enabled request shares its dispatch, causing a nil pointer dereference. Skip such entries when matching debug info by resource.
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Can you explain this solution? The bug report said
Our CheckPermission calls have with_tracing: true but CheckBulkPermissions has with_tracing: false
how does this translate to if di.Check == nil || di.Check.Request == nil? if i understood correctly, in this particular issue, they would NOT be nil?
You also said
the combined metadata node produced by combineResponseMetadata never sets Check.Request.
is this not the root cause then? Why are we not fixing that?
52ea926 to
1759b8f
Compare
The dispatch/singleflight key does not depend on req.Debug, so a
tracing-enabled check and an otherwise-identical non-tracing check
collapse into one flight. The shared response is cloned back to both
callers, which caused two bugs from the same root cause:
- A tracing leader leaked its debug trace into a non-tracing follower.
The trace contains combined nodes (Check.Request == nil, synthesized
by combineResponseMetadata), which panicked the CheckBulkPermissions
resource-ID matching loop. (authzed#3159)
- A non-tracing leader denied a tracing follower its trace entirely,
making with_tracing nondeterministic under concurrent load.
Dispatch debug-enabled checks directly instead of sharing a flight. The
dispatch cache key is unchanged, so cache sharing across debug levels is
preserved. The bulk loop keeps a nil-Request guard as defense-in-depth,
since combineResponseMetadata legitimately produces such nodes.
1759b8f to
541b949
Compare
|
Thank you for the feedback!
I don't think The actual issue is one layer up: the singleflight key doesn't include the debug flag. That allows a I've reworked the fix so debug-enabled checks no longer share a singleflight flight and instead dispatch directly. That prevents debug metadata from leaking into non-debug requests and fixes the panic at the source. It also fixes the reverse case, where a tracing request could lose trace data because it shared a flight with a non-tracing request. I kept the nil check as a defensive guard since Happy to adjust the approach if you'd prefer it handled differently or close this PR completely since it's getting too complex. |
CheckBulkPermissions panics when a DebugInformation entry with a nil
Check.Request reaches the resource-ID matching loop. This happens when a
tracing-enabled request shares a singleflight dispatch with a non-tracing
bulk check call — the combined metadata node produced by
combineResponseMetadata never sets Check.Request.
Add a nil guard to skip such entries.
Closes #3159