fix(rpc): stop classifying expected gRPC responses as trace errors #1812

Merged
Mirko-von-Leipzig merged 1 commit into next from sergerad-api-false-errors
Mar 20, 2026

@sergerad
Collaborator

Problem

On devnet we regularly observe spikes of errors in our traces, all originating from tower_http::trace::on_failure. The classifier installed by TraceLayer::new_for_grpc() treats every non-Ok gRPC status code as a failure, which emits ERROR-level trace events. The errors include:

  • RESOURCE_EXHAUSTED (code 8) — rate limiting working as intended
  • UNIMPLEMENTED (code 12) — security scanners probing non-existent RPC methods
  • UNKNOWN (code 2) — scanners probing paths like .env, .git/config, .aws/credentials

None of these are actual server errors — they're either expected operational behavior (rate limiting) or client/scanner misbehavior.

Solution

Replace TraceLayer::new_for_grpc() with a customized GrpcErrorsAsFailures classifier that treats client-fault and expected-operational codes as successes:

| Code | Name | Rationale |
|------|------|-----------|
| 2 | Unknown | Scanner probing garbage paths |
| 3 | InvalidArgument | Client validation errors |
| 5 | NotFound | Scanner probing non-existent resources |
| 8 | ResourceExhausted | Rate limiting (expected behavior) |
| 12 | Unimplemented | Scanner hitting non-existent RPC methods |

These responses are still returned to clients with the same gRPC status codes — the only change is that they no longer trigger on_failure (ERROR-level) in traces, and instead flow through on_response (DEBUG-level).

Codes that remain classified as failures: Internal, Unavailable, DataLoss, DeadlineExceeded, Cancelled, PermissionDenied, Unauthenticated.
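
The classification rule described above can be sketched in plain Rust. This is a minimal, standard-library-only illustration of the decision and not the code from this PR: the names `EXPECTED_CODES` and `is_trace_failure` are hypothetical, and the numeric values are the standard gRPC status codes. In the real change the same rule is presumably expressed by configuring tower_http's GrpcErrorsAsFailures classifier rather than by a hand-written function.

```rust
/// gRPC status codes that should no longer count as trace failures.
/// Hypothetical constant for illustration; values follow the standard
/// gRPC status code numbering.
const EXPECTED_CODES: [i32; 5] = [
    2,  // Unknown: scanners probing garbage paths
    3,  // InvalidArgument: client validation errors
    5,  // NotFound: scanners probing non-existent resources
    8,  // ResourceExhausted: rate limiting (expected behavior)
    12, // Unimplemented: scanners hitting non-existent RPC methods
];

/// Hypothetical helper: returns true when a response carrying `code`
/// should be reported through on_failure (ERROR-level), and false when
/// it should flow through on_response instead.
pub fn is_trace_failure(code: i32) -> bool {
    // Code 0 is Ok and is never a failure; the expected codes are exempt.
    code != 0 && !EXPECTED_CODES.contains(&code)
}
```

With this rule, `is_trace_failure(8)` (ResourceExhausted) is false, while `is_trace_failure(13)` (Internal) and `is_trace_failure(14)` (Unavailable) stay true. The status code returned to the client is unaffected in every case.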

@sergerad sergerad added the no changelog This PR does not require an entry in the `CHANGELOG.md` file label Mar 19, 2026
@sergerad sergerad requested a review from drahnr March 20, 2026 02:19
Collaborator

@Mirko-von-Leipzig Mirko-von-Leipzig left a comment

Thank you!

@Mirko-von-Leipzig Mirko-von-Leipzig merged commit 2373677 into next Mar 20, 2026
18 of 19 checks passed
@Mirko-von-Leipzig Mirko-von-Leipzig deleted the sergerad-api-false-errors branch March 20, 2026 06:04
3 participants