Skip to content

v3rpc: return Unavailable instead of Unknown for ErrNotPrimary#21684

Open
blackwell-systems wants to merge 2 commits into
etcd-io:mainfrom
blackwell-systems:fix-errnotprimary-grpc-code
Open

v3rpc: return Unavailable instead of Unknown for ErrNotPrimary#21684
blackwell-systems wants to merge 2 commits into
etcd-io:mainfrom
blackwell-systems:fix-errnotprimary-grpc-code

Conversation

@blackwell-systems
Copy link
Copy Markdown

When LeaseRenew returns lease.ErrNotPrimary (e.g. after a leader switch), the error was not mapped in toGRPCErrorMap, causing togRPCError to fall through to status.Error(codes.Unknown, err.Error()). Clients seeing Unknown (code 2) do not know to retry.

This adds ErrGRPCLeaseNotPrimary with codes.Unavailable (14) and maps lease.ErrNotPrimary to it, so clients receive a retryable status code. This is consistent with how ErrLeaderChanged and ErrNoLeader are already mapped to codes.Unavailable.

Changes:

  • api/v3rpc/rpctypes/error.go: add ErrGRPCLeaseNotPrimary, ErrLeaseNotPrimary, and the errStringToError mapping
  • server/etcdserver/api/v3rpc/util.go: add lease.ErrNotPrimary to toGRPCErrorMap
  • server/etcdserver/api/v3rpc/util_test.go: add test case

Tests pass:

ok  go.etcd.io/etcd/server/v3/etcdserver/api/v3rpc (TestGRPCError)
ok  go.etcd.io/etcd/api/v3/v3rpc/rpctypes (TestConvert)

Fixes #21671

@k8s-ci-robot
Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: blackwell-systems
Once this PR has been reviewed and has the lgtm label, please assign siyuanfoundation for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Copy link
Copy Markdown

Hi @blackwell-systems. Thanks for your PR.

I'm waiting for a etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@serathius
Copy link
Copy Markdown
Member

/ok-to-test

@serathius
Copy link
Copy Markdown
Member

Please provide a test

@blackwell-systems
Copy link
Copy Markdown
Author

Added TestLeaseNotPrimaryReturnsUnavailable in 53cf384. It explicitly asserts the gRPC status code is codes.Unavailable, verifying that clients can distinguish this transient leadership error from permanent codes.Unknown failures and retry accordingly.

The existing TestGRPCError table also has a row for the mapping (lease.ErrNotPrimaryrpctypes.ErrGRPCLeaseNotPrimary).

When LeaseRenew returns lease.ErrNotPrimary (e.g. after a leader
switch), the error was not mapped in toGRPCErrorMap, causing
togRPCError to return codes.Unknown (2). Clients seeing Unknown
do not know to retry.

Add ErrGRPCLeaseNotPrimary with codes.Unavailable and map
lease.ErrNotPrimary to it, so clients receive a retryable status
code.

Fixes etcd-io#21671

Signed-off-by: Dayna Blackwell <daynajblackwell@gmail.com>
Signed-off-by: Dayna Blackwell <blackwellsystems@protonmail.com>
Adds TestLeaseNotPrimaryReturnsUnavailable which verifies the gRPC
status code is codes.Unavailable, ensuring clients can distinguish
transient leadership errors from permanent failures and retry.

Signed-off-by: Dayna Blackwell <blackwellsystems@protonmail.com>
@blackwell-systems blackwell-systems force-pushed the fix-errnotprimary-grpc-code branch from 53cf384 to 79f4b60 Compare April 29, 2026 07:14
@blackwell-systems
Copy link
Copy Markdown
Author

/retest

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 69.24%. Comparing base (31ba53a) to head (79f4b60).

Additional details and impacted files
Files with missing lines Coverage Δ
api/v3rpc/rpctypes/error.go 90.47% <ø> (ø)
server/etcdserver/api/v3rpc/util.go 67.74% <ø> (ø)

... and 29 files with indirect coverage changes

@@            Coverage Diff             @@
##             main   #21684      +/-   ##
==========================================
+ Coverage   69.09%   69.24%   +0.14%     
==========================================
  Files         432      432              
  Lines       35544    35544              
==========================================
+ Hits        24560    24613      +53     
+ Misses       9595     9544      -51     
+ Partials     1389     1387       -2     

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 31ba53a...79f4b60. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

Unexpected Unknown grpc code for ErrNotPrimary

3 participants