-
Notifications
You must be signed in to change notification settings - Fork 104
feat(spanner): add ClientContext support to options #1495
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This change adds support for ClientContext in Options and ensures it is propagated to ExecuteSql, Read, Commit, and BeginTransaction requests. It aligns with go/spanner-client-scoped-session-state design. ClientContext allows passing opaque, RPC-scoped side-channel information (like application-level user context) to Spanner. This implementation supports setting ClientContext at the Client, Database, and Request levels, with request-level options taking precedence. Key changes: - Added ClientContext to types/spanner.py and exposed it. - Updated Client.__init__ to accept a default client_context. - Added helpers for merging ClientContext with correct precedence. - Updated Snapshot, Transaction, Batch, and Database wrappers to propagate the context. - Added comprehensive unit tests in tests/unit/test_client_context.py.
Summary of ChangesHello @aseering, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request adds support for ClientContext throughout the Spanner client library, enabling its propagation to various requests. However, a critical flaw was identified in the _merge_client_context helper function, which modifies the base context object in place. This can lead to cross-request context leakage, potentially resulting in unauthorized data access or privilege escalation if the context is used for security-sensitive decisions. Additionally, there are suggestions to enhance code quality, such as replacing type() is dict checks with isinstance() for type validation and a refactoring suggestion to improve logic clarity.
| combined = base or ClientContext() | ||
| if type(combined) is dict: | ||
| combined = ClientContext(combined) | ||
|
|
||
| merge = merge or ClientContext() | ||
| if type(merge) is dict: | ||
| merge = ClientContext(merge) | ||
|
|
||
| type(combined).pb(combined).MergeFrom(type(merge).pb(merge)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The _merge_client_context function critically modifies the base object in place, leading to cross-request context leakage. This occurs because combined = base or ClientContext() creates a reference to the base object, and subsequent operations modify the underlying protobuf message of the base object. This in-place modification causes client_context from one request to leak into all subsequent requests using the same Client instance.
This is a high-severity vulnerability as ClientContext is used for SECURE_CONTEXT() calls in Spanner, which are vital for row-level security or authorization decisions in Secure Views. Such leakage can result in unauthorized data access or privilege escalation. The provided code suggestion addresses this by ensuring proper merging without in-place modification, and also incorporates robust type checking using isinstance().
| combined = base or ClientContext() | |
| if type(combined) is dict: | |
| combined = ClientContext(combined) | |
| merge = merge or ClientContext() | |
| if type(merge) is dict: | |
| merge = ClientContext(merge) | |
| type(combined).pb(combined).MergeFrom(type(merge).pb(merge)) | |
| combined_pb = ClientContext()._pb | |
| if base: | |
| combined_pb.MergeFrom(ClientContext(base)._pb if isinstance(base, dict) else base._pb) | |
| if merge: | |
| combined_pb.MergeFrom(ClientContext(merge)._pb if isinstance(merge, dict) else merge._pb) | |
| combined = ClientContext(combined_pb) |
| if type(merge) is dict: | ||
| merge = ClientContext(merge) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| elif type(request_options) is dict: | ||
| request_options = RequestOptions(request_options) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For type checking, it's generally better to use isinstance(obj, cls) rather than type(obj) is cls. isinstance correctly handles subclasses, making the code more robust.
| elif type(request_options) is dict: | |
| request_options = RequestOptions(request_options) | |
| elif isinstance(request_options, dict): | |
| request_options = RequestOptions(request_options) |
| if client_context is not None: | ||
| if type(client_context) is dict: | ||
| client_context = ClientContext(client_context) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For type checking, it's generally better to use isinstance(obj, cls) rather than type(obj) is cls. isinstance correctly handles subclasses, making the code more robust.
| if client_context is not None: | |
| if type(client_context) is dict: | |
| client_context = ClientContext(client_context) | |
| if isinstance(client_context, dict): | |
| client_context = ClientContext(client_context) | |
| elif not isinstance(client_context, ClientContext): |
| if client_context is not None: | ||
| if type(client_context) is dict: | ||
| client_context = ClientContext(client_context) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For type checking, it's generally better to use isinstance(obj, cls) rather than type(obj) is cls. isinstance correctly handles subclasses, making the code more robust.
| if client_context is not None: | |
| if type(client_context) is dict: | |
| client_context = ClientContext(client_context) | |
| if isinstance(client_context, dict): | |
| client_context = ClientContext(client_context) | |
| elif not isinstance(client_context, ClientContext): |
| experimental_host=None, | ||
| disable_builtin_metrics=False, | ||
| client_context=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For type checking, it's generally better to use isinstance(obj, cls) rather than type(obj) is cls. isinstance correctly handles subclasses, making the code more robust. This pattern of using type() is dict appears in several places in this PR and should be updated for consistency.
| experimental_host=None, | |
| disable_builtin_metrics=False, | |
| client_context=None, | |
| if isinstance(client_context, dict): | |
| client_context = ClientContext(client_context) | |
| elif not isinstance(client_context, ClientContext): |
| if type(client_context) is dict: | ||
| client_context = ClientContext(client_context) | ||
| elif not isinstance(client_context, ClientContext): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For type checking, it's generally better to use isinstance(obj, cls) rather than type(obj) is cls. isinstance correctly handles subclasses, making the code more robust.
| if type(client_context) is dict: | |
| client_context = ClientContext(client_context) | |
| elif not isinstance(client_context, ClientContext): | |
| if isinstance(client_context, dict): | |
| client_context = ClientContext(client_context) | |
| elif not isinstance(client_context, ClientContext): |
| client_context = _merge_client_context( | ||
| database._instance._client._client_context, self._client_context | ||
| ) | ||
| if client_context: | ||
| begin_request_kwargs["request_options"] = _merge_request_options( | ||
| begin_request_kwargs.get("request_options"), client_context | ||
| ) | ||
|
|
||
| if transaction_tag: | ||
| request_options = begin_request_kwargs.get("request_options") | ||
| if request_options is None: | ||
| request_options = RequestOptions(transaction_tag=transaction_tag) | ||
| else: | ||
| request_options.transaction_tag = transaction_tag | ||
| begin_request_kwargs["request_options"] = request_options |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic for creating and updating request_options with client_context and transaction_tag can be simplified for better readability and maintainability. Consider building the request_options object in a more linear fashion before assigning it to begin_request_kwargs.
| client_context = _merge_client_context( | |
| database._instance._client._client_context, self._client_context | |
| ) | |
| if client_context: | |
| begin_request_kwargs["request_options"] = _merge_request_options( | |
| begin_request_kwargs.get("request_options"), client_context | |
| ) | |
| if transaction_tag: | |
| request_options = begin_request_kwargs.get("request_options") | |
| if request_options is None: | |
| request_options = RequestOptions(transaction_tag=transaction_tag) | |
| else: | |
| request_options.transaction_tag = transaction_tag | |
| begin_request_kwargs["request_options"] = request_options | |
| request_options = begin_request_kwargs.get("request_options") | |
| client_context = _merge_client_context( | |
| database._instance._client._client_context, self._client_context | |
| ) | |
| request_options = _merge_request_options(request_options, client_context) | |
| if transaction_tag: | |
| if request_options is None: | |
| request_options = RequestOptions() | |
| request_options.transaction_tag = transaction_tag | |
| if request_options: | |
| begin_request_kwargs["request_options"] = request_options |
This change adds support for ClientContext in Options and ensures it is propagated to ExecuteSql, Read, Commit, and BeginTransaction requests. It aligns with go/spanner-client-scoped-session-state design.
ClientContext allows passing opaque, RPC-scoped side-channel information (like application-level user context) to Spanner. This implementation supports setting ClientContext at the Client, Database, and Request levels, with request-level options taking precedence.
Key changes:
Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
Fixes #<issue_number_goes_here> 🦕