Skip to content

Conversation

@keyurva
Copy link
Contributor

@keyurva keyurva commented Jan 29, 2026

  • Updates DCClient to use the new fetch_indicators method from the python library.
  • Adds USE_SEARCH_INDICATORS flag (default: False) to toggle between the new client method and the search-indicators endpoint.
    • Tested both options manually and it works as expected.
    • Will create a bug to remove search-indicators completely in the next few weeks.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @keyurva, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the indicator search mechanism within the datacommons-mcp package. It integrates a more modern and efficient fetch_indicators method from the underlying datacommons-client library. To ensure a smooth transition and allow for thorough validation, a feature flag has been implemented, providing the flexibility to switch between the new and existing search functionalities. This change aims to improve the robustness and maintainability of the search capabilities.

Highlights

  • New Indicator Search Method: The DCClient has been updated to utilize the new fetch_indicators method from the datacommons-client Python library for searching indicators.
  • Feature Flag for Search Method: A new configuration flag, USE_SEARCH_INDICATORS (defaulting to False), has been introduced. This flag allows toggling between the new fetch_indicators client method and the legacy search-indicators endpoint, enabling a controlled rollout and testing.
  • Dependency Update: The datacommons-client dependency has been updated to version 2.1.6 and is temporarily sourced from testpypi. This is a temporary measure until a production release of the package is available on the standard PyPI.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively introduces a new method for fetching indicators using the datacommons-client library, controlled by a feature flag. This is a solid strategy for migrating to a new API. The changes are well-structured and include a good test case for the new flag. My review includes a few suggestions for improvement: ensuring API response consistency by adding a potentially missing description field, refining exception handling to be more specific for better error diagnostics and to prevent sensitive data exposure, and clarifying an apparent removal of Python 3.12 support in the project's metadata.

Comment on lines +449 to +457
results.append(
{
"SV": candidate.dcid,
"CosineScore": score,
"alternate_descriptions": [sentence]
if sentence
else [],
}
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The dictionary being created here for the new _call_fetch_indicators path is missing the "description" key, which was present in the results from the old _transform_search_indicators_to_svs_format method. This will cause result.get("description") in _search_vector to return None, and the final API response will be missing descriptions for variables.

If the sentence from the metadata is intended to be the main description, it should be added here to maintain consistency with the old API path. If not, this change in the response contract should be confirmed as intentional.

Suggested change
results.append(
{
"SV": candidate.dcid,
"CosineScore": score,
"alternate_descriptions": [sentence]
if sentence
else [],
}
)
results.append(
{
"SV": candidate.dcid,
"CosineScore": score,
"description": sentence,
"alternate_descriptions": [sentence]
if sentence
else [],
}
)

)
results_map[query] = results

except Exception as e:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Catching a broad Exception can hide bugs and make debugging difficult. If the datacommons-client library defines more specific exceptions for API or network errors, it would be better to catch those explicitly. A general except Exception: could then be used as a fallback for unexpected errors. This would improve error handling and observability. Additionally, when logging exceptions, avoid logging the entire exception object directly as it may contain sensitive information. Instead, log only the exception type or a sanitized message.

For example:

try:
    # ...
except datacommons.client.DCException as e:
    logger.error("API error calling fetch_indicators: %s", type(e).__name__)
except requests.exceptions.RequestException as e:
    logger.error("Network error calling fetch_indicators: %s", type(e).__name__)
except Exception as e:
    logger.error(
        "Unexpected error calling fetch_indicators for queries '%s': %s", queries, type(e).__name__
    )

(Assuming datacommons.client.DCException and requests.exceptions.RequestException are relevant exceptions)

References
  1. Do not log the entire exception object directly, as it may contain sensitive information like API keys. Instead, log only the exception type or a sanitized message to prevent sensitive data exposure.

"License :: OSI Approved :: Apache Software License",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The classifier for Python 3.12 support has been removed. Was this intentional? If the project no longer supports Python 3.12, this should be documented. If it was an accident, it should be restored to avoid potential compatibility issues for users on Python 3.12.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@keyurva keyurva marked this pull request as ready for review January 30, 2026 00:41
Copy link
Contributor

@clincoln8 clincoln8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Keyur! A couple nonblocking nits/suggestions.


use_search_indicators: bool = Field(
default=False,
alias="USE_SEARCH_INDICATORS",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we maintain the DC_ prefix for this envvar?

Suggested change
alias="USE_SEARCH_INDICATORS",
alias="DC_USE_SEARCH_INDICATORS",

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Done.


def _call_fetch_indicators(self, queries: list[str]) -> dict:
"""
Helper method to call the new datacommons-client fetch_indicators method.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
Helper method to call the new datacommons-client fetch_indicators method.
Helper method to call the datacommons-client fetch_indicators and transform the response.
Returns:
dict[str, list[dict]]: A mapping where each key is the requested query and the
value is a list of candidate matches. Each candidate dict contains:
- 'SV' (str): The identifier (DCID) of the matched entity.
- 'CosineScore' (float): The similarity score for the match.
- 'alternate_descriptions' (list[str]): A list containing the
contextual sentence if available, otherwise an empty list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

"License :: OSI Approved :: Apache Software License",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[dependency-groups]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how is this used? should we just put that in the project.optional-dependencies.test group?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should. Done.

use_search_indicators: bool = Field(
default=False,
alias="USE_SEARCH_INDICATORS",
description="Whether to use the legacy search-indicators endpoint logic (True) or the new client library logic (False).",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description="Whether to use the legacy search-indicators endpoint logic (True) or the new client library logic (False).",
description="Whether to use the legacy search-indicators endpoint (True) or the client library (False) for fetching indicators.",

Copy link
Contributor Author

@keyurva keyurva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review!


def _call_fetch_indicators(self, queries: list[str]) -> dict:
"""
Helper method to call the new datacommons-client fetch_indicators method.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


use_search_indicators: bool = Field(
default=False,
alias="USE_SEARCH_INDICATORS",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Done.

"License :: OSI Approved :: Apache Software License",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[dependency-groups]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should. Done.

@keyurva keyurva merged commit ee499e4 into datacommonsorg:main Jan 30, 2026
8 of 9 checks passed
@keyurva keyurva deleted the resolve-indicators branch January 30, 2026 03:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants