Fix dataset prefix matching in STAC server by eloramirez1356 · Pull Request #9 · dClimate/dclimate-client-py

eloramirez1356 · 2026-06-15T17:45:18Z

This pull request improves the accuracy and reliability of dataset resolution in the dClimate client, specifically when dealing with datasets whose names are prefixes of other datasets (such as precipitation_total and precipitation_total_land). The changes ensure that only exact dataset matches are returned, preventing accidental conflation of similarly named datasets. The update also adds comprehensive tests and clarifies usage in the documentation.

Exact dataset matching and bug fix:

Refactored the dataset matching logic in resolve_cid_from_stac_server to use a new helper function _feature_matches_dataset, ensuring that only exact dataset IDs are matched and avoiding prefix collisions (e.g., precipitation_total vs. precipitation_total_land). [1] [2]

Testing improvements:

Added new tests to tests/test_stac_server_listing.py that verify resolve_cid_from_stac_server correctly handles datasets with similar prefixes, rejects prefix-only matches, and supports legacy ID fallback, ensuring robust behavior for edge cases.

Documentation update:

Updated the README.md to clarify the distinction between ERA5 and ERA5-Land datasets, including usage examples and recommendations to inspect dataset names before loading.

Minor codebase updates:

Added resolve_cid_from_stac_server to the import list in tests/test_stac_server_listing.py to support the new tests.

Summary by CodeRabbit

Documentation
- Added a README usage example showing how to load ERA5-Land datasets with the expected load_dataset pattern (including precipitation examples).
Bug Fixes
- Improved STAC CID resolution to use exact dataset matching, preventing collisions with similarly named datasets and improving legacy-id fallback behavior.
Tests
- Added unit tests for exact matching, collision handling, and legacy-id fallback cases in STAC CID resolution.
Chores
- Bumped package version to 0.5.9.

coderabbitai · 2026-06-15T17:45:37Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cd717e14-4736-4140-aff0-5d3e7690697d

📥 Commits

Reviewing files that changed from the base of the PR and between bb25f35 and 7b4e22f.

📒 Files selected for processing (1)

pyproject.toml

📝 Walkthrough

Walkthrough

resolve_cid_from_stac_server in stac_server.py gains two private helpers that perform exact dataset-id matching via the collection field and properties["dclimate:dataset_id"] (or a parsed fallback), replacing the old prefix-startswith filter. Three new unit tests cover the updated resolution behavior, README adds an ERA5-Land usage example, and the package version is bumped to 0.5.9.

Changes

STAC Strict Dataset Matching and ERA5-Land Docs

Layer / File(s)	Summary
Exact dataset matching implementation `dclimate_client_py/stac_server.py`	Two new internal helpers parse the dataset segment from a feature `id` and validate exact membership by checking `collection` then `dclimate:dataset_id` or parsed id. The existing prefix-startswith filter in `resolve_cid_from_stac_server` is replaced with the new exact-match helper.
Unit tests, README example, and release update `tests/test_stac_server_listing.py`, `README.md`, `pyproject.toml`	Import statement extended to include `resolve_cid_from_stac_server`. Three new tests verify exact-match CID resolution, `ValueError` on prefix-only collisions, and legacy-id fallback CID resolution. README adds an `async def main_era5_land()` example showing `load_dataset` calls for ERA5 and ERA5-Land precipitation datasets. Package version is bumped from 0.5.8 to 0.5.9.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐇 Hoppity-hop through the STAC fields I go,
No more confusing "land" from the plain ERA flow!
I sniff the dataset_id, I parse out the name,
Prefix collisions? No more of that game.
With tests and a README to light up the way,
Precipitation is perfectly sorted today! ☁️

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 22.22% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Fix dataset prefix matching in STAC server' directly and concisely describes the main change—resolving a dataset resolution bug caused by prefix-based matching logic.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch load-dataset-bug

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dclimate_client_py/stac_server.py`:
- Around line 78-82: The code in this area has not been formatted according to
Ruff's formatting standards, causing the CI pre-commit check to fail. Run the
pre-commit tool or ruff-format command locally to automatically reformat the
code in the stac_server.py file to comply with the project's formatting
requirements, then commit these formatting changes before merging the pull
request.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 61e36dab-5972-4d68-b1de-07e328f04ad2

📥 Commits

Reviewing files that changed from the base of the PR and between 63f9a1c and 2b52153.

📒 Files selected for processing (3)

README.md
dclimate_client_py/stac_server.py
tests/test_stac_server_listing.py

fix: fix STAC server dataset prefix matching

2b52153

eloramirez1356 self-assigned this Jun 15, 2026

coderabbitai Bot reviewed Jun 15, 2026

View reviewed changes

Comment thread dclimate_client_py/stac_server.py Outdated

eloramirez1356 and others added 2 commits June 16, 2026 08:55

style: apply Ruff formatting to STAC server changes

bb25f35

fix: update version

7b4e22f

TheGreatAlgo merged commit bc93bb9 into main Jun 16, 2026
3 of 4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix dataset prefix matching in STAC server#9

Fix dataset prefix matching in STAC server#9
TheGreatAlgo merged 3 commits into
mainfrom
load-dataset-bug

eloramirez1356 commented Jun 15, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 15, 2026 •

edited

Loading

Review failed

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

eloramirez1356 commented Jun 15, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

eloramirez1356 commented Jun 15, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 15, 2026 •

edited

Loading