feat(esm-tools-plus/simcat): add esm_catalog — STAC-based experiment catalog#1473
Open
siligam wants to merge 2 commits into
… for ESM-Tools

Introduces the esm_catalog package as part of the ESM-Tools-plus/simcat initiative. Provides a DuckDB-backed STAC API, scanner, CLI, and MCP server for cataloguing and querying climate model experiment output. See src/esm_catalog/ARCHITECTURE.md for design overview.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Move catalog deps (duckdb, pyarrow, pystac, shapely, cfgrib, etc.)
to extras_require["catalog"] to fix pip install in CI
- Add from __future__ import annotations for Python 3.8 compatibility
- Guard duckdb import with try/except for optional install
- Exclude src/esm_catalog and src/esm_viz from pytest auto-discovery
- Fix esm_motd _get_real_dir_from_pth_file("") crash with try/except
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
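
The optional-import guard mentioned in the commit above could look roughly like the sketch below. This is an illustration and not the code in this PR: the helper names optional_import and require_duckdb are hypothetical, and the error message wording is an assumption.

```python
from __future__ import annotations  # keeps "X | None" hints valid on Python 3.8

import importlib
from types import ModuleType


def optional_import(name: str) -> ModuleType | None:
    """Return the named module, or None when it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# duckdb is only available when the "catalog" extra has been installed.
duckdb = optional_import("duckdb")


def require_duckdb() -> ModuleType:
    """Hypothetical helper: fail at the point of use with an actionable message."""
    if duckdb is None:
        raise RuntimeError(
            "duckdb is not installed; install the 'catalog' extra to use esm_catalog"
        )
    return duckdb
```

Deferring the failure to the point of use means the rest of ESM-Tools can still be imported in environments, such as CI, where the catalog dependencies are absent.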
Context
Part of the ESM-Tools-plus/simcat initiative, which adds a STAC-based experiment catalog to ESM-Tools for indexing, querying and browsing climate model output.
This PR introduces esm_catalog: the catalog API, scanner, and CLI. A companion PR adds esm_viz: the visualization service.

What's included

src/esm_catalog/: the package (108 files, ~32k lines)
- api/
- scan/
- stac/
- storage/
- hpc/
- integration/
- mcp/
- cli.py: esm-catalog CLI (scan, serve, mcp, reindex, …)

STAC extensions defined:
- hpc: HPC system name on assets (hpc:system)
- namelist: F90 namelist parameters (nml:*)
- paleo: deep-time metadata (paleo:years_bp)
- datacube: variable/dimension axes
- contacts: experiment owner info

API highlights:
- GET /collections: all experiments, filterable via CQL2
- GET /experiments: experiment-level search with CQL2 (component=, variable=, nml:radctl.co2vmr > 284)
- GET /search: STAC item search
- GET /queryables: OGC queryables for browser filter UI
- GET /paleo-presets: named paleo time periods (LGM, mid-Holocene, …)
- GET /collections/{id}/nml-parameters: namelist params for an experiment
- … (/personal/…)
- … (/catalogs) + web admin UI at /ui

Option A catalog layout (current): one STAC Collection per experiment, components stored as item properties (properties.component), enabling cross-component queries without per-component collection proliferation.

tests/test_esm_catalog/: test suite, ~4000 lines covering API, CLI, storage, scanning, personal collections, integration, HPC detection, and STAC object construction.
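
To make the Option A layout concrete, here is a minimal in-memory sketch. It does not use the package's storage or API code. The item dictionaries are heavily trimmed stand-ins for STAC items, and the experiment ids, item ids, the temp2 variable name, and the find_items helper are all hypothetical.

```python
from __future__ import annotations

# One collection per experiment; the model component lives in properties.component.
ITEMS = [
    {"id": "item-1", "collection": "exp-001",
     "properties": {"component": "fesom", "variable": "sst"}},
    {"id": "item-2", "collection": "exp-001",
     "properties": {"component": "echam", "variable": "temp2"}},
    {"id": "item-3", "collection": "exp-002",
     "properties": {"component": "fesom", "variable": "sst"}},
]


def find_items(items: list[dict], **props: str) -> list[dict]:
    """Return items whose properties match every given key/value pair."""
    return [
        item for item in items
        if all(item["properties"].get(key) == value for key, value in props.items())
    ]


# A single property filter spans experiments; no per-component collections needed.
fesom_items = find_items(ITEMS, component="fesom")
print(sorted({item["collection"] for item in fesom_items}))
```

Because the component is an ordinary item property, the same filter mechanism covers component, variable, or any combination of the two.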
Shared infrastructure (also in this PR)
- pixi.toml / pixi.lock
- setup.py
- .github/workflows/docker-esm-catalog.yml
- docs/esm_catalog_*.rst
- examples/fesom_stac/
- examples/echam_grib/
- examples/notebooks/
- utils/

Key design decisions
Test plan
- pixi run pytest tests/test_esm_catalog/ passes
- esm-catalog scan --help and esm-catalog serve --help run without error
- esm-catalog scan <experiment_path> produces a catalog.duckdb
- esm-catalog serve --catalog catalog.duckdb serves GET /collections and GET /search
- GET /queryables returns variables, components, experiment types
- ?filter=variable='sst' returns only SST items

🤖 Generated with Claude Code
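
For the filter check in the test plan, the request URL needs the CQL2 text percent-encoded. A small sketch of building such URLs with the standard library follows. The base URL is an assumption (wherever esm-catalog serve happens to listen), filter_url is a hypothetical helper, and applying the filter parameter to /search is inferred from the test plan rather than confirmed against the server code.

```python
from urllib.parse import urlencode

# Assumption: address of a locally running "esm-catalog serve" instance.
BASE_URL = "http://localhost:8000"


def filter_url(endpoint: str, cql2_text: str, base_url: str = BASE_URL) -> str:
    """Build a GET URL that carries a CQL2 text expression in ?filter=."""
    query = urlencode({"filter": cql2_text})  # encodes quotes, '=', '>' and spaces
    return f"{base_url}{endpoint}?{query}"


print(filter_url("/search", "variable='sst'"))
print(filter_url("/experiments", "nml:radctl.co2vmr > 284"))
```

The resulting URLs can be passed to any HTTP client once the server is running.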