Draft
29 commits
c2cc978
feat(data_input): add jretrievedwh subprocess wrapper + station catalog
clairemerker Jun 12, 2026
fdf18c4
test(data_input): cover StationCatalog.from_meta
clairemerker Jun 12, 2026
49479e9
feat(data_input): implement load_obs_data_from_jretrieve
clairemerker Jun 12, 2026
e903490
feat(data_input): forward truth root marker to jretrieve loader
clairemerker Jun 12, 2026
9babcc4
feat(workflow): make truth input conditional for live jretrieve source
clairemerker Jun 12, 2026
d172fb5
docs: document jretrievedwh truth source config + prerequisites
clairemerker Jun 12, 2026
cea206c
style: apply ruff format to jretrieve source and tests
clairemerker Jun 12, 2026
7aa6ab9
docs: correct SwissMetNet abbreviation SNM -> SMN
clairemerker Jun 12, 2026
5cd0fa8
feat(jretrieve): fail-fast check for jretrievedwh.py on PATH + OPR_HOME
clairemerker Jun 12, 2026
ba88587
fix issue with meas_group/stn_group
jonasbhend Jun 12, 2026
2182e3b
fix inadvertent change to config and update readme
jonasbhend Jun 12, 2026
54da55e
test(jretrieve): mock check_prerequisites in loader test for CI
clairemerker Jun 12, 2026
5e0caec
remove peakweather
jonasbhend Jun 12, 2026
a2e4e99
remove truth hash, as this is redundant (and not used)
jonasbhend Jun 12, 2026
62c6cf8
only retrieve necessary timesteps
jonasbhend Jun 15, 2026
4c92452
fail with error if not all time steps are available
jonasbhend Jun 15, 2026
0add965
use dedicated varda credentials for jretrieve
jonasbhend Jun 15, 2026
ae0a8c7
update dependencies
jonasbhend Jun 15, 2026
065a5d9
fix failing test
jonasbhend Jun 15, 2026
ca4a223
Add SP_10M to list of DWH parameters
jonasbhend Jun 18, 2026
7d52c90
Merge branch 'main' into feat/jretrieve
dnerini Jun 18, 2026
86d3380
fix README
jonasbhend Jun 18, 2026
bdfb59e
extend config to add lapse-rate correction (defaults to true)
jonasbhend Jun 18, 2026
3a8fc19
add elevation data for ICON and KENDA-CH1
jonasbhend Jun 18, 2026
c74a3e7
implement lapse-rate correction
jonasbhend Jun 18, 2026
5103ab4
Apply lapse-rate correction before computing scores
jonasbhend Jun 18, 2026
a6731ef
Add lapse-rate flag to truth-hash to trigger reruns
jonasbhend Jun 18, 2026
3378791
assert elevation is in station data
jonasbhend Jun 18, 2026
19f7a54
Test lapse-rate correction
jonasbhend Jun 18, 2026
48 changes: 48 additions & 0 deletions .jretrievedwh-conf.prod.py
@@ -0,0 +1,48 @@
import base64
import json
import os
import urllib.request
from pathlib import Path


def _read_dotenv(path):
    result = {}
    try:
        with open(path) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                result[key.strip()] = value.strip().strip('"').strip("'")
    except FileNotFoundError:
        pass
    return result


_client_id = os.environ.get("JRETRIEVE_CLIENT_ID")
_client_secret = os.environ.get("JRETRIEVE_CLIENT_SECRET")
if not _client_id or not _client_secret:
    _dotenv = _read_dotenv(Path(os.environ.get("JRETRIEVE_CONF_DIR", ".")) / ".env")
    _client_id = _client_id or _dotenv.get("JRETRIEVE_CLIENT_ID")
    _client_secret = _client_secret or _dotenv.get("JRETRIEVE_CLIENT_SECRET")
if not _client_id or not _client_secret:
    raise RuntimeError(
        "jretrieve credentials not found. Set JRETRIEVE_CLIENT_ID and "
        "JRETRIEVE_CLIENT_SECRET in the environment or in a .env file next "
        "to this script."
    )

jretrieve_url = "https://service.meteoswiss.ch/jretrieve/api/v1"
with urllib.request.urlopen(
    urllib.request.Request(
        method="POST",
        url="https://service.meteoswiss.ch/auth/realms/meteoswiss.ch/protocol/openid-connect/token",
        data=b"grant_type=client_credentials",
        headers={
            b"Authorization": b"Basic "
            + base64.b64encode(f"{_client_id}:{_client_secret}".encode())
        },
    )
) as f:
    auth_header = "Bearer " + json.loads(f.read().decode())["access_token"]
44 changes: 44 additions & 0 deletions README.md
@@ -145,6 +145,31 @@ You can then run it with:
evalml experiment path/to/experiment/config.yaml --report
```

### Truth sources

The `truth.root` value selects how the ground truth is loaded:

- **Analysis Zarr** — a path ending in `.zarr` (anemoi analysis dataset).
- **DWH / jretrievedwh** — a `jretrievedwh:` marker string fetching surface observations
(e.g. SMN) live from the MeteoSwiss data warehouse. Variables are mapped to
ICON names in SI units (temperatures in K, pressure in Pa, precipitation as the hourly
sum); wind `U_10M`/`V_10M` are derived from speed + direction.

Marker syntax (station selection is required; pick one of group/locations/bbox):

```yaml
truth:
  label: SwissMetNet
  root: jretrievedwh:1,2  # stn_group_id (default)
  # root: jretrievedwh:locations=ARO,KLO,LUG  # explicit nat_abbr list
  # root: jretrievedwh:bbox=45.8,47.8,5.9,10.5  # minlat,maxlat,minlon,maxlon
```
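
The marker grammar above can be illustrated with a small parser. This is a minimal sketch, not the loader added in this PR: the function name `parse_truth_marker`, the returned dictionary keys, and the handling of a `;stage=` suffix (mentioned only in a comment in `config/varda-single-1.0.yaml`) are assumptions made for illustration.

```python
PREFIX = "jretrievedwh:"


def parse_truth_marker(root: str) -> dict:
    """Split a jretrievedwh marker into a station selector and options (hypothetical helper)."""
    if not root.startswith(PREFIX):
        raise ValueError(f"not a jretrievedwh marker: {root!r}")
    selector, *extras = root[len(PREFIX):].split(";")
    if not selector:
        raise ValueError("station selection is required")

    # Anything after ';' is read as key=value options, e.g. 'stage=devt' (assumed syntax).
    options = dict(extra.split("=", 1) for extra in extras if extra)

    if selector.startswith("locations="):
        kind = "locations"
        values = selector.removeprefix("locations=").split(",")
    elif selector.startswith("bbox="):
        kind = "bbox"
        values = [float(v) for v in selector.removeprefix("bbox=").split(",")]
        if len(values) != 4:
            raise ValueError("bbox needs minlat,maxlat,minlon,maxlon")
    else:
        # A bare comma-separated list is treated as station group ids (the default).
        kind = "stn_group_id"
        values = [int(v) for v in selector.split(",")]
    return {"kind": kind, "values": values, "options": options}


if __name__ == "__main__":
    print(parse_truth_marker("jretrievedwh:1,2"))
    print(parse_truth_marker("jretrievedwh:bbox=45.8,47.8,5.9,10.5;stage=devt"))
```

Returning a plain dictionary keeps the sketch short; a real loader would more likely validate the selector against the station catalog before issuing a query.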

**Prerequisites:** `jretrievedwh.py` must be on `$PATH` (falls back to
`/oprusers/osm/opr.inn/bin/jretrievedwh.py`) and DWH credentials must be
available; see [Credentials setup](#credentials-setup). No data is
pre-downloaded: the observations are queried at verification time.
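
The derivation of `U_10M`/`V_10M` from speed and direction can be sketched as follows, using the usual meteorological convention in which direction is the bearing the wind blows from, in degrees clockwise from north. This is an illustrative sketch only: the function names are hypothetical, and the source units (°C for temperature, hPa for pressure) are assumptions, since the PR does not state what the DWH delivers.

```python
import math


def wind_components(speed: float, direction_deg: float) -> tuple[float, float]:
    """Return (u, v) in the units of `speed` from speed and meteorological direction."""
    rad = math.radians(direction_deg)
    # The direction is where the wind comes FROM, hence the minus signs.
    return -speed * math.sin(rad), -speed * math.cos(rad)


def celsius_to_kelvin(t_c: float) -> float:
    """Convert a temperature from degrees Celsius (assumed source unit) to kelvin."""
    return t_c + 273.15


def hpa_to_pa(p_hpa: float) -> float:
    """Convert a pressure from hPa (assumed source unit) to Pa."""
    return p_hpa * 100.0


if __name__ == "__main__":
    # A westerly wind (from 270 degrees) blows towards the east: u > 0, v close to 0.
    print(wind_components(10.0, 270.0))
```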


## Installation

@@ -175,6 +200,25 @@ valid for 30 days. Every training or evaluation run within this period automatic
extends the token by another 30 days. It’s good practice to run the login command before
executing the workflow to ensure your token is still valid.

### DWH / jretrieve credentials

To use `jretrievedwh:` as a truth source, provide a client ID and secret for
the MeteoSwiss service account. Set them as environment variables:

```bash
export JRETRIEVE_CLIENT_ID=your-client-id
export JRETRIEVE_CLIENT_SECRET=your-client-secret
```

or place them in a `.env` file at the project root (next to `.jretrievedwh-conf.prod.py`):

```
JRETRIEVE_CLIENT_ID=your-client-id
JRETRIEVE_CLIENT_SECRET=your-client-secret
```

The `.env` file is listed in `.gitignore` and is never committed.
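
The accepted `.env` format is deliberately simple. The sketch below mirrors the `_read_dotenv` helper from `.jretrievedwh-conf.prod.py` in this PR and runs it on a temporary file; the name `parse_dotenv` and the sample values are placeholders for illustration.

```python
import tempfile
from pathlib import Path


def parse_dotenv(path):
    """Parse KEY=VALUE lines, skipping blanks and comments (mirrors _read_dotenv)."""
    result = {}
    try:
        with open(path) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                # Surrounding quotes are stripped, so "abc" and abc are equivalent.
                result[key.strip()] = value.strip().strip('"').strip("'")
    except FileNotFoundError:
        pass  # a missing file simply yields no values
    return result


with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / ".env"
    env_file.write_text(
        "# service account\n"
        'JRETRIEVE_CLIENT_ID="your-client-id"\n'
        "JRETRIEVE_CLIENT_SECRET=your-client-secret\n"
    )
    parsed = parse_dotenv(env_file)
    missing = parse_dotenv(Path(tmp) / "does-not-exist")

print(parsed)
print(missing)
```

In the config script itself, environment variables take precedence: the `.env` file is consulted only when one of the two variables is unset, and it is looked up in `$JRETRIEVE_CONF_DIR` or, failing that, the current working directory.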

## Workspace setup

By default, data produced by the workflow will be stored under `output/` in your working directory.
7 changes: 6 additions & 1 deletion config/varda-single-1.0.yaml
@@ -37,7 +37,12 @@ runs:

 truth:
   label: SwissMetNet
-  root: output/data/observations/peakweather
+  root: jretrievedwh:1,2
+  # To verify against SwissMetNet observations from the DWH via jretrievedwh,
+  # set instead (requires jretrievedwh.py on $PATH and $OPR_HOME set):
+  # Other selectors: root: jretrievedwh:locations=ARO,KLO,LUG
+  # root: jretrievedwh:bbox=45.8,47.8,5.9,10.5
+  # append ;stage=devt to target a non-prod DWH stage

 experiment:
   params:
2 changes: 0 additions & 2 deletions pyproject.toml
@@ -22,7 +22,6 @@ dependencies = [
     "pyproj>=3.7.2",
     "marimo>=0.23.3",
     "geopandas>=0.14.0",
-    "peakweather",
     "pyzmq>=27.1.0",
     "scores>=2.0.0",
     "eccodes>=2.44,<2.48",
@@ -60,5 +59,4 @@ packages = [
 ]

 [tool.uv.sources]
-peakweather = { git = "https://github.com/MeteoSwiss/PeakWeather.git" }
 eccodes-cosmo-resources-python = { git = "https://github.com/MeteoSwiss/eccodes-cosmo-resources-python" }