Generate per-pixel satellite embeddings at scale. Ports the HPC-based Tessera embedding pipeline to a cloud-native, distributed architecture that runs on any major cloud — or on a laptop (slowly).
A Python library for:
- Ingesting Sentinel-1, Sentinel-2, and Landsat data from open STAC catalogs into chunked Zarr stores.
- Cloud-masking and transforming the data with scientifically validated pipelines.
- Generating 128-dimensional Tessera embeddings via distributed GPU inference with Ray.
- Coarsening and assembling the output into analysis-ready stores at configurable resolution.
The domain code — the scientific transformations, the inference
engine, the Zarr I/O — is cloud-agnostic and orchestrator-agnostic.
It's plain Python over xarray, dask, zarr, ray, and fsspec.
Runs on one laptop or a thousand GPUs.
Alongside the library we ship reference orchestration: opinionated Prefect flows and AWS provisioning helpers that demonstrate how we run this at production scale. They are examples, not requirements.
- Not a universal orchestration framework. Prefect is the
recommended and only core-maintained orchestrator. If you use
Airflow, Dagster, Flyte, or Argo, the domain layer is a drop-in
library — you'll rewrite the thin flow layer in your orchestrator's
idiom. Community-maintained adapters for other orchestrators are
welcome (see Contributing); we review them for fit and correctness
but don't commit to maintaining them. See
docs/orchestrator-swap.md for a worked example.
- Not a multi-cloud abstraction. AWS is the fully maintained
reference cloud. Other clouds (GCP, Azure, Kubernetes) are
supported by forking the provider templates:
src/tessera_embeddings/providers/aws/ray.py and providers/aws/dask.py
are explicit AWS glue you can use as a reference implementation, not an
abstraction. See src/tessera_embeddings/providers/README.md.
- Not infrastructure-as-code. We ship Ray cluster YAML templates and Python provisioning helpers, not Terraform or CDK. You bring your own IaC to create VPCs, security groups, and IAM.
- Not a plugin system. Providers aren't discovered via
entry_points; you import the one you want.
- Not a framework. No base classes to inherit, no interfaces to implement. Flows are reference compositions; the domain layer is functions you call.
tessera_embeddings is an inference library. The base install is the
ingestion pipeline (Sentinel-2/S1 data preparation, Zarr store management —
no torch, no Ray). Add [inference] for the Tessera embedding model and
distributed execution — that is what this library is for. The split is
practical: torch is large and CUDA variants are platform-specific.
# Typical install — ingestion pipeline + Tessera inference
pip install tessera_embeddings[inference]
# Full production stack — inference + Prefect orchestration + AWS:
pip install tessera_embeddings[inference,prefect,aws]
# GPU (CUDA 12.1) — install torch first so pip keeps the CUDA wheel:
pip install "torch==2.6.0+cu121" --index-url https://download.pytorch.org/whl/cu121
pip install "tessera_embeddings[inference]"
For contributors:
git clone https://github.com/dClimate/tessera-embeddings
cd tessera-embeddings
uv sync --all-extras # resolves uv.lock; all extras + dev tools
uv.lock at repo root is the single lock file. See
docs/environment-setup.md for CUDA GPU
installs and platform guidance.
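Because torch and Ray only arrive with the [inference] extra, code that may run on the base install can check for them before touching inference modules. The following is a minimal sketch using only the standard library; the helper name `has_inference_extra` and its default module list are assumptions made for illustration and are not part of tessera_embeddings.

```python
from importlib.util import find_spec


def has_inference_extra(modules=("torch", "ray")):
    """Return True if every listed module can be located for import.

    The default list is an assumption based on the install notes above
    (the [inference] extra adds torch and Ray).
    """
    return all(find_spec(name) is not None for name in modules)


if __name__ == "__main__":
    if has_inference_extra():
        print("inference extra available")
    else:
        print("base install only; try: pip install 'tessera_embeddings[inference]'")
```

`find_spec` only locates a module without importing it, so the check stays cheap even when torch is present.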
git clone https://github.com/dClimate/tessera-embeddings
cd tessera-embeddings
uv sync --all-extras # resolves uv.lock; all extras + dev tools
# End-to-end pipeline on the bundled Story-County, IA quickstart ROI.
# Ingest → cloud mask → CPU inference → assemble. Expect ~30+ minutes;
# a warning banner confirms CPU inference is slow before kicking off.
python -m tessera_embeddings.orchestration.runners.plain examples/quickstart/config.yaml
# Skip inference for fast ingest-only sanity checks (~5 min).
python -m tessera_embeddings.orchestration.runners.plain \
  examples/quickstart/config.yaml --skip-inference
The default mode runs the full chain: inference and assembly are
coupled, so end-to-end is the primary demo. --skip-inference is the
fast path for contributors iterating on ingest changes without waiting
for CPU torch. Production inference always runs on GPU. See
docs/quickstart.md for prerequisites
(Earthdata Login credentials for OPERA; the model checkpoint is
pulled from HuggingFace automatically).
Two supported paths:
- Prefect + AWS (reference): Flows in
src/tessera_embeddings/orchestration/prefect/flows/ run against providers/aws/ray.py + providers/aws/dask.py. See docs/providers/aws.md for AWS provisioning.
- Your orchestrator + your cloud: Reuse the domain layer; port
the flow layer to your orchestrator; fork the provider templates
for your cloud. See
docs/orchestrator-swap.md and docs/providers/adding-your-own.md.
Three strict layers:
┌─────────────────────────────────────────────────────────────┐
│ Layer 3: Prefect flows │
│ orchestration/prefect/flows/ │
│ Reference orchestration. Swap this directory for yours. │
└────────────────────────────┬────────────────────────────────┘
│
┌────────────────────────────▼────────────────────────────────┐
│ Layer 2: Thin @task wrappers │
│ orchestration/prefect/tasks/ │
│ Prefect-specific retry, caching, logger bridge. │
└────────────────────────────┬────────────────────────────────┘
│
┌────────────────────────────▼────────────────────────────────┐
│ Layer 1: Domain (ingest/, inference/, storage/, config/) │
│ Plain Python. No Prefect. No AWS-specific code. │
│ Uses Ray for GPU parallelism, Dask for CPU scale-out. │
└─────────────────────────────────────────────────────────────┘
Prefect is 100% quarantined under orchestration/prefect/.
orchestration/runners/ is the Prefect-free peer.
Per-cloud provisioning lives separately:
┌─────────────────────────────────────────────────────────────┐
│ Providers (providers/aws/, providers/local/, …) │
│ providers/aws/ contains ray.py and dask.py. │
│ AWS is fully maintained; local is for demo/tests. │
└─────────────────────────────────────────────────────────────┘
Six hard rules enforced in CI:
- No import prefect outside the flow layer.
- Stdlib logging in the domain layer, not get_run_logger().
- Config is pydantic, not a Prefect Block (Blocks load into pydantic at flow entry).
- Storage is fsspec, not orchestrator-specific filesystem abstractions.
- Secrets enter at flow entry and travel as plain values.
- Dask/Ray clients are passed in, never summoned below the flow layer.
If those rules hold, you can rewrite the flow layer for any orchestrator without touching the domain.
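Two of these rules, stdlib logging and passed-in clients, are easiest to picture in code. The sketch below uses a standard-library executor as a stand-in for a Dask or Ray client; every name in it (`cloud_mask_tile`, `run_tiles`, `entry_point`, the tile dictionaries) is hypothetical and chosen only for illustration, not taken from the library.

```python
import logging
from concurrent.futures import Executor, ThreadPoolExecutor

# Stdlib logging in the domain layer; no orchestrator-specific logger.
logger = logging.getLogger("domain.example")


def cloud_mask_tile(tile: dict) -> dict:
    """Hypothetical domain function: drop values flagged as cloudy."""
    clear = [v for v, cloudy in zip(tile["values"], tile["cloud"]) if not cloudy]
    logger.info("tile %s: kept %d of %d values", tile["id"], len(clear), len(tile["values"]))
    return {"id": tile["id"], "values": clear}


def run_tiles(tiles: list, executor: Executor) -> list:
    """Domain-level fan-out. The executor is passed in, never created here."""
    return list(executor.map(cloud_mask_tile, tiles))


def entry_point(tiles: list) -> list:
    """Stand-in for the flow layer: owns the executor and hands it down."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        return run_tiles(tiles, pool)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    demo = [{"id": 0, "values": [1, 2, 3], "cloud": [False, True, False]}]
    print(entry_point(demo))
```

Swapping orchestrators in this shape means rewriting only `entry_point`; `run_tiles` and `cloud_mask_tile` never learn who created the executor.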
A subtle reality of distributed array workloads: the task graph your scheduler has to plan grows quadratically as you shrink the chunk edge, because halving the edge of a 2-D chunk quadruples the number of chunks. Chunks that are too small leave the scheduler spending more time managing tasks than the tasks spend doing work. Chunks that are too large don't fit in worker memory.
ROI: 20 km × 20 km, S2 reflectance, 10 m resolution, 12 dates:
chunks=200×200 (10× too small) chunks=2000×2000 (the right size)
───────────────────────────── ───────────────────────────────
□□□□□□□□□□ □□□□□□□□□□ □□□□□ ┌────────┐
□□□□□□□□□□ □□□□□□□□□□ □□□□□ │ │
□□□□□□□□□□ □□□□□□□□□□ □□□□□ │ ████ │ ← 1 chunk
□□□□□□□□□□ □□□□□□□□□□ □□□□□ │ ████ │
… 10 000 graph nodes … └────────┘
12 nodes
graph build: ~30 s graph build: <1 s
scheduler RAM: ~1 GB scheduler RAM: <50 MB
overhead: 95% of wall-clock overhead: <5%
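The chunk arithmetic behind the comparison can be reproduced in a few lines. This sketch counts only storage chunks, assuming one chunk per date and spatial tile; a real Dask graph usually holds several tasks per chunk, so treat the numbers as lower bounds on graph size. The function name `chunk_count` is hypothetical.

```python
import math


def chunk_count(side_m: float, resolution_m: float, chunk_px: int, n_dates: int) -> int:
    """Number of (date, y, x) chunks for a square ROI, one chunk per date and tile."""
    side_px = math.ceil(side_m / resolution_m)   # 20 km at 10 m -> 2000 px
    per_axis = math.ceil(side_px / chunk_px)     # tiles along each spatial axis
    return per_axis * per_axis * n_dates


small = chunk_count(20_000, 10, 200, 12)
large = chunk_count(20_000, 10, 2000, 12)
print(small, large, small // large)  # 1200 12 100
```

A 10× smaller chunk edge yields 100× more chunks, which is the quadratic growth described above.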
Storage and read granularity are tuned separately. Ingest writes
INGEST_CHUNK_SIZE = 4000 storage chunks to keep the satellite-ingest
Dask graph small (¼ the spatial tasks), while inference reads
INFERENCE_CHUNK_SIZE = 2000 sub-tiles out of them — small enough to keep
peak GPU-node RAM in check. Zarr's oindex reads the 2000 sub-tile out
of a 4000 chunk without any alignment requirement. Go smaller on the
read size and the Dask scheduler hangs on graph construction; larger and
you OOM on a g5.2xlarge. If you change either, profile.
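To make the read/storage split concrete, the sketch below enumerates read windows over an array and counts how many storage chunks each window overlaps. It is pure Python with hypothetical helper names; the 2000 and 4000 defaults mirror the values quoted above. The resulting slices are the kind of selection one could hand to a Zarr array's orthogonal indexer (`arr.oindex[rows, cols]`), though how the library actually issues its reads is not shown here.

```python
def read_windows(height: int, width: int, tile: int = 2000):
    """Yield (row_slice, col_slice) windows of at most `tile` pixels per side."""
    for r0 in range(0, height, tile):
        for c0 in range(0, width, tile):
            yield slice(r0, min(r0 + tile, height)), slice(c0, min(c0 + tile, width))


def storage_chunks_touched(window, chunk: int = 4000) -> int:
    """Count the storage chunks that a read window overlaps."""
    rows, cols = window
    n_rows = (rows.stop - 1) // chunk - rows.start // chunk + 1
    n_cols = (cols.stop - 1) // chunk - cols.start // chunk + 1
    return n_rows * n_cols


windows = list(read_windows(8000, 8000))
print(len(windows))                                     # 16 windows over 8000 x 8000
print(max(storage_chunks_touched(w) for w in windows))  # 1: 2000 divides 4000 evenly
```

With a tile size that does not divide the storage chunk (say 3000), the reads still work but some windows straddle up to four chunks, so each read decompresses more data than it returns.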
The hard-rule checks ship as a reusable module so downstream consumers (closed-source forks, community adapter contributors) can apply the same contract to their own code:
# Run against any source tree
uv run python -m tessera_embeddings.architecture_tests \
--source path/to/your_package/ \
  --allowlist your-arch-allowlist.toml
The allowlist file (TOML) documents intentional deviations (e.g.
"Prefect imports in my own orchestration/prefect/ are expected").
See
src/tessera_embeddings/architecture_tests/
for the rule definitions, allowlist schema, and worked examples.
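For a rough idea of what an import-level rule can look like, here is a small AST-based scan for Prefect imports. It is an illustrative sketch, not the shipped architecture_tests implementation: the function names are hypothetical, and the directory-substring exemption stands in for the real allowlist, whose schema is documented in the module itself.

```python
import ast
from pathlib import Path


def find_forbidden_imports(source: str, forbidden: str = "prefect") -> list:
    """Return line numbers of statements importing `forbidden` or a submodule."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) cannot name the top-level package.
            names = [node.module] if node.module and node.level == 0 else []
        else:
            continue
        if any(n == forbidden or n.startswith(forbidden + ".") for n in names):
            hits.append(node.lineno)
    return sorted(hits)


def scan_tree(root: Path, allowed_dir: str = "orchestration/prefect") -> dict:
    """Map each offending file under `root` to its offending line numbers.

    Files whose POSIX path contains `allowed_dir` are skipped (a crude allowlist).
    """
    violations = {}
    for path in sorted(root.rglob("*.py")):
        if allowed_dir in path.as_posix():
            continue
        hits = find_forbidden_imports(path.read_text(encoding="utf-8"))
        if hits:
            violations[str(path)] = hits
    return violations
```

Because the scan parses source rather than importing it, it runs without Prefect, torch, or Ray installed.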
This library follows semver for the documented public API surface.
Anything outside it — underscore-prefixed names, modules whose names
start with _, anything under tessera_embeddings.orchestration.prefect.* —
is implementation detail and may change between minor releases. The
full public-API surface is listed in
docs/public-api.md. External code should
depend only on items listed there.
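The exclusion rules above are mechanical enough to check in code. The sketch below flags dotted names that fall under them; the function name and the example module paths are hypothetical. Note that passing this check does not make a name public: docs/public-api.md remains the authoritative list.

```python
PREFECT_PKG = "tessera_embeddings.orchestration.prefect"


def is_implementation_detail(dotted_name: str) -> bool:
    """True if the name matches one of the stated exclusions.

    Covers underscore-prefixed names or modules, and anything under the
    Prefect orchestration package.
    """
    if dotted_name == PREFECT_PKG or dotted_name.startswith(PREFECT_PKG + "."):
        return True
    return any(part.startswith("_") for part in dotted_name.split("."))


# Hypothetical names, used only to exercise the rule.
print(is_implementation_detail("tessera_embeddings.storage._manifest"))   # True
print(is_implementation_detail("tessera_embeddings.storage.open_store"))  # False
```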
src/tessera_embeddings/orchestration/runners/plain.py
is an orchestrator-free sequencer that calls the same domain
functions as the Prefect flows, without Prefect. By default it runs
the full end-to-end pipeline (ingest → cloud mask → inference →
assembly) on a laptop with torch on CPU via Ray's local mode. Slow on
real workloads, practical on the Story-County quickstart ROI we ship
for exactly this purpose.
A --skip-inference flag runs only ingest for fast sanity checks;
assembly is skipped because it has nothing to assemble without
embeddings.
Why end-to-end on CPU is the credibility bar we chose:
- Assembly depends on inference outputs — "ingest-only" is a convenience path for contributors, not a meaningful full-stack demo.
- If CPU torch works without modification, no GPU-specific coupling has leaked into the domain layer. That's the strongest architectural separation check we can make without deploying to multiple cloud targets.
- plain.py is the reference for users porting to Airflow/Dagster/Flyte: everything it does is the non-Prefect wiring they'll need to reproduce.
For CI: plain.py --skip-inference is the fast PR check (minutes).
The end-to-end run on the quickstart ROI runs as a nightly or
opt-in job (too slow for every PR). Fast PR checks also use
AST-based architecture rules (§Architecture) to catch Prefect leaks
at the import level without running the pipeline.
src/tessera_embeddings/
config/ pydantic config models
ingest/ STAC ingestion, ROI rasterization, auth
inference/ GPU inference (Ray actors, work-stealing scheduler)
storage/ Zarr stores, manifests
orchestration/
concurrency.py sliding_window_submit — shared by flows + runners
prefect/ Prefect — 100% quarantined here
flows/ @flow-decorated orchestration (Layer 3)
tasks/ thin @task wrappers (Layer 2)
runners/ non-Prefect entry points (plain.py)
providers/ concrete cloud-provisioning glue
aws/ ray.py, dask.py, gotchas.md
local/ ray.py, dask.py
architecture_tests/ reusable layer-rule checker (CLI + Python API)
- docs/quickstart.md: laptop demo end-to-end, including GPU inference.
- docs/environment-setup.md: lock files, CUDA variants, uv setup.
- docs/configuration.md: the pydantic config tree.
- docs/prefect-setup.md: standing up your own Prefect server (work pool shape, Blocks used, deployment examples, common gotchas). We don't ship IaC for the server itself; this doc tells you what to build.
- docs/providers/aws.md: running on AWS with Prefect.
- docs/providers/adding-your-own.md: porting to GCP, Azure, k8s.
- docs/orchestrator-swap.md: running without Prefect.
- docs/public-api.md: the documented public API surface covered by semver.
- src/tessera_embeddings/providers/aws/gotchas.md: operational knowledge for Ray clusters (head sizing, autoscaler, spot, AMI bake, teardown safety nets).
- context_docs/: design decisions, framing, rationale.
This library has a known production downstream consumer:
yield_modeling, a private repo that imports this library, supplies
AWS infrastructure, and runs production workloads. The OSS CI
includes a fast smoke test against yield_modeling that, once
activated, runs on every PR and catches accidental breaking changes
at the point of change instead of in production.
The smoke-test workflow lives at
.github/workflows/downstream-smoke.yml. It is initially disabled
(only workflow_dispatch enabled, no pull_request trigger).
Activation criteria:
- yield_modeling has its first internal release.
- A read-only GitHub token (YIELD_MODELING_READ_TOKEN) is configured as a repo secret.
- yield_modeling/main reliably has a green test suite.
Once active, the smoke test runs yield_modeling's
pytest tests/unit tests/architecture against the OSS PR's SHA.
Failure is informational, not blocking — it gives the OSS PR
author a heads-up about downstream impact. We never make this a
required status check; that would give a private repo veto power
over public releases.
Other downstreams (community adapters, external production users) can wire up the same pattern against their own forks. See the smoke-test workflow file for the template.
We accept:
- Bug fixes and improvements to the domain layer.
- Documentation and examples.
- Additional reference provider implementations (new clouds, new
substrates). Ship them as concrete code under
providers/<your-target>/, not as abstractions. See docs/providers/adding-your-own.md.
- Community-maintained orchestrator adapters (Airflow, Dagster,
Flyte, Argo, …). These are welcome but are not core-maintained.
Requirements for acceptance:
  - Explicit maintenance commitment from the contributor,
    named in the adapter's own README. If the named maintainer
    goes silent and the adapter falls into disrepair, it will be
    moved to an archived/ directory with a deprecation notice: not
    deleted, but clearly labeled as unmaintained.
  - Parity test against runners/plain.py on the bundled quickstart ROI, in CI. Your adapter's flow must produce identical output to the plain runner for the same inputs. See tests/parity/adapter_template/ for the starter template.
  - Parity doc: a short markdown file listing which features map cleanly from our Prefect reference, which have idiomatic equivalents in the new stack, and which have no analog.
  - Clear labeling: the adapter's README and module docstring both state "community-maintained, not core-supported." Core maintainers will review for correctness and fit, but won't debug adapter-specific issues or unblock adapter-only breakages.
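One simple way to express the "identical output" parity requirement is a byte-level comparison of two output directories. The sketch below is an assumption-laden illustration, not the template in tests/parity/adapter_template/: it assumes both runs write to local directories with deterministic compression and metadata, and the helper names are hypothetical.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> dict:
    """Map each file's path (relative to root) to the SHA-256 of its bytes."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_trees(reference: Path, candidate: Path) -> list:
    """Return relative paths that are missing, extra, or differ in content."""
    ref, cand = tree_digest(reference), tree_digest(candidate)
    return sorted(k for k in ref.keys() | cand.keys() if ref.get(k) != cand.get(k))
```

An empty list from `diff_trees(plain_output, adapter_output)` means the two stores match byte for byte. A byte comparison is stricter than comparing decoded arrays, so a failure may be worth re-checking at the array level before concluding the adapter is wrong.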
We don't accept:
- Abstract Runner/Orchestrator/Provider interfaces. The architecture deliberately avoids them. See context_docs/decisions/ for the reasoning.
Apache-2.0. See LICENSE.
Ports the Tessera pipeline to a cloud-native architecture. Built at Cyclops.