CF coordinate support for array datasets (and more) #6

@jkeifer

The concept of a dataset for array data is, essentially, a group of arrays sharing a common grid. Affine grids are easy to support here, assuming the affine transform is embedded in the native metadata. But datasets using grids defined by CF conventions are a bit of a challenge.
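To make the contrast concrete, here is a minimal sketch of why the two grid types differ. It assumes a GDAL-style six-element geotransform for the affine case and plain 1-D latitude/longitude lists for the CF case. The function names are hypothetical and not part of any proposed API.

```python
# Hypothetical illustration only; none of these names come from an existing API.


def affine_cell_center(transform, row, col):
    """Compute (x, y) of a cell centre from a GDAL-style geotransform.

    transform = (x_origin, pixel_width, row_rotation,
                 y_origin, col_rotation, pixel_height)
    Six numbers in the metadata are enough to locate every cell.
    """
    x0, dx, rx, y0, ry, dy = transform
    # Shift by half a cell to get the centre instead of the upper-left corner.
    c, r = col + 0.5, row + 0.5
    return (x0 + c * dx + r * rx, y0 + c * ry + r * dy)


def cf_cell_center(lat, lon, row, col):
    """Look up (x, y) from explicit 1-D CF-style coordinate arrays.

    Nothing can be computed from metadata alone: the coordinate
    values themselves have to be fetched as array data.
    """
    return (lon[col], lat[row])
```

In the affine case the grid travels with the metadata. In the CF case the coordinates are arrays in their own right, which is why they need to be served somehow.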

Thinking more about the idea of CF coordinate arrays and how that works in Zarr, I'm wondering if this is a bigger issue, where the idea of a dataset as a set of arrays sharing a common grid is actually limiting. Is a dataset actually like a group? This feels like a bad direction to me because then we end up with this hierarchy that we have to model. Maybe this could be like static STAC catalogs vs. dynamic (STAC API) catalogs, where in the dynamic case the hierarchy is limited?

Does this somewhat equate a dataset with a collection and an array with an item? How does this work with tabular data? Does this actually make sense because you might want to group tables together into a collection by some common relation/use-case?

I get really worried that this API will become a general organizational paradigm when I start thinking like this. The scope should be intentionally constrained to keep the focus on serving chunk bytes, not on facilitating search or organization...

Anyway, it does seem like coordinate arrays need to be accessible via a separate endpoint in some way. Maybe this is by making them datasets, and linking to them from the dataset(s) to which they apply. The great thing about this idea is that different datasets that share a common grid can link to the same grid. Of course, datasets with a common grid could simply be merged, but that may not always be possible: say you want to use a grid defined by a dataset you do not control. Maybe common grids could be served from a reference location, so they don't have to be duplicated for every single dataset that uses them?
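As a rough sketch of the linking idea, a grid could be its own dataset that other datasets reference through a link. Everything below is an assumption made for illustration: the `links` field, the `"grid"` relation name, the `/datasets/...` paths, and the dataset and array ids are all hypothetical.

```python
# Hypothetical sketch: every id, href, and field name here is an assumption.

# A grid published as its own dataset, holding only coordinate arrays.
SHARED_GRID = {
    "id": "shared-grid",
    "arrays": ["latitude", "longitude"],
    "links": [],
}

# Two data-bearing datasets that point at the same grid rather than
# each carrying its own copy of the coordinate arrays.
DATASETS = [
    {
        "id": "temperature",
        "arrays": ["t2m"],
        "links": [{"rel": "grid", "href": "/datasets/shared-grid"}],
    },
    {
        "id": "precipitation",
        "arrays": ["tp"],
        "links": [{"rel": "grid", "href": "/datasets/shared-grid"}],
    },
]


def grid_href(dataset):
    """Return the href of the dataset's grid link, or None if it has none."""
    for link in dataset.get("links", []):
        if link.get("rel") == "grid":
            return link.get("href")
    return None


def group_by_grid(datasets):
    """Map each grid href to the ids of the datasets that link to it."""
    groups = {}
    for dataset in datasets:
        href = grid_href(dataset)
        if href is not None:
            groups.setdefault(href, []).append(dataset["id"])
    return groups
```

Keeping the reference as a flat link rather than nesting the grid inside each dataset avoids modelling a hierarchy, and the same href could just as well point at a grid hosted by someone else.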
