
Unify input output #117

Open
jo-mueller wants to merge 32 commits into ome:main from jo-mueller:unify-input-output

Conversation

@jo-mueller
Contributor

@jo-mueller jo-mueller commented Mar 26, 2026

Fixes ome/ngff#480
Fixes ome/ngff#437
Relevant for ome/ngff#360

Description

In previous versions of the spec, we used a mix of conventions for specifying the inputs and outputs of coordinate transformations:

  • In the scene context, input and output had to be an object with fields name and path.
  • For the multiscale transformations inside the multiscales metadata (multiscales > datasets > coordinateTransformations), only a string was allowed, which had to correspond to the path of the respective dataset entry.
  • For the additional transforms next to the multiscales metadata (multiscales > coordinateTransformations), a string was required, which had to correspond to the name of a coordinate system in the same metadata document.

Following the original suggestion of @dstansby at ome/ngff#437, this PR unifies these into a common syntax. It also adjusts the schemas, examples, and CI tests accordingly.
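For reference, a dataset-level transformation under the unified syntax might then look like the following sketch (the path `"0"` and the coordinate-system name `"intrinsic"` are illustrative, not normative):

```json
{
  "coordinateTransformations": [
    {
      "type": "scale",
      "scale": [0.5, 0.5, 0.5],
      "input": { "path": "0" },
      "output": { "name": "intrinsic" }
    }
  ]
}
```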

cc @will-moore @dstansby @clbarnes @bogovicj @lorenzocerrone @jluethi @m-albert @thewtex

@jo-mueller jo-mueller added the enhancement New feature or request label Mar 26, 2026
@github-actions

Automated Review URLs

@jo-mueller jo-mueller marked this pull request as ready for review March 26, 2026 16:31
@jo-mueller jo-mueller changed the title WIP: Unify input output Unify input output Mar 26, 2026
Contributor

@dstansby dstansby left a comment


Some comments, but 👍 this looks great overall

jo-mueller and others added 9 commits March 27, 2026 17:26
Co-authored-by: David Stansby <dstansby@gmail.com>
@jo-mueller
Contributor Author

Regarding a discussion over at ome/ngff#339, I think it may be better to turn the requirement of the containing multiscales having to declare a coordinate transform to a contained label image into a SHOULD or even a MAY. Otherwise, doing the following thing essentially invalidates the parent multiscales image:

  1. You write a multiscales image IMG, complete with all metadata
  2. You run a segmentation workflow and store the result as another multiscales under image/labels/some_segmentation
  3. Under the current rules, if you didn't write a coordinate transform into IMG's metadata, IMG would now be invalid, which I think would be highly problematic.

Maybe the following would be better:

  • Requiring that multiscales under labels/wherever MUST only have one coordinate system.
  • If no coordinate transformation is written under IMG, it is implicitly assumed that the native coordinate system of IMG and the label image under labels/wherever are linked by an identity transform.
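In the explicit case, the optional transform written into IMG's metadata could then look something like this sketch (the transform type and label path are illustrative):

```json
{
  "type": "identity",
  "input": { "name": "intrinsic" },
  "output": { "name": "intrinsic", "path": "labels/some_segmentation" }
}
```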

index.md Outdated
| Context | `input` | `output` |
|---------|---------|----------|
| **multiscales > datasets** | `{ "path": "<dataset_path>" }` | `{ "name": "intrinsic" }`|
| **multiscales > coordinateTransformations** | `{ "name": "intrinsic" }` | `{ "name": "output" }` <br> or <br> `{ "name": "intrinsic", "path": "labels/labels_path" }` |
Member


I would expect any labels to be input rather than output, as the labels would be the lowest-level "leaves" of the tree whose apex is the "scene" at the top, with the multiscale image's "intrinsic" coordinateSystem in the middle.

Contributor Author


Can do, but that means that we'd need to weaken the requirement in the multiscales section further down, where it says:

If applications require additional transformations,
each multiscales object MAY contain the field coordinateTransformations,
describing transformations that are applied to all resolution levels in the same manner.
The values of both input and output fields MUST be an object with fields name and path that satisfy:

  • The value of input MUST be the "intrinsic" coordinate system, referenced by name.
    The path field of input SHOULD be omitted.

The correct replacement of that statement would then be:

The value of either input or output MUST be the "intrinsic" coordinate system, referenced by name. The respective path field SHOULD be omitted.

Member


I think that the multiscales requirements do need to be relaxed, both to allow labels as inputs, but also because you might have intrinsic -transformed-to-> deskewed -transformed-to-> rotated; then the rotation transform would not have intrinsic as its input. We'd want to allow that, right?

I'm still not clear about the "intrinsic" coordinateSystem rules. Is this a term that refers to the coordinate system that behaves as the intrinsic system (all datasets output to intrinsic), or is it the case that every multiscales image MUST have a coordinateSystem that has "name": "intrinsic"?
And should viewers always attempt to show the "intrinsic" coordinateSystem? If I have e.g. the intrinsic -transformed-to-> deskewed then I may/probably want any viewer to show the deskewed coordinateSystem?

Contributor Author


you might have intrinsic -transformed-to-> deskewed -transformed-to-> rotated

Nope, currently not allowed. All transforms under multiscales -> coordinateTransformations must be linked to the same coordinate system (the "intrinsic" coordinate system) to limit graph complexity. So to do what you are describing, you would have to choose a sequence of intrinsic -(affine + rotate)-> rotated.
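A rough sketch of what such a single sequence transform could look like, assuming sequence/affine/rotation transformation types as discussed in the coordinate transformations proposal (field names and matrix values are illustrative, not normative):

```json
{
  "type": "sequence",
  "transformations": [
    { "type": "affine", "affine": [[1.0, 0.7, 0.0], [0.0, 1.0, 0.0]] },
    { "type": "rotation", "rotation": [0.0, -1.0, 1.0, 0.0] }
  ],
  "input": { "name": "intrinsic" },
  "output": { "name": "rotated" }
}
```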

About the "intrinsic" CS, I'll add a clarifying remark further up. Essentially: the "intrinsic"/"native"/"physical" coordinate system is the one that all multiscale transforms output to.

Member


OK, understood. This all seems consistent now. Just unsure about "[the intrinsic coordinate system] should be used for viewing and processing unless a use case dictates otherwise".
If I'm implementing a viewer, how do I know whether to show the "intrinsic" coordinateSystem or some other (when just viewing the image, rather than the whole "scene")?

Contributor Author


Yeah, this definitely needs a clarifying remark further up as to what the "intrinsic" coordinate system denotes.

Member


Yep - I was reminded that Davis asked about that too at #118 (comment)



Yeah, this definitely needs a clarifying remark further up as to what the "intrinsic" coordinate system denotes.

Hey :) I was thinking about that...

I'd agree with Davis' #118 (comment) that defining the intrinsic coordinate system as the "native physical coordinate system" is a bit misleading. I think the intention behind "It should be used for viewing" is important, but in my view it is already captured by the description of the transformations which have the intrinsic coordinate system as output (in the datasets objects):

In these cases, the scale transformation specifies the pixel size in physical units or time duration.

Probably that's all an implementation could know for sure about physical coordinates.

With this and the discussion in #118 (comment) in mind, how about making the definition of the intrinsic coordinate system more descriptive:

To both initialize the coordinates of a multiscale image and to define the relative scaling factors between resolution levels, multiscale images have a special coordinate system, the "intrinsic" coordinate system. It is the coordinate system that serves as the common output coordinate system for all transformations specified for the objects in the datasets field of a multiscale object.

@will-moore
Member

I think we need to review some of the existing labels statements, since we now MUST refer to labels with the input/output of a transform in the parent image.
E.g.
In the tree layout

# The labels group is a container which holds an array of labels to make the objects easily discoverable
# All labels will be listed in zarr.json e.g. { "labels": [ "original/0" ] }
# Multiscale, labeled image. The name is unimportant but is registered in the "labels" group above.

Although the layout rules are unchanged, these statements are incomplete: the layout is no longer the only way to discover labels, and labels are not only listed in labels/zarr.json.

Also: "Within the multiscales object, the JSON array associated with the datasets key MUST have the same number of entries (scale levels) as the original unlabeled image".

This came up at https://forum.image.sc/t/ome-zarrpari-an-ome-zarr-napari-widget/119772/9 as being overly strict. Especially if we now allow scale/translation to map labels to the parent image, this seems outdated.

@jo-mueller
Contributor Author

I think we need to review some of the existing labels statements, since we now MUST refer to labels with the input/output of a transform in the parent image.

Yes, that's clearly too strict 🙈

@will-moore
Member

will-moore commented Apr 1, 2026

I just thought of another issue with specifying identity, scale, translation transforms between labels and image coordinateSystems: All of those transforms will preserve all axes but I think most people assume that labels won't have a "channel" axis? There's been a proposal (somewhere) to say that labels shouldn't have a channel axis.

The spec refers to a label image as "(usually having the same dimensions and coordinate transformations)" as the parent image. But it doesn't say that it MUST have the same axes.

# Each dimension of the label should be either the same as the
# corresponding dimension of the image, or `1` if that dimension of the label
# is irrelevant.

So I think we need to clarify whether label images can omit the Channel axis. If label images don't have a channel axis, then how do we handle that with a transform that goes from coordinateSystem

labels (no channel)

```json
{
    "name" : "my_label",
    "axes" : [
        {"name": "z", "type": "space", "unit": "micrometer"},
        {"name": "y", "type": "space", "unit": "micrometer"},
        {"name": "x", "type": "space", "unit": "micrometer"}
    ]
}
```

to image (with channel)

```json
{
    "name" : "intrinsic",
    "axes" : [
        {"name": "c", "type": "channel"},
        {"name": "z", "type": "space", "unit": "micrometer"},
        {"name": "y", "type": "space", "unit": "micrometer"},
        {"name": "x", "type": "space", "unit": "micrometer"}
    ]
}
```

@mkitti
Member

mkitti commented Apr 1, 2026

If I had two arrays with different or missing dimensions I would expect some sort of broadcasting to apply:
https://numpy.org/doc/stable/user/basics.broadcasting.html
https://blog.glcs.io/broadcasting
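
As a minimal illustration of how trailing-axis broadcasting would pair a channel-free label with a channel-bearing image (the shapes are made up for the example):

```python
import numpy as np

# Hypothetical shapes: the image carries a channel axis, the label does not.
image = np.zeros((2, 4, 5, 6))                  # (c, z, y, x)
label = np.arange(4 * 5 * 6).reshape(4, 5, 6)   # (z, y, x)

# NumPy aligns axes from the right, so the 3D label is implicitly
# repeated along the channel axis when combined with the 4D image.
combined = image + label
print(combined.shape)  # (2, 4, 5, 6)
```

The same right-alignment rule is what an implementation could apply when interpreting a spatial-only label against a (c, z, y, x) image: the label applies identically to every channel.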

@m-albert

m-albert commented Apr 2, 2026

It might not be at the core of this PR, but it's not entirely clear to me from the spec text whether the "intrinsic coordinate system" needs to actually have the name "intrinsic"? Or can it be any other name and we just call this specific coordinate system the intrinsic coordinate system? In case I didn't miss it, it could make sense to clarify that.

@jo-mueller
Contributor Author

jo-mueller commented Apr 2, 2026

@m-albert @will-moore @d-v-b @jluethi Thanks for the feedback. I have made some changes that I hope address the points raised above. Let me summarize the key changes with respect to the last version you reviewed further up:

weaken requirements for transforms to labels

Previously, if a multiscale "owned" one or more label images under image/labels/label_imageA, it was also required to specify a valid transform into the label image's coordinate system (whichever that may be). That was clearly too strict, because retroactively adding a label image to a multiscales group would then have invalidated the group. It is now a MAY requirement; if no coordinate transformation to the labels is specified, then it is assumed that the "intrinsic" coordinate systems of the two images are linked by an identity transform.

a propos "intrinsic"

I replaced some occurrences of the term "intrinsic", and added the following statement before the first occurrence, which resembles what @m-albert suggested above:

Multiscale images have an "intrinsic" coordinate system.
It will be a representation of the image in its native physical coordinate system and
can be used for viewing and processing unless a use case dictates otherwise.

In terms of metadata, the coordinate system referred to as the "intrinsic" coordinate system in this document,
is the coordinate system that is referenced by all multiscale coordinate transformations under datasets as their output (see below).
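
As a sketch of what that looks like in metadata (paths and scale values are illustrative), every entry under datasets outputs to the same coordinate system:

```json
{
  "datasets": [
    {
      "path": "0",
      "coordinateTransformations": [
        { "type": "scale", "scale": [1.0, 1.0],
          "input": { "path": "0" }, "output": { "name": "intrinsic" } }
      ]
    },
    {
      "path": "1",
      "coordinateTransformations": [
        { "type": "scale", "scale": [2.0, 2.0],
          "input": { "path": "1" }, "output": { "name": "intrinsic" } }
      ]
    }
  ]
}
```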

formatting & examples

I changed the formatting of both the section on labels and multiscales, so it's hopefully clearer what is or isn't required and the text is a bit more accessible. I also added a more comprehensive example of how everything can be laid out in the labels section.

Edit: After some discussion about the role of coordinate transformations in semantically linking images and labels (see ome/ngff#339), I have reverted the assumption about the dimensionalities of image and labels to pretty much the previous state. While coordinate transformations provide a tool for better semantic linkage, that's not what they are designed for; ultimately, this will be for RFC8 to solve. In the meantime, guiding users on how to express coordinate transformations to a label image as an optional metadata block to get some richer context seems better than shoehorning transforms into addressing all the existing shortcomings in the spec :)

@jo-mueller jo-mueller force-pushed the unify-input-output branch from a220826 to 94c010e Compare April 8, 2026 18:03
@jo-mueller
Contributor Author

@bogovicj @perlman @LucaMarconato @d-v-b @will-moore @clbarnes

I added some more clarifying remarks on the meaning of an omitted path field in the input/output fields of transforms in 94c010e. Hope that makes it fit for usage!

Member

@will-moore will-moore left a comment


I'm happy with the input/output objects.
Some outstanding queries about the linking of labels with images (channel dimensions etc). It's already a bit vague in v0.5, so we don't necessarily have to fix it right now, but any additional clarity would be appreciated. Thanks

@jluethi
Contributor

jluethi commented Apr 9, 2026

@jo-mueller Sorry, catching up a bit late on this.

In the meantime, guiding users how to express coordinate transformations to label image as an optional metadata block to get some richer context seems to be the better way instead of shoehorning transforms to address all the existing shortcomings in the spec :)

Big fan of this sentiment! I hope we can find a narrow enough definition here to unblock things and get transformations through. Let's be aware that some of the more generic topics (how can my label belong to multiple images? How do I keep track of complex relationships between images & labels? etc.) will be best addressed in the context of collections. Yes, labels having to be subgroups of an image group is limiting, but figuring out the generic answer to avoid that & solve the issue how we still relate them is exactly what motivated the collections discussions. And they are non-trivial as well ;)

Previously, if a multiscale "owned" one or more label images under image/labels/label_imageA, it was also required to specify a valid transform into the label image's coordinate system (whichever that may be). That was clearly too strict, because retroactively adding a label image to a multiscales group would then have invalidated the group. It is now a MAY requirement; if no coordinate transformation to the labels is specified, then it is assumed that the "intrinsic" coordinate systems of the two images are linked by an identity transform.

Great! Hearing that adding labels required updating image metadata was quite concerning to me.


In broader view of this, my hope is that to adopt OME-Zarr 0.6, I will need to update the metadata a bit, but as long as I don't rely on transformations, I don't need to majorly change what is stored or where metadata needs to be updated when I do things like adding labels (unless I want to specify their transformation).

I admittedly haven't managed to fully review this PR, but from looking at the changes in the index.md and the summary above, that does sound like it's taking the right direction.

@jo-mueller
Contributor Author

@jluethi Thanks for the feedback!

In broader view of this, my hope is that to adopt OME-Zarr 0.6, I will need to update the metadata a bit, but as long as I don't rely on transformations, I don't need to majorly change what is stored or where metadata needs to be updated when I do things like adding labels (unless I want to specify their transformation).

This, exactly. Essentially, you would have to update the core part of the metadata (i.e., the multiscales metadata), but you could keep the rest of your metadata as is, and it would be as valid as before.

If, for any reason, understanding the dimensions/locations of label images or discovering them in the first place were limiting for you in some way, you could alleviate that to a degree now by putting a reference to them in the owning image's metadata. You could also have label images that label only a part of the parent image (i.e., partial annotation of a large image) and you could express that with a coordinate transformation.

All of this is entirely optional; if none of it is written, assumptions about the relationship between image and label (provenance, transforms, etc.) fall more into the responsibility of the data producer :)

@lubianat lubianat added this to the 0.6 milestone Apr 10, 2026
