Types of Tasks #43

@matthewhanson

In @gadomski's PR #42 several types of tasks are defined.

class Task(BaseModel, ABC, Generic[Input, Output]):
    """A generic task."""

class PassthroughTask(Task[Anything, Anything]):
    """A simple task that doesn't modify the items at all."""

class StacOutputTask(Task[Input, Item], ABC):
    """Anything in, STAC out task."""

class ItemTask(StacOutputTask[Item], ABC):
    """STAC In, STAC Out task."""

class HrefTask(StacOutputTask[Href], ABC):
    """Href in, STAC Out task."""

I really like this way to define the input and output for different types of tasks, especially if it gives us JSON Schema!

I want to review these two tasks:
StacOutputTask - Anything in, STAC out task.
HrefTask - Href in, STAC out task.

These tasks capture the need to create STAC Items from scratch. In the current payload structure, you pass parameters to the task in the process definition; you don't hand them in as part of the task input (which would normally be a FeatureCollection). So the href (or multiple hrefs), along with other parameters, would be provided in the process.tasks.taskname.parameter field. I think that should be the preferred model, and Input/Output is always going to be STAC Items, or nothing.
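To make the parameter model concrete, here is a minimal sketch of a payload that carries an href in the process definition rather than in the input features. The task name "href-to-stac", the parameter names, the URL, and the helper `task_parameters` are all hypothetical, and the exact payload layout is an assumption based on the description above.

```python
# Hypothetical payload: the task name, parameter names, and URL are illustrative only.
payload = {
    "type": "FeatureCollection",
    "features": [],  # no input Items: the task creates Items from scratch
    "process": {
        "tasks": {
            "href-to-stac": {
                "href": "https://example.com/data/scene.tif",
                "collection": "example-collection",
            }
        }
    },
}


def task_parameters(payload: dict, task_name: str) -> dict:
    """Return the parameters for one task, or an empty dict if none are given."""
    return payload.get("process", {}).get("tasks", {}).get(task_name, {})


params = task_parameters(payload, "href-to-stac")
```

With this layout the task input stays a FeatureCollection (possibly empty), and everything else the task needs arrives as parameters.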

Next is the ItemTask, which defines a single Item, but stac-tasks currently work on ItemCollections. A STAC task can take in one or more STAC Items as input and return one or more STAC Items. Note that this is not 1:1: a task doesn't process each Item independently to create an array of output Items (although you could write a task to do that). A task might take in one Item and create two derived Items from it, or it might take in an Item of data plus a few other Items of auxiliary data used in the processing and create a single output Item.
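The two non-1:1 cases could look roughly like this. It is only a sketch: `Item` is a plain-dict stand-in for a STAC Item, and `split_item`, `combine_items`, and the `derived_from` property are hypothetical names, not part of stac-task.

```python
from typing import Any

# Stand-in for a STAC Item; a real task would use a proper Item type.
Item = dict[str, Any]


def split_item(items: list[Item]) -> list[Item]:
    """One Item in, two derived Items out."""
    if len(items) != 1:
        raise ValueError(f"expected exactly 1 input Item, got {len(items)}")
    (source,) = items
    return [{**source, "id": f"{source['id']}-{suffix}"} for suffix in ("a", "b")]


def combine_items(items: list[Item]) -> list[Item]:
    """One data Item plus auxiliary Items in, a single derived Item out."""
    if len(items) < 2:
        raise ValueError("expected a data Item and at least one auxiliary Item")
    data, *auxiliary = items
    derived = {**data, "id": f"{data['id']}-derived"}
    derived["properties"] = {
        **data.get("properties", {}),
        # Record which inputs contributed to the output (hypothetical property name).
        "derived_from": [data["id"], *(aux["id"] for aux in auxiliary)],
    }
    return [derived]
```

Both functions take and return a list of Items, but each enforces its own requirement on how many input Items it accepts.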

Each task would have requirements on the number of input Items.

So I'd propose
StacOutputTask - Nothing in, STAC out task
ItemCollectionTask - ItemCollection in, ItemCollection out

I suppose we could also have an ItemTask for single Item input and output (a very common scenario), but I'm not sure I see the advantage over using an ItemCollection with 1 Item.
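A rough sketch of what the two proposed task types might look like, using only the standard library instead of the pydantic BaseModel from the PR. The `process` method, the dict and list stand-ins for Item and ItemCollection, and the concrete `CreateItem` and `TagItems` classes are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Generic, TypeVar

Input = TypeVar("Input")
Output = TypeVar("Output")

# Plain-Python stand-ins for the STAC types (assumption, not the PR's models).
Item = dict[str, Any]
ItemCollection = list[Item]


class Task(ABC, Generic[Input, Output]):
    """A generic task."""

    @abstractmethod
    def process(self, input: Input) -> Output:
        """Run the task on its input."""


class StacOutputTask(Task[None, ItemCollection], ABC):
    """Nothing in, STAC out task."""


class ItemCollectionTask(Task[ItemCollection, ItemCollection], ABC):
    """ItemCollection in, ItemCollection out task."""


class CreateItem(StacOutputTask):
    """Hypothetical task that builds one Item from an href parameter."""

    def __init__(self, href: str) -> None:
        self.href = href

    def process(self, input: None = None) -> ItemCollection:
        return [{"type": "Feature", "id": "example", "assets": {"data": {"href": self.href}}}]


class TagItems(ItemCollectionTask):
    """Hypothetical task that marks every input Item as processed."""

    def process(self, input: ItemCollection) -> ItemCollection:
        return [
            {**item, "properties": {**item.get("properties", {}), "processed": True}}
            for item in input
        ]


created = CreateItem("https://example.com/data/scene.tif").process()
tagged = TagItems().process(created)
```

Here the single-Item case is just an ItemCollection of length 1, so `TagItems` handles it without a separate ItemTask.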
