
Investigate deleting job artifacts proactively #154

@farski

When a Porter job creates an intermediary file (such as the source file being ingested, or a transcoded file before it's copied to its destination), that file is written to the artifacts S3 bucket. These files are used only during the Step Functions execution and are not needed once the job stops running, except for debugging.

Currently, cleanup of those files is handled by a lifecycle rule on the bucket: objects expire after 1 day.

We do a large enough volume of work in Porter that the bucket holds over 200 GB at any given time. So we're currently paying for every object to live for 24 hours, even though each one provides value for only about a minute in most cases (and under 10 minutes in nearly all cases).
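As a rough back-of-the-envelope estimate of what that gap means: if we assume a steady ingest rate, so that the bucket's standing size scales linearly with how long objects live, then the 200 GB figure shrinks in proportion to the lifetime. The function name and the linear-scaling assumption are mine, not measurements from the bucket.

```python
MINUTES_PER_DAY = 24 * 60


def steady_state_gb(current_gb: float, current_lifetime_min: float, new_lifetime_min: float) -> float:
    """Estimate standing bucket size if object lifetime changes.

    Assumes a constant ingest rate, so standing size is proportional to lifetime.
    """
    return current_gb * new_lifetime_min / current_lifetime_min


# Conservative case: every object lives 10 minutes instead of 24 hours.
print(round(steady_state_gb(200, MINUTES_PER_DAY, 10), 2))  # 1.39

# Typical case: objects live about 1 minute.
print(round(steady_state_gb(200, MINUTES_PER_DAY, 1), 2))  # 0.14
```

Under those assumptions, the standing footprint would drop from over 200 GB to somewhere around 1 GB or less.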

We could have the state machine proactively delete the objects associated with a job right before the execution completes, which would significantly reduce our byte-seconds usage in S3.
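A minimal sketch of what that cleanup step might look like, for example in a Lambda function invoked as the last state. It assumes all of a job's artifacts share a common key prefix (such as an execution ID), which I haven't verified against how Porter actually lays out keys. `delete_job_artifacts`, the bucket name, and the prefix are hypothetical; `s3` is expected to behave like a boto3 S3 client.

```python
from typing import Any


def delete_job_artifacts(s3: Any, bucket: str, prefix: str) -> int:
    """Delete every object under `prefix` in `bucket`; return how many were deleted.

    `s3` is assumed to be a boto3 S3 client (or something with the same methods).
    The per-job prefix layout is an assumption, not confirmed Porter behavior.
    """
    if not prefix:
        # Guard against wiping the whole bucket by accident.
        raise ValueError("refusing to delete with an empty prefix")

    deleted = 0
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if not keys:
            continue
        # A list_objects_v2 page holds at most 1,000 keys, which is also
        # the per-request limit for delete_objects.
        response = s3.delete_objects(
            Bucket=bucket,
            Delete={"Objects": keys, "Quiet": True},
        )
        # In quiet mode the response only lists keys that failed to delete.
        deleted += len(keys) - len(response.get("Errors", []))
    return deleted
```

With boto3 this would be called as something like `delete_job_artifacts(boto3.client("s3"), "example-artifacts-bucket", "example-execution-id/")`. It probably makes sense to keep the existing lifecycle rule as a backstop, since executions that fail or time out might never reach the cleanup state.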
