
feat(build): Cache pulled artifacts #216

Open
jedevc wants to merge 1 commit into staging from jedevc/cache-registry-data

Conversation

jedevc (Member) commented Mar 20, 2026

If we're pulling the rootfs or kernel from an OCI registry, then we should probably cache it locally. There's no real reason to pull it from the harbor registry over and over again.

It's all content-hashed anyway.


Signed-off-by: Justin Chadwell <justin@unikraft.com>
// WrapCached wraps a File with caching support. If the file has a backing
// provider and descriptor, it returns a new ContentStoreFile that uses a
// pull-through cache at ~/.cache/unikraft. Otherwise, returns the original file.
func WrapCached(ctx context.Context, file spec.File) spec.File {
jedevc (Member, Author) commented:

Hm, we should only cache if it's fetched from a remote registry. If it's a local content store, then probably not.

jedevc (Member, Author) commented:

Is it okay to have no automatic cache cleanup? Probably not.

Copilot AI left a comment

Pull request overview

This PR adds a local pull-through cache for OCI-backed kernel/rootfs artifacts so repeated builds don’t re-pull identical (content-addressed) blobs from the registry.

Changes:

  • Introduces a WrapCached helper that wraps unikraft.com/x/image-spec files with a local containerd content-store cache under the user cache directory.
  • Uses the caching wrapper when assembling images in the builder.
  • Updates unikraft.com/x/image-spec dependency version (and promotes containerd/errdefs to a direct dependency).

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 2 comments.

  • internal/x/imagespec/cache.go: Implements pull-through caching via a local containerd content store and a proxy content.Provider.
  • internal/builder/build.go: Wraps kernel/initrd artifacts with the new cache-enabled file wrapper.
  • go.mod: Bumps unikraft.com/x/image-spec and makes github.com/containerd/errdefs a direct dependency.
  • go.sum: Updates checksums for the bumped unikraft.com/x/image-spec version.


Comment on lines +95 to +100
if desc.Size <= 0 {
	desc.Size = ra.Size()
}
if desc.Size <= 0 {
	return nil
}

Copilot AI Mar 20, 2026


cacheBlob returns nil even when it decides it cannot write anything (e.g., missing/unknown size). Callers currently interpret nil as "cached successfully". Consider returning a sentinel error or an additional boolean indicating whether a blob was actually written, so callers can safely decide whether to switch to the cache.

Comment on lines +83 to +85
// Close the upstream reader and return from the cache instead
ra.Close()
return p.cache.ReaderAt(ctx, desc)

Copilot AI Mar 20, 2026


In the cache-write success path, this closes the upstream reader and then blindly switches to p.cache.ReaderAt. If cacheBlob was a no-op (e.g., it returns nil when desc.Size <= 0), or if the cache read fails for any reason, this will return an error even though the upstream reader was valid (and has now been closed). Consider only closing upstream after successfully opening the cached reader, and falling back to the upstream reader when cache open fails; alternatively, make cacheBlob return an explicit "cached" boolean so the caller doesn’t assume the blob exists in the cache.

Suggested change

-// Close the upstream reader and return from the cache instead
-ra.Close()
-return p.cache.ReaderAt(ctx, desc)
+// Try to open a new reader from the cache after writing.
+cachedRA, err := p.cache.ReaderAt(ctx, desc)
+if err == nil {
+	// Successfully opened from cache: close upstream and return cached reader.
+	ra.Close()
+	log.G(ctx).Debug().
+		Str("digest", desc.Digest.String()).
+		Msg("content cache read after write")
+	return cachedRA, nil
+}
+// If the cache read after write fails for any reason, fall back to the
+// still-open upstream reader instead of returning an error.
+log.G(ctx).Debug().
+	Err(err).
+	Str("digest", desc.Digest.String()).
+	Msg("content cache read after write failed; using upstream reader")
+return ra, nil
