@wdhongtw wdhongtw commented Dec 4, 2025

Description

Reduce image size and enhance caching.

  • Mount cache directory across layers when necessary.
  • Allow cache directory usage for pip command.

Unnecessary cache files are currently stored in several layers of the vllm/vllm-tpu images.

After applying this change, the raw (uncompressed) image size drops from 25.9GB to 19.5GB.
That saves 6.4GB, about 25% of disk usage, and reduces the chance of running the TPU VM out of disk space.

wdhongtw/vllm-tpu   latest   07dbf76dbed8   3 minutes ago   19.5GB
vllm/vllm-tpu       nightly  10463ce9c327   9 hours ago     25.9GB

Deployment can also be faster, since network transfer and image extraction take less time.
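
For readers unfamiliar with the technique, a minimal sketch of a BuildKit cache mount for pip is shown below. It is not the actual diff of this PR: the base image, working directory, and `requirements.txt` file name are assumptions chosen for illustration. The mounted directory exists only while the `RUN` step executes, so files pip writes there are not committed to the image layer.

```dockerfile
# syntax=docker/dockerfile:1
# Hypothetical base image and paths, for illustration only.
FROM python:3.12-slim

WORKDIR /workspace
COPY requirements.txt .

# Mount a builder-managed cache at pip's default cache directory for root.
# pip can reuse downloads from earlier builds, and the cache contents
# stay out of the resulting image layer.
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
```

Cache mounts require BuildKit, which is the default builder in recent Docker releases.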

Tests

No test needed.

Checklist

Before submitting this PR, please make sure:

  • I have performed a self-review of my code.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have made or will make corresponding changes to any relevant documentation.

- Mount cache directory across layers when necessary.
- Allow cache directory usage for pip command.

Signed-off-by: Weida Hong <wdhongtw@google.com>

@py4 py4 left a comment

@QiliangCui should we use /mnt/disks/persist as the cache location? ignore

@QiliangCui QiliangCui left a comment

Question 1: with "--mount=type=cache,target=/root/.cache/pip",
does it mean some information is cached somewhere between builds?
If yes, where is it cached? On the local machine, on a mounted disk, or somewhere else?

Question 2: instead of adding "--mount=type=cache,target=/root/.cache/pip" every time we run pip install, can we simply clean the cache at the end of the image build to reduce the size?

wdhongtw commented Dec 8, 2025

Question 1: with "--mount=type=cache,target=/root/.cache/pip", does it mean some information is cached somewhere between builds? If yes, where is it cached? On the local machine, on a mounted disk, or somewhere else?

It should be a filesystem that is allocated and mounted into the build container during the build process.
Where this filesystem is located is an implementation detail of the image builder.

For Docker, there are some blob files and metadata under /var/lib/docker/buildkit.

See the official documentation for more details.

Question 2: instead of adding "--mount=type=cache,target=/root/.cache/pip" every time we run pip install, can we simply clean the cache at the end of the image build to reduce the size?

Yes, we can simply remove the cache in the same RUN step, or just use pip's --no-cache-dir option,
and these kinds of solutions do reduce the image size. But specifying a cache mount has another benefit:
it speeds up the image build process (especially because pip does not download packages in parallel).
Thus I personally prefer setting the cache mount.
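
For comparison, the two alternatives mentioned above could be sketched as follows. The base image and `requirements.txt` name are assumptions, and a real Dockerfile would use only one of the two options. Both keep the pip cache out of the image, but neither lets a later build reuse earlier downloads.

```dockerfile
# syntax=docker/dockerfile:1
# Hypothetical base image and file names, for illustration only.
FROM python:3.12-slim
COPY requirements.txt .

# Option A: tell pip not to use its cache at all.
RUN pip install --no-cache-dir -r requirements.txt

# Option B: let pip write its cache, then delete it in the same RUN step.
# Deleting it in a later RUN step would not shrink the image, because the
# files would already be committed to the earlier layer.
RUN pip install -r requirements.txt && rm -rf /root/.cache/pip
```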

See also how vllm uses a cache mount for uv.

@CienetStingLin CienetStingLin merged commit 8f177b7 into vllm-project:main Dec 10, 2025
5 checks passed