Summary
ExternalManifestStore calls object_store.copy(staging, final) unconditionally on the manifest commit path at external_manifest.rs:128 and external_manifest.rs:250. This routes to S3's CopyObject API, which has a 5 GB hard cap on source object size. Any manifest above this fails with EntityTooLarge and the commit cannot complete.
Reproduction
Production failure observed on a Pinterest CTAS workload:
- Manifest path: _versions/<version>.manifest
- Reported ProposedSize: 14961429442 bytes (~13.9 GiB)
- Error: EntityTooLarge from S3 on the staging→final manifest copy step
- No workaround except shrinking the manifest (which isn't always feasible — manifests grow with table version count and fragment count)
Why this matters
- Affects any user whose manifest grows past 5 GB. Manifest size scales with table version history and fragment count, so this is reachable on long-lived production tables, not a corner case.
- Crashes the commit; there is no graceful degradation. The CTAS or write workflow fails entirely.
- No object_store-layer fallback today: the upstream object_store crate doesn't expose UploadPartCopy, so a workaround inside Lance is needed.
Proposed fix #7047
A copy_size_aware helper that:
- Keeps the cheap server-side store.copy() for sources <5 GiB (the common case, no regression)
- Falls back to read+rewrite via multipart upload for sources ≥5 GiB
- Accepts a size_hint so callers that already know the source size can skip an extra head() round-trip on the small-file fast path
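The branching described above can be sketched roughly as follows. This is a minimal illustration, not the code in #7047: the Store trait, MockStore, CopyPath, and the exact signature of copy_size_aware are all hypothetical stand-ins so the example compiles with only the standard library. The real helper would work against the object_store crate's async API instead.

```rust
use std::collections::HashMap;

/// Source-size cap for the server-side copy path (5 GiB), per the list above.
const SERVER_SIDE_COPY_LIMIT: u64 = 5 * 1024 * 1024 * 1024;

/// Which path the helper took; returned so callers and tests can observe it.
/// (Hypothetical; not part of Lance or object_store.)
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum CopyPath {
    ServerSide,
    ReadRewrite,
}

/// Hypothetical stand-in for the store operations the helper needs.
trait Store {
    fn head_size(&mut self, path: &str) -> Result<u64, String>;
    fn copy(&mut self, from: &str, to: &str) -> Result<(), String>;
    /// Stands in for "read the source and rewrite it via multipart upload".
    fn read_rewrite(&mut self, from: &str, to: &str) -> Result<(), String>;
}

/// Use the cheap server-side copy below the limit, otherwise fall back to
/// read+rewrite. A `size_hint` skips the head() round-trip.
fn copy_size_aware<S: Store>(
    store: &mut S,
    from: &str,
    to: &str,
    size_hint: Option<u64>,
) -> Result<CopyPath, String> {
    let size = match size_hint {
        Some(known) => known,
        None => store.head_size(from)?,
    };
    if size < SERVER_SIDE_COPY_LIMIT {
        store.copy(from, to)?;
        Ok(CopyPath::ServerSide)
    } else {
        store.read_rewrite(from, to)?;
        Ok(CopyPath::ReadRewrite)
    }
}

/// In-memory mock that tracks object sizes only and rejects oversized
/// server-side copies the way the issue describes S3 doing.
#[derive(Default)]
struct MockStore {
    sizes: HashMap<String, u64>,
    head_calls: usize,
}

impl Store for MockStore {
    fn head_size(&mut self, path: &str) -> Result<u64, String> {
        self.head_calls += 1;
        self.sizes
            .get(path)
            .copied()
            .ok_or_else(|| format!("not found: {}", path))
    }

    fn copy(&mut self, from: &str, to: &str) -> Result<(), String> {
        let size = *self
            .sizes
            .get(from)
            .ok_or_else(|| format!("not found: {}", from))?;
        if size >= SERVER_SIDE_COPY_LIMIT {
            return Err("EntityTooLarge".to_string());
        }
        self.sizes.insert(to.to_string(), size);
        Ok(())
    }

    fn read_rewrite(&mut self, from: &str, to: &str) -> Result<(), String> {
        let size = *self
            .sizes
            .get(from)
            .ok_or_else(|| format!("not found: {}", from))?;
        self.sizes.insert(to.to_string(), size);
        Ok(())
    }
}
```

With the mock, a 14961429442-byte staging object takes the read+rewrite path after one head() call, while a small object copied with size_hint set takes the server-side path without any head() call.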
Same bug class as #6750, different code path
#6750 fixed the analogous bug for transaction file writes (write_transaction_file was using inner.put(), hitting S3's 5 GB single-PUT limit). That PR was scoped to txn files and did not touch the manifest commit path.