Skip to content

fix: support manifests >5 GB via size-aware copy#7047

Merged
westonpace merged 2 commits into
lance-format:mainfrom
lixmgl:lance_pinterest_bug
Jun 22, 2026
Merged

fix: support manifests >5 GB via size-aware copy#7047
westonpace merged 2 commits into
lance-format:mainfrom
lixmgl:lance_pinterest_bug

Conversation

@lixmgl

@lixmgl lixmgl commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Fixes #7197.

Summary

  • object_store.copy() is called unconditionally on the staging→final manifest copy path. This routes to S3's CopyObject API, which has a ~5 GB hard cap. Manifests above this fail with EntityTooLarge — production case was a ~14 GB manifest with ProposedSize 14961429442 on _versions/...manifest.
  • Add copy_size_aware: keeps the cheap server-side store.copy() for sources below the limit, falls back to read+rewrite via a multipart upload for larger sources. The required size argument lets the caller skip an extra head() round-trip.
  • The 5 GiB threshold is backend-agnostic, not S3-specific: S3's CopyObject and GCS's single-shot Objects.copy both cap at ~5 GiB, so the constant is named MAX_SERVER_SIDE_COPY_BYTES. Stores without such a cap (e.g. local FS) take the read+rewrite fallback above 5 GiB too; correctness is preserved, only the rare large copy is slower.
  • Also tighten MAX_UPLOAD_PART_SIZE from 5 GiB to 5 GiB - 1 so LANCE_INITIAL_UPLOAD_SIZE=5368709120 can't trigger a single PUT of exactly 5 GiB on shutdown — which S3 also rejects.

Same bug class as #6750 (multipart-aware put for txn file writes), different code path.

Test plan

New tests in rust/lance/src/io/commit/external_manifest.rs covering both the >5 GB read+rewrite fallback and the small-file fast path.

Related downstream issue: lance-format/lance-spark#529

@claude claude Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@github-actions github-actions Bot added bug Something isn't working A-encoding Encoding, IO, file reader/writer and removed bug Something isn't working labels Jun 2, 2026
@codecov

codecov Bot commented Jun 10, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 80.69307% with 39 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
rust/lance/src/io/commit/external_manifest.rs 86.01% 19 Missing and 1 partial ⚠️
...ust/lance-table/src/io/commit/external_manifest.rs 64.81% 15 Missing and 4 partials ⚠️

📢 Thoughts on this report? Let us know!

@westonpace westonpace left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few questions but nothing too worrisome. Go ahead and ping me when you're ready for another review.

Comment thread rust/lance-table/src/io/commit/external_manifest.rs Outdated
// Note: this does NOT cover task cancellation — `MultipartUpload`'s
// upstream Drop is documented as a no-op for S3/GCS. Callers that
// need cancellation cleanliness should run this with a guard or
// switch to `object_store::WriteMultipart` (planned follow-up).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there actually a follow-up issue created? Can we specify the issue directly?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reworded this from a "deferred to a follow-up" TODO into a plain inline NOTE rather than referencing a tracked issue.
It's a cold path (only triggers for >5 GiB manifests) and the whole read+rewrite helper is a stopgap until object_store exposes UploadPartCopy, so I didn't think a standalone tracking issue was warranted — a self-contained note explaining the tradeoff seemed more useful than an issue link that'd likely sit untouched.
Happy to file one if you'd prefer it tracked.

Comment on lines +309 to +312
if let Err(e) = upload.put_part(Bytes::from(payload).into()).await {
let _ = upload.abort().await;
return Err(e);
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could maybe be optimized to allow for concurrent calls to put_part but maybe that will be addressed by the previously mentioned follow-up?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right — left sequential for now, covered by the reworded note just above. It can be parallelized with a bounded JoinSet (mirroring object_writer.rs's LANCE_UPLOAD_CONCURRENCY) or sidestepped by switching to object_store::WriteMultipart, which handles concurrency and abort-on-drop for free. Deferred since this only runs for >5 GiB manifests and the helper itself is interim until UploadPartCopy exists.

/// `EntityTooLarge`, so we clamp `LANCE_INITIAL_UPLOAD_SIZE` one byte
/// below that threshold to keep the buffer-fills-to-clamp single-PUT
/// path safe. See lance#6750 for the related txn-file write fix.
const MAX_UPLOAD_PART_SIZE: usize = 1024 * 1024 * 1024 * 5 - 1;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this limit S3-specific or common across all object stores? Should we limit the slow path to S3?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not actually S3-specific — GCS's single-shot Objects.copy has the same ~5 GiB limit. So rather than gate the slow path to S3, I reframed the threshold as the most-conservative server-side-copy limit across backends and renamed the constant to MAX_SERVER_SIDE_COPY_BYTES.

lixmgl added a commit to lixmgl/lance that referenced this pull request Jun 18, 2026
…nd-agnostic

Review feedback on PR lance-format#7047 (westonpace):

- Make `copy_size_aware`'s size parameter mandatory (`Option<u64>` -> `u64`)
  and drop the `head()` fallback. The only caller already knows the size,
  and the fallback was just an extra IOP the caller can avoid.

- Reframe the 5 GiB threshold as backend-agnostic rather than S3-specific.
  S3's CopyObject and GCS's single-shot Objects.copy both cap at ~5 GiB, so
  `S3_COPY_OBJECT_CAP_BYTES` -> `MAX_SERVER_SIDE_COPY_BYTES`. Stores without
  such a cap (e.g. local FS) take the read+rewrite fallback above 5 GiB too;
  correctness is preserved, only the rare >5 GiB copy is slower. Comments in
  copy_via_read_rewrite and on COPY_REWRITE_PART_SIZE reworded to "S3/GCS"
  / "the backend".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@lixmgl lixmgl force-pushed the lance_pinterest_bug branch from c12c46a to ab8f3da Compare June 18, 2026 23:55
`object_store.copy()` is called unconditionally on the staging→final
manifest copy path. This routes to S3's CopyObject API, which has a 5 GB
hard cap, so manifests above this fail with `EntityTooLarge` (production
case: a ~14 GB manifest with ProposedSize 14961429442).

- Add `copy_size_aware`: keeps the cheap server-side `store.copy()` for
  sources below the limit, falls back to read+rewrite via multipart upload
  for larger sources. The required `size` argument avoids an extra `head()`
  round-trip (the only caller already knows the size).
- The 5 GiB threshold is backend-agnostic, not S3-specific: S3's CopyObject
  and GCS's single-shot Objects.copy both cap at ~5 GiB. Stores without such
  a cap (e.g. local FS) take the read+rewrite fallback above 5 GiB too;
  correctness is preserved, only the rare large copy is slower.
- Tighten `MAX_UPLOAD_PART_SIZE` from 5 GiB to 5 GiB - 1 so
  `LANCE_INITIAL_UPLOAD_SIZE=5368709120` can't trigger a single PUT of
  exactly 5 GiB on shutdown, which S3 also rejects.

Same bug class as lance-format#6750 (multipart-aware put for txn file writes), different
code path.

Fixes lance-format#7197.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@lixmgl lixmgl force-pushed the lance_pinterest_bug branch from ab8f3da to 7b59d20 Compare June 22, 2026 17:55
@lixmgl

lixmgl commented Jun 22, 2026

Copy link
Copy Markdown
Contributor Author

A few questions but nothing too worrisome. Go ahead and ping me when you're ready for another review.

Thanks @westonpace for reviewing it. Please take another look with the updates.

@westonpace westonpace left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your help! Will merge assuming CI passes.

@westonpace westonpace merged commit 1c3d201 into lance-format:main Jun 22, 2026
30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-encoding Encoding, IO, file reader/writer bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: manifest commit fails with EntityTooLarge for manifests >5 GB (CopyObject hard cap)

2 participants