Great write-up @shepilov! I tried to understand, but how do you ensure that all chunks are OK and that the full object is OK (i.e. not corrupted) too? Is there a risk of this kind of issue (the upload is reported as done, but the file ends up corrupted)? I would like Luc to review this for the Swift/SLO part and check whether there are any consequences on his side.
It's in " MD5/Checksum Handling" section. In case of using SLO and not manually managing chunks, we rely on Swift and just verify the checksum for each chunk(Swift lib), and compute md5 manually to verify it for the whole file |
> 1. Swift has a ~5GB single object limit
> 2. Very large file uploads can stress server resources
> 3. Long uploads are more prone to network failures
> 4. We plan to add S3 API support, which has similar chunking needs (multipart uploads)
You mean as a storage backend, or to provide this API to our users?
mmm, yes, allow users to upload big files as well
Today, at least as a storage backend @taratatach. For some customers, we may want to use an S3 backend instead of Swift (because of SecNumCloud certification or things like that).
Having an S3-compatible API could be cool, but I don't know the cost of maintaining the two APIs.
> - Same transparent approach can be used
> - Observe S3 protocol limits: a single object maxes out at 5TB, uploads can have at most 10,000 parts, and each part must be between 5MB and 5GB (the last part can be smaller)
> - Pick part sizes small enough (and configurable) so the 10,000-part limit still covers the largest supported file; anything larger than 5TB must be chunked at the application level because the S3 API itself forbids it
> - **Note on checksums:** Like Swift SLO, S3 multipart ETags are not MD5 hashes of the content (they're a hash of part ETags). Application-side MD5 computation will be required for S3 multipart uploads, using the same pattern as Swift SLO
Isn't there a small mistake here?
> - **Note on checksums:** Like Swift SLO, S3 multipart ETags are not MD5 hashes of the content (they're a hash of part ETags). Application-side MD5 computation will be required for S3 multipart uploads, using the same pattern as Swift SLO

Suggested change: "(they're a hash of part ETags)" → "(they're a hash of part content)".
If not, I don't really understand.
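For reference, the original wording matches AWS's documented behaviour: a multipart ETag is the MD5 of the concatenated binary MD5 digests of the parts (each part's digest is that part's ETag), suffixed with the part count. A minimal sketch in Go, illustrative only:

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// multipartETag mimics how S3 derives a multipart ETag: it is NOT the MD5
// of the object's content, but the MD5 of the concatenated binary MD5
// digests of each part, suffixed with the number of parts.
func multipartETag(parts [][]byte) string {
	var digests []byte
	for _, part := range parts {
		sum := md5.Sum(part) // per-part MD5 (this is each part's ETag)
		digests = append(digests, sum[:]...)
	}
	final := md5.Sum(digests) // MD5 over the concatenated digests
	return fmt.Sprintf("%x-%d", final, len(parts))
}

func main() {
	parts := [][]byte{[]byte("part one"), []byte("part two")}
	fmt.Println(multipartETag(parts)) // never equals the MD5 of the full content
}
```

So neither reading of the sentence gives the MD5 of the full content, which is why the application-side MD5 computation is needed either way.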
> The `CreateFile` method will be extended to:
> 1. Read segment size and SLO threshold from configuration
> 2. Determine whether to use SLO based on file size (files larger than threshold) or streaming mode (unknown size, indicated by negative `ByteSize`)
Do we have to know the file size beforehand to use SLO?
Reading the "quota enforcement" paragraph, it seems that we don't, but then I don't really understand this sentence.
At least we know it from the request header, but it's not 100% reliable. We can rely on it at first to pick which method to use, but not for quota; we need to calculate the size on the fly as well.
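To make that concrete, here is a hypothetical sketch of the decision described above; the names and the threshold value are illustrative, not the actual cozy-stack code:

```go
// Hypothetical sketch of the mode decision in CreateFile — names and the
// threshold value are illustrative, not the actual cozy-stack code.
package swiftv3

const defaultSLOThreshold int64 = 4 << 30 // e.g. 4GiB; read from configuration in practice

// shouldUseSLO decides the upload mode from the (untrusted) declared size.
func shouldUseSLO(byteSize int64) bool {
	// Negative ByteSize means the size is unknown (streaming mode):
	// assume the file may be large and use SLO.
	if byteSize < 0 {
		return true
	}
	// Known size: use SLO only above the configured threshold.
	return byteSize > defaultSLOThreshold
}
```

The declared size is only used for mode selection here; quota would still be enforced against the bytes actually written.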
> #### 3. MD5/Checksum Handling
>
> **Important**: SLO manifests don't return a single MD5 hash like regular objects. The manifest's ETag is a hash of the segment ETags, not the content.
Again, you're saying here that the manifest's ETag is a hash of ETags. If that's not a mistake (i.e. we should not read "hash of the segment content"), then what's the use of that manifest ETag?
It's MD5(segment1_etag + segment2_etag + ...).
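In other words (a minimal sketch following the OpenStack SLO documentation; not cozy-stack code):

```go
package main

import (
	"crypto/md5"
	"fmt"
	"strings"
)

// sloManifestETag mimics how Swift computes an SLO manifest's ETag: the
// MD5 of the concatenated hex ETag strings of the segments. It therefore
// never matches the MD5 of the assembled content.
func sloManifestETag(segmentETags []string) string {
	concat := strings.Join(segmentETags, "")
	return fmt.Sprintf("%x", md5.Sum([]byte(concat)))
}

func main() {
	etags := []string{
		"5d41402abc4b2a76b9719d911017c592", // segment 1 ETag
		"7d793037a0760186574b0282f2f435e7", // segment 2 ETag
	}
	fmt.Println(sloManifestETag(etags))
}
```

That ETag still lets clients detect when a manifest or its segments change, but it can never be compared against the MD5 of the assembled content, hence the application-side computation described below.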
> We will implement application-side MD5 computation:
>
> 1. Create a new `swiftLargeFileCreationV3` struct that holds an MD5 hasher alongside the Swift file writer
> 2. On each `Write()` call, update the MD5 hash before passing data to Swift
I'm curious about how we can update the hash without the full file content.
You don't need the full file content to calculate an MD5 hash; you feed it in chunks and the hash state is updated incrementally.
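In Go, `hash.Hash` is an `io.Writer`, so the pattern from the design falls out naturally. A minimal sketch — only the struct name comes from the design, the rest is illustrative:

```go
package main

import (
	"crypto/md5"
	"fmt"
	"hash"
	"io"
)

// swiftLargeFileCreationV3 updates the hasher on every Write() before
// data goes to the Swift writer, so the full file is never held in memory.
// Field names beyond the struct name are illustrative.
type swiftLargeFileCreationV3 struct {
	fw io.WriteCloser // the underlying Swift SLO file writer
	md hash.Hash      // incremental MD5 state
}

func newSwiftLargeFileCreationV3(fw io.WriteCloser) *swiftLargeFileCreationV3 {
	return &swiftLargeFileCreationV3{fw: fw, md: md5.New()}
}

func (f *swiftLargeFileCreationV3) Write(p []byte) (int, error) {
	f.md.Write(p) // hash.Hash.Write never returns an error
	return f.fw.Write(p)
}

func (f *swiftLargeFileCreationV3) Close() error {
	if err := f.fw.Close(); err != nil {
		return err
	}
	fmt.Printf("whole-file md5: %x\n", f.md.Sum(nil)) // final checksum
	return nil
}

// discard stands in for the Swift writer in this self-contained example.
type discard struct{}

func (discard) Write(p []byte) (int, error) { return len(p), nil }
func (discard) Close() error                { return nil }

func main() {
	f := newSwiftLargeFileCreationV3(discard{})
	io.WriteString(f, "streamed in ")
	io.WriteString(f, "several chunks")
	f.Close()
}
```

MD5 keeps a small fixed-size internal state, so memory use stays constant regardless of file size.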
> **Failure Scenarios:**
>
> 1. **Client disconnects mid-upload**: The `swiftLargeFileCreationV3.Close()` is never called; segments remain orphaned
Can't we call `Close()` in a defer function so it's called every time the handler finishes, whether it's because the upload is done or because the connection was closed?
Yes, and we have to do it
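Something like this hypothetical handler shape (not the actual cozy-stack handler) would guarantee the call:

```go
package upload

import "io"

// handleUpload illustrates the defer pattern suggested above: the close
// runs whether the copy succeeded, failed, or returned early, so segments
// are finalized or cleaned up instead of silently orphaned.
func handleUpload(dst io.WriteCloser, body io.Reader) (err error) {
	defer func() {
		if cerr := dst.Close(); cerr != nil && err == nil {
			err = cerr
		}
	}()
	_, err = io.Copy(dst, body) // returns an error if the client disconnects
	return err
}
```

Note that a defer only covers failures the handler goroutine survives; if the process itself crashes, segments are still orphaned, which is what the GC paths quoted below are for.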
> - On upload error: immediate best-effort delete
> - On server startup: schedule GC job
> - Periodically: run GC worker (configurable interval)
> - Manual: `cozy-stack swift gc-segments` CLI command
We have a `cozy-stack check fs` command that already runs some consistency checks. We might want to add the segments check to it.
Yes, it can be part of `check fs`, but cleanup is destructive and should be explicit.
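For illustration, a best-effort sketch of such an explicit GC pass, assuming the `github.com/ncw/swift` v1 client and a dedicated segments container. A real implementation must also verify that a segment is not referenced by any SLO manifest before deleting it; that cross-check is skipped here for brevity:

```go
package gc

import (
	"log"
	"time"

	"github.com/ncw/swift"
)

// gcOrphanSegments deletes segment objects older than a grace period.
// Destructive: intended to be run only from the explicit CLI command.
func gcOrphanSegments(c *swift.Connection, segContainer string, grace time.Duration) error {
	objs, err := c.ObjectsAll(segContainer, nil)
	if err != nil {
		return err
	}
	cutoff := time.Now().Add(-grace)
	for _, o := range objs {
		if o.LastModified.Before(cutoff) {
			// Best-effort: log and continue on individual failures.
			if derr := c.ObjectDelete(segContainer, o.Name); derr != nil {
				log.Printf("gc: could not delete %s: %v", o.Name, derr)
			}
		}
	}
	return nil
}
```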
> **Deletion pattern:**
>
> We will implement a `deleteObject` helper method that:
> 1. First attempts `LargeObjectDelete` (which handles both SLO manifests with their segments and regular objects)
Do you think this could add a lot of overhead when deleting many files? Would it be better to use SLO for all files, even those smaller than the threshold?
We could also use `ObjectDelete` for all objects; for SLO it deletes only the manifest, and the content would be deleted later by the GC. But I'm a little bit afraid of using the GC for every file and would prefer to keep it for extraordinary cases, even half-manual.
There is also a variant where we store a flag in the file doc.
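A sketch of the `deleteObject` helper under the same `github.com/ncw/swift` v1 assumption (the fallback arm is illustrative):

```go
package vfs

import "github.com/ncw/swift"

// deleteObject first tries LargeObjectDelete, which handles SLO/DLO
// manifests (including their segments) as well as regular objects.
// ObjectDelete is kept as a fallback: it removes just the object or
// manifest, leaving any leftover segments to the GC discussed earlier.
func deleteObject(c *swift.Connection, container, name string) error {
	switch err := c.LargeObjectDelete(container, name); err {
	case nil, swift.ObjectNotFound:
		return nil
	default:
		return c.ObjectDelete(container, name)
	}
}
```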
> Swift doesn't support copying SLO manifests directly. Available options:
> 1. **Copy manifest content and update segment references** - Complex: requires parsing manifest JSON, copying each segment individually, updating references
> 2. **Download and re-upload** - Simple but slow: streams entire file through the server
> 3. **Copy segments individually then create new manifest** - Medium complexity: server-side segment copy + new manifest creation
Does Swift provide helpers to create manifests from a list of segments?
Yep, there is an API to create manifests.
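For reference, the raw Swift API for this is `PUT <object>?multipart-manifest=put` with a JSON list of segments (the ncw/swift library also has higher-level large-object helpers on top of it). A sketch with the standard library; the storage URL and token come from authentication, and all names here are placeholders:

```go
package slo

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// sloSegment is one entry of the SLO manifest body.
type sloSegment struct {
	Path      string `json:"path"`       // "/<segments-container>/<segment-name>"
	ETag      string `json:"etag"`       // segment MD5, verified by Swift
	SizeBytes int64  `json:"size_bytes"` // segment size, verified by Swift
}

// createManifest builds an SLO manifest from already-uploaded segments.
func createManifest(storageURL, token, container, object string, segs []sloSegment) error {
	body, err := json.Marshal(segs)
	if err != nil {
		return err
	}
	url := fmt.Sprintf("%s/%s/%s?multipart-manifest=put", storageURL, container, object)
	req, err := http.NewRequest(http.MethodPut, url, bytes.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("X-Auth-Token", token)
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusCreated {
		return fmt.Errorf("manifest creation failed: %s", resp.Status)
	}
	return nil
}
```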
Do we have to change what we store in …?
> - Requires new HTTP API endpoints (`POST /files/chunks/start`, `PUT /files/chunks/{id}`, `POST /files/chunks/{id}/complete`)
> - Significant UI/client changes required
> - Server must track upload sessions and handle cleanup of incomplete uploads
> - Adds complexity to CouchDB (need to track chunk metadata)
Couldn't we create an SLO manifest and create the chunks by hand in this case? I'm just asking because I think adding this functionality later could be nice (for resuming uploads, parallel uploads…).
Yes, when we're ready to do it, we can create the segments separately and then create a manifest from the uploaded chunks.
The manifest path is the object path; we don't need to change anything. That's the main benefit compared to "manual chunking".