Great write-up @shepilov! I tried to understand, but how do you ensure that all chunks are OK and that the full object is OK (i.e. not corrupted) too? Is there a risk of this kind of issue (the upload is reported as done, but the file ends up corrupted)? I would like Luc to review this for the Swift/SLO part and check whether there are any consequences on his side.
It's in " MD5/Checksum Handling" section. In case of using SLO and not manually managing chunks, we rely on Swift and just verify the checksum for each chunk(Swift lib), and compute md5 manually to verify it for the whole file |
> 1. Swift has a ~5GB single object limit
> 2. Very large file uploads can stress server resources
> 3. Long uploads are more prone to network failures
> 4. We plan to add S3 API support, which has similar chunking needs (multipart uploads)
You mean as a storage backend, or to provide this API to our users?
mmm, yes, allow users to upload big files as well
Today, at least as a storage backend @taratatach. For some customers, we may want to use an S3 backend instead of Swift (because of SecNumCloud certification or things like that).
Having an S3-compatible API could be cool, but I don't know the cost of maintaining the two APIs.
> - Same transparent approach can be used
> - Observe S3 protocol limits: a single object maxes out at 5TB, uploads can have at most 10,000 parts, and each part must be between 5MB and 5GB (the last part can be smaller)
> - Pick part sizes small enough (and configurable) so the 10,000-part limit still covers the largest supported file; anything larger than 5TB must be chunked at the application level because the S3 API itself forbids it
> - **Note on checksums:** Like Swift SLO, S3 multipart ETags are not MD5 hashes of the content (they're a hash of part ETags). Application-side MD5 computation will be required for S3 multipart uploads, using the same pattern as Swift SLO
Isn't there a small mistake here?
> - **Note on checksums:** Like Swift SLO, S3 multipart ETags are not MD5 hashes of the content (they're a hash of part ETags). Application-side MD5 computation will be required for S3 multipart uploads, using the same pattern as Swift SLO

Suggested change: "(they're a hash of part ETags)" → "(they're a hash of part content)".
If not, I don't really understand.
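For reference, the original wording matches AWS's documented behaviour: a multipart ETag is the MD5 of the concatenated binary MD5 digests of the parts (each part's digest is that part's ETag), suffixed with the part count. A minimal sketch in Go, illustrative only:

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// multipartETag mimics how S3 derives a multipart ETag: it is NOT the MD5
// of the object's content, but the MD5 of the concatenated binary MD5
// digests of each part, suffixed with the number of parts.
func multipartETag(parts [][]byte) string {
	var digests []byte
	for _, part := range parts {
		sum := md5.Sum(part) // per-part MD5 (this is each part's ETag)
		digests = append(digests, sum[:]...)
	}
	final := md5.Sum(digests) // MD5 over the concatenated digests
	return fmt.Sprintf("%x-%d", final, len(parts))
}

func main() {
	parts := [][]byte{[]byte("part one"), []byte("part two")}
	fmt.Println(multipartETag(parts)) // never equals the MD5 of the full content
}
```

So neither reading of the sentence gives the MD5 of the full content, which is why the application-side MD5 computation is needed either way.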
> The `CreateFile` method will be extended to:
> 1. Read segment size and SLO threshold from configuration
> 2. Determine whether to use SLO based on file size (files larger than threshold) or streaming mode (unknown size, indicated by negative `ByteSize`)
Do we have to know the file size beforehand to use SLO?
Reading the "quota enforcement" paragraph, it seems that we don't, but then I don't really understand this sentence.
At least we know it from the request header, but it's not 100% reliable. We can rely on it at first to pick which method to use, but not for quota; we need to calculate the size on the fly as well.
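To make that concrete, here is a hypothetical sketch of the decision described above; the names and the threshold value are illustrative, not the actual cozy-stack code:

```go
// Hypothetical sketch of the mode decision in CreateFile — names and the
// threshold value are illustrative, not the actual cozy-stack code.
package swiftv3

const defaultSLOThreshold int64 = 4 << 30 // e.g. 4GiB; read from configuration in practice

// shouldUseSLO decides the upload mode from the (untrusted) declared size.
func shouldUseSLO(byteSize int64) bool {
	// Negative ByteSize means the size is unknown (streaming mode):
	// assume the file may be large and use SLO.
	if byteSize < 0 {
		return true
	}
	// Known size: use SLO only above the configured threshold.
	return byteSize > defaultSLOThreshold
}
```

The declared size is only used for mode selection here; quota would still be enforced against the bytes actually written.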
> #### 3. MD5/Checksum Handling
>
> **Important**: SLO manifests don't return a single MD5 hash like regular objects. The manifest's ETag is a hash of the segment ETags, not the content.
Again, you're saying here that the manifest's ETag is a hash of ETags. If that's not a mistake (i.e. we should not read "hash of the segment content"), then what's the use of that manifest ETag?
It's MD5(segment1_etag + segment2_etag + ...).
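In other words (a minimal sketch following the OpenStack SLO documentation; not cozy-stack code):

```go
package main

import (
	"crypto/md5"
	"fmt"
	"strings"
)

// sloManifestETag mimics how Swift computes an SLO manifest's ETag: the
// MD5 of the concatenated hex ETag strings of the segments. It therefore
// never matches the MD5 of the assembled content.
func sloManifestETag(segmentETags []string) string {
	concat := strings.Join(segmentETags, "")
	return fmt.Sprintf("%x", md5.Sum([]byte(concat)))
}

func main() {
	etags := []string{
		"5d41402abc4b2a76b9719d911017c592", // segment 1 ETag
		"7d793037a0760186574b0282f2f435e7", // segment 2 ETag
	}
	fmt.Println(sloManifestETag(etags))
}
```

That ETag still lets clients detect when a manifest or its segments change, but it can never be compared against the MD5 of the assembled content, hence the application-side computation described below.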
> We will implement application-side MD5 computation:
>
> 1. Create a new `swiftLargeFileCreationV3` struct that holds an MD5 hasher alongside the Swift file writer
> 2. On each `Write()` call, update the MD5 hash before passing data to Swift
I'm curious about how we can update the hash without the full file content.
You don't need the full file content to calculate an MD5 hash; you feed it in chunks and the hash state is updated incrementally.
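In Go, `hash.Hash` is an `io.Writer`, so the pattern from the design falls out naturally. A minimal sketch — only the struct name comes from the design, the rest is illustrative:

```go
package main

import (
	"crypto/md5"
	"fmt"
	"hash"
	"io"
)

// swiftLargeFileCreationV3 updates the hasher on every Write() before
// data goes to the Swift writer, so the full file is never held in memory.
// Field names beyond the struct name are illustrative.
type swiftLargeFileCreationV3 struct {
	fw io.WriteCloser // the underlying Swift SLO file writer
	md hash.Hash      // incremental MD5 state
}

func newSwiftLargeFileCreationV3(fw io.WriteCloser) *swiftLargeFileCreationV3 {
	return &swiftLargeFileCreationV3{fw: fw, md: md5.New()}
}

func (f *swiftLargeFileCreationV3) Write(p []byte) (int, error) {
	f.md.Write(p) // hash.Hash.Write never returns an error
	return f.fw.Write(p)
}

func (f *swiftLargeFileCreationV3) Close() error {
	if err := f.fw.Close(); err != nil {
		return err
	}
	fmt.Printf("whole-file md5: %x\n", f.md.Sum(nil)) // final checksum
	return nil
}

// discard stands in for the Swift writer in this self-contained example.
type discard struct{}

func (discard) Write(p []byte) (int, error) { return len(p), nil }
func (discard) Close() error                { return nil }

func main() {
	f := newSwiftLargeFileCreationV3(discard{})
	io.WriteString(f, "streamed in ")
	io.WriteString(f, "several chunks")
	f.Close()
}
```

MD5 keeps a small fixed-size internal state, so memory use stays constant regardless of file size.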
> **Failure Scenarios:**
>
> 1. **Client disconnects mid-upload**: The `swiftLargeFileCreationV3.Close()` is never called; segments remain orphaned
Can't we call `Close()` in a defer function so it's called every time the handler finishes, whether it's because the upload is done or because the connection was closed?
Yes, and we have to do it
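Something like this hypothetical handler shape (not the actual cozy-stack handler) would guarantee the call:

```go
package upload

import "io"

// handleUpload illustrates the defer pattern suggested above: the close
// runs whether the copy succeeded, failed, or returned early, so segments
// are finalized or cleaned up instead of silently orphaned.
func handleUpload(dst io.WriteCloser, body io.Reader) (err error) {
	defer func() {
		if cerr := dst.Close(); cerr != nil && err == nil {
			err = cerr
		}
	}()
	_, err = io.Copy(dst, body) // returns an error if the client disconnects
	return err
}
```

Note that a defer only covers failures the handler goroutine survives; if the process itself crashes, segments are still orphaned, which is what the GC paths quoted below are for.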
> - On upload error: immediate best-effort delete
> - On server startup: schedule GC job
> - Periodically: run GC worker (configurable interval)
> - Manual: `cozy-stack swift gc-segments` CLI command
We have a `cozy-stack check fs` command that already runs some consistency checks. We might want to add the segments check to it.
Yes, it can be part of `check fs`, but cleanup is destructive and should be explicit.
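For illustration, a best-effort sketch of such an explicit GC pass, assuming the `github.com/ncw/swift` v1 client and a dedicated segments container. A real implementation must also verify that a segment is not referenced by any SLO manifest before deleting it; that cross-check is skipped here for brevity:

```go
package gc

import (
	"log"
	"time"

	"github.com/ncw/swift"
)

// gcOrphanSegments deletes segment objects older than a grace period.
// Destructive: intended to be run only from the explicit CLI command.
func gcOrphanSegments(c *swift.Connection, segContainer string, grace time.Duration) error {
	objs, err := c.ObjectsAll(segContainer, nil)
	if err != nil {
		return err
	}
	cutoff := time.Now().Add(-grace)
	for _, o := range objs {
		if o.LastModified.Before(cutoff) {
			// Best-effort: log and continue on individual failures.
			if derr := c.ObjectDelete(segContainer, o.Name); derr != nil {
				log.Printf("gc: could not delete %s: %v", o.Name, derr)
			}
		}
	}
	return nil
}
```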
> **Deletion pattern:**
>
> We will implement a `deleteObject` helper method that:
> 1. First attempts `LargeObjectDelete` (which handles both SLO manifests with their segments and regular objects)
Do you think this could add a lot of overhead when deleting many files? Would it be better to use SLO for all files, even those smaller than the threshold?
We could also use `ObjectDelete` for all objects; for SLO it deletes only the manifest, and the content would be deleted later by the GC. But I'm a little bit afraid of using the GC for every file and would prefer to keep it for extraordinary cases, even half-manual.
There is also a variant where we store a flag in the file doc.
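A sketch of the `deleteObject` helper under the same `github.com/ncw/swift` v1 assumption (the fallback arm is illustrative):

```go
package vfs

import "github.com/ncw/swift"

// deleteObject first tries LargeObjectDelete, which handles SLO/DLO
// manifests (including their segments) as well as regular objects.
// ObjectDelete is kept as a fallback: it removes just the object or
// manifest, leaving any leftover segments to the GC discussed earlier.
func deleteObject(c *swift.Connection, container, name string) error {
	switch err := c.LargeObjectDelete(container, name); err {
	case nil, swift.ObjectNotFound:
		return nil
	default:
		return c.ObjectDelete(container, name)
	}
}
```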
> Swift doesn't support copying SLO manifests directly. Available options:
> 1. **Copy manifest content and update segment references** - Complex: requires parsing manifest JSON, copying each segment individually, updating references
> 2. **Download and re-upload** - Simple but slow: streams entire file through the server
> 3. **Copy segments individually then create new manifest** - Medium complexity: server-side segment copy + new manifest creation
Does Swift provide helpers to create manifests from a list of segments?
Yep, there is an API to create manifests.
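For reference, the raw Swift API for this is `PUT <object>?multipart-manifest=put` with a JSON list of segments (the ncw/swift library also has higher-level large-object helpers on top of it). A sketch with the standard library; the storage URL and token come from authentication, and all names here are placeholders:

```go
package slo

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// sloSegment is one entry of the SLO manifest body.
type sloSegment struct {
	Path      string `json:"path"`       // "/<segments-container>/<segment-name>"
	ETag      string `json:"etag"`       // segment MD5, verified by Swift
	SizeBytes int64  `json:"size_bytes"` // segment size, verified by Swift
}

// createManifest builds an SLO manifest from already-uploaded segments.
func createManifest(storageURL, token, container, object string, segs []sloSegment) error {
	body, err := json.Marshal(segs)
	if err != nil {
		return err
	}
	url := fmt.Sprintf("%s/%s/%s?multipart-manifest=put", storageURL, container, object)
	req, err := http.NewRequest(http.MethodPut, url, bytes.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("X-Auth-Token", token)
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusCreated {
		return fmt.Errorf("manifest creation failed: %s", resp.Status)
	}
	return nil
}
```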
Do we have to change what we store in …?
> - Requires new HTTP API endpoints (`POST /files/chunks/start`, `PUT /files/chunks/{id}`, `POST /files/chunks/{id}/complete`)
> - Significant UI/client changes required
> - Server must track upload sessions and handle cleanup of incomplete uploads
> - Adds complexity to CouchDB (need to track chunk metadata)
Couldn't we create an SLO manifest and create the chunks by hand in this case? I'm just asking because I think adding this functionality later could be nice (for resuming uploads, parallel uploads…).
Yes, when we're ready to do it, we can create the segments separately and then create a manifest from the uploaded chunks.
The manifest path is the object path; we don't need to change anything. That's the main benefit compared to "manual chunking".