-
Notifications
You must be signed in to change notification settings - Fork 116
feat: migrate LFS API to new object storage implementation #1773
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: migrate LFS API to new object storage implementation #1773
Conversation
Replace legacy LFS object storage with unified ObjectStorage interface. Remove chunk upload logic and add comprehensive test coverage. - Implement ObjectStorageAdapter for LFS operations - Update LFS handlers to use new storage interface - Add unit and integration tests - Maintain backward compatibility with existing object paths - Support S3-compatible services (MinIO) with path-style addressing Related to web3infra-foundation#1752 Signed-off-by: Hongze Gao <15101764808@163.com>
Resolve checks error. Signed-off-by: Hongze Gao <15101764808@163.com>
Resolve checks error. Signed-off-by: Hongze Gao <15101764808@163.com>
549684e to
329aa87
Compare
Resolve checks error. Signed-off-by: Hongze Gao <15101764808@163.com>
| async fn exist_object(&self, object_id: &str) -> bool { | ||
| exist_object(self.config.lfs_file_path.clone(), object_id) | ||
| async fn exist_object(&self, object_id: &str, enable_split: bool) -> bool { | ||
| if enable_split { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这段代码已经被删除,请检查代码合并流程
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
建议删除lfs_storage 目录,统一使用object storage
| ObjectByteStream, ObjectKey, ObjectMeta, ObjectNamespace, ObjectStorage, | ||
| }; | ||
|
|
||
| /// Transform LFS object SHA256 to old path format for compatibility. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
不用考虑文件迁移的兼容性
| &id[6..] | ||
| ) | ||
| match key.namespace { | ||
| ObjectNamespace::Lfs => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
同理,不考虑文件的兼容性,统一使用3级目录
| /// An empty vector is always returned when querying relations so that callers can | ||
| /// gracefully degrade without changing their logic. New code should migrate | ||
| /// away from the split-file feature entirely. | ||
| pub struct LfsSplitRelation { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已经被删除,请检查代码合并流程
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
确认是否修改了文件格式
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 77a9965a30
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let lfs_object_storage = ObjectStorageFactory::create(ObjectStorageConfig::from_config( | ||
| config.lfs.storage_type.clone(), | ||
| config.clone(), | ||
| )) | ||
| .await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use configured LFS local path for LocalFs storage
The LFS object storage is now created via ObjectStorageFactory::from_config, which uses config.base_dir as the root for LocalFs. This ignores config.lfs.local.lfs_file_path (defaulting to <base_dir>/lfs). On LocalFs deployments that already have LFS data under the configured LFS path (or any custom lfs_file_path), the server will now look in <base_dir>/objects instead, so HEAD/GET/download will return 404 even though the objects exist. New uploads will also be written to the new location. Consider passing lfs.local.lfs_file_path into the ObjectStorageConfig or migrating/aliasing the old directory.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
| /// Transform LFS object SHA256 to old path format for compatibility. | ||
| /// old rule: {sha1[0..2]}/{sha1[2..4]}/{sha1[4..]} | ||
| fn transform_lfs_path(sha1: &str) -> String { | ||
| if sha1.len() < 5 { | ||
| sha1.to_string() | ||
| } else { | ||
| Path::new(&sha1[0..2]) | ||
| .join(&sha1[2..4]) | ||
| .join(&sha1[4..sha1.len()]) | ||
| .into_os_string() | ||
| .into_string() | ||
| .unwrap() | ||
| } | ||
| } |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function parameter and variable are named sha1, but LFS uses SHA-256 hashing, not SHA-1. This naming is misleading. Consider renaming the parameter to hash, oid, or sha256 to accurately reflect what it represents.
|
|
||
| self.storage.put(&key, stream, meta).await | ||
| } | ||
|
|
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parameter _enable_split is prefixed with underscore to indicate it's unused, which is correct. However, the documentation for this method or trait should clarify that the split feature has been deprecated/removed, so future developers understand why this parameter is ignored. Consider adding a comment explaining that this parameter is kept for interface compatibility but is no longer used since the chunk/split feature was removed.
| // NOTE: `_enable_split` is intentionally ignored. The LFS chunk/split feature was | |
| // removed, but this parameter is kept for `LfsFileStorage` interface compatibility. | |
| // When/if the trait is revised to drop split support entirely, this argument can | |
| // be removed from the signature. |
| async fn create_test_state() -> (MonoApiServiceState, tempfile::TempDir) { | ||
| let temp_dir = tempdir().unwrap(); | ||
| let storage = test_storage(temp_dir.path()).await; | ||
|
|
||
| // Create a Redis connection for GitObjectCache | ||
| // Note: This requires Redis to be running at localhost:6379 | ||
| // Since tests are #[ignore], this is acceptable | ||
| let redis_config = common::config::RedisConfig { | ||
| url: "redis://127.0.0.1:6379".to_string(), | ||
| }; | ||
| let connection = init_connection(&redis_config).await; | ||
|
|
||
| let state = MonoApiServiceState { | ||
| storage: storage.clone(), | ||
| oauth_client: None, | ||
| session_store: None, | ||
| listen_addr: "http://localhost:8000".to_string(), | ||
| entity_store: saturn::entitystore::EntityStore::new(), | ||
| git_object_cache: Arc::new(ceres::api_service::cache::GitObjectCache { | ||
| connection, | ||
| prefix: "git-object-bincode".to_string(), | ||
| }), | ||
| }; | ||
|
|
||
| (state, temp_dir) | ||
| } |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test setup function has multiple .unwrap() calls (lines 28, 37) that can panic if setup fails. While this is acceptable for test code, consider using more descriptive error messages with .expect() to make test failures easier to debug. For example: .expect("Failed to create temporary directory for test") and .expect("Failed to initialize Redis connection").
| let res = file_storage.put_object(&meta.oid, body_bytes).await; | ||
| if res.is_err() { | ||
| if let Err(e) = res { | ||
| lfs_delete_meta(&db_storage, req_obj).await.unwrap(); |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The .unwrap() on line 279 can panic if deleting the metadata fails. Since this is error-path cleanup code, it would be better to log the error rather than panicking. Consider using if let Err(e) = lfs_delete_meta(...).await { tracing::error!("Failed to cleanup metadata: {}", e); } to handle this gracefully.
| lfs_delete_meta(&db_storage, req_obj).await.unwrap(); | |
| if let Err(delete_err) = lfs_delete_meta(&db_storage, req_obj).await { | |
| tracing::error!( | |
| "Failed to cleanup LFS metadata for oid {} after upload failure: {}", | |
| meta.oid, | |
| delete_err | |
| ); | |
| } |
| }; | ||
| // because return type must be `ReceiverStream`, so we need to wrap the bytes into a stream. | ||
| let (tx, rx) = tokio::sync::mpsc::channel(1); | ||
| tx.send(Ok(bytes)).await.unwrap(); |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The .unwrap() on line 312 can panic if sending through the channel fails. While this is unlikely since we just created the channel, it's better to handle this error case. Consider using .expect("Channel send should not fail for newly created channel") with a descriptive message or properly propagating the error.
| tx.send(Ok(bytes)).await.unwrap(); | |
| if let Err(e) = tx.send(Ok(bytes)).await { | |
| tracing::error!("Failed to send LFS object into channel: {}", e); | |
| return Err(GitLFSError::GeneralError( | |
| "Failed to prepare object stream".to_string(), | |
| )); | |
| } |
| use crate::storage::base_storage::{BaseStorage, StorageConnector}; | ||
|
|
||
| #[derive(Clone, Debug)] | ||
| /// Relationship between an original (ori) LFS object and its splitted sub object. |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spelling error: "splitted" should be "split". The past participle of "split" is "split", not "splitted".
| /// Relationship between an original (ori) LFS object and its splitted sub object. | |
| /// Relationship between an original (ori) LFS object and its split sub object. |
| let body_bytes: Vec<u8> = req | ||
| .into_body() | ||
| .into_data_stream() | ||
| .try_fold(Vec::new(), |mut acc, chunk| async move { | ||
| acc.extend_from_slice(&chunk); | ||
| Ok(acc) | ||
| }) | ||
| .await | ||
| .unwrap(); |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The .unwrap() call on line 411 can panic if the stream collection fails. This is critical since it's handling user data uploads. Consider using proper error handling and returning an appropriate error response to the client instead.
The error should be mapped through map_lfs_error to provide a proper HTTP status code and error message.
| .presigned(PresigningConfig::expires_in(expires_in).unwrap()) | ||
| .await | ||
| .map_err(|e| { | ||
| let detail = dump_error_chain(&e); | ||
| MegaError::Other(detail) | ||
| })?; | ||
| Ok(presigned_request.uri().to_string()) | ||
| } | ||
|
|
||
| /// Generate a presigned URL for uploading an object. | ||
| pub async fn put_presigned_url(&self, key: &ObjectKey) -> Result<String, MegaError> { | ||
| let expires_in = Duration::from_secs(3600); | ||
| let presigned_request = self | ||
| .client | ||
| .put_object() | ||
| .bucket(&self.bucket) | ||
| .key(s3_key(key)) | ||
| .presigned(PresigningConfig::expires_in(expires_in).unwrap()) |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The .unwrap() calls on lines 166 and 183 can panic if the PresigningConfig::expires_in() fails. While this is unlikely with a valid duration, it's better to handle the error explicitly for robustness. Consider using ? operator or map_err to convert this to a MegaError.
Migrate LFS API to New Object Storage Implementation
Summary
This PR migrates the Git LFS API from the legacy object storage implementation to the new unified object storage interface. The migration maintains backward compatibility with existing LFS objects while removing deprecated chunk upload logic.
Related Issue: #1752
Changes
Core Implementation
ObjectStorageAdapterthat wraps the newObjectStoragetraitobjects/{sha256[0..2]}/{sha256[2..4]}/{sha256[4..]}preserved for compatibilityforce_path_style(true)for custom endpoints (MinIO, LocalStack, etc.)Files Modified
New Files:
jupiter/src/lfs_storage/object_storage_adapter.rs- Adapter bridging LFS interface to new ObjectStoragemono/tests/lfs_api_tests.rs- Integration tests for LFS APIdocs/lfs-test-results.md- Comprehensive test results documentationdocs/lfs-testing-guide.md- Testing guide for LFS functionalityModified Files:
jupiter/src/lfs_storage/mod.rs- Updated to use ObjectStorageAdapterjupiter/src/object_storage/rustfs_object_storage.rs- Added path-style addressing supportjupiter/src/object_storage/fs_object_storage.rs- Updated LFS path transformationceres/src/lfs/handler.rs- Updated to use new storage interface, added testsmono/src/api/router/lfs_router.rs- Updated error handling and testsRemoved Features
Testing
Test Results
All tests passed successfully:
#[ignore]for CI)Test Coverage
Core Functionality:
Compatibility:
Error Handling:
Test Commands
Compatibility
Backward Compatibility
/api/v1/lfsand/info/lfspaths work as beforegit lfs push/pullworkflows work correctlyVerification
Manual Testing Performed
git lfs pushgit lfs pullafter clearing cacheTest Environment
localhost:8000localhost:9000)megagit-lfs/3.7.1Known Limitations
git config lfs.http://.../info/lfs.locksverify falseDocumentation
docs/lfs-test-results.mddocs/lfs-testing-guide.mdChecklist
-s -SAdditional Notes
This PR completes the migration to the new object storage interface for LFS operations. The implementation provides: