Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
sayakpaul left a comment:
Should we add a test as well? No strong opinions, of course.
```python
if file_count > 200:
    print(
        f"⚠️ Found {file_count} files to upload, which exceeds the 200 file limit for a single commit. Deleting old build files and re-uploading the whole build folder to avoid hitting file limits."
    )
    kernel_root_dir = build_dir.parent
    api.upload_large_folder(
        repo_id=repo_id,
        folder_path=kernel_root_dir,
        revision=branch,
        repo_type="model",
        allow_patterns=["build/torch*"],
    )
```
We should not just remove old build files, since doing so can break our version contract. Also, if we automatically switch to a different upload type, the behavior should be exactly the same as the normal path.
Is it possible to add `delete_patterns` to `upload_large_folder`?
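If `upload_large_folder` does not grow a `delete_patterns` argument, one way to approximate the same behavior is to resolve the patterns against the remote file listing first and delete the matches in a separate step before the large upload. A minimal sketch of the matching part (the helper name and the example paths are hypothetical; the actual deletion would be a separate commit, e.g. via `create_commit` with delete operations):

```python
from fnmatch import fnmatch


def files_to_delete(remote_files, delete_patterns):
    """Return remote paths matching any of the given glob patterns.

    Hypothetical helper: since `upload_large_folder` has no
    `delete_patterns` parameter, the returned paths would have to be
    deleted in an explicit commit before the large upload runs.
    """
    return [
        path
        for path in remote_files
        if any(fnmatch(path, pattern) for pattern in delete_patterns)
    ]


# Example: stale build files from a previous torch version.
remote = [
    "build/torch25-cxx11/kernel.so",
    "build/torch26-cxx11/kernel.so",
    "README.md",
]
print(files_to_delete(remote, ["build/torch25*"]))
# ['build/torch25-cxx11/kernel.so']
```

This keeps the deletion semantics explicit, at the cost of the delete and the upload landing in two commits rather than one.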
```python
    if p.is_file() and p.relative_to(build_dir).as_posix().startswith("torch")
)
```

```python
if file_count > 200:
```
Docs:

> When dealing with a large folder (thousands of files or hundreds of GB), we recommend using `upload_large_folder()` instead.
```python
delete_patterns=list(delete_patterns),
commit_message="Build uploaded using `kernels`.",
allow_patterns=["torch*"],
```

```python
file_count = sum(
```
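Pulled out as its own function, the counting logic could look like the following sketch (the function name is hypothetical; `build_dir` is the build output directory from the PR):

```python
import tempfile
from pathlib import Path


def count_torch_build_files(build_dir: Path) -> int:
    """Count files under build_dir whose relative path starts with 'torch'.

    Sketch of the counting expression from the diff, extracted into a
    separate function as suggested in the review.
    """
    return sum(
        1
        for p in build_dir.rglob("*")
        if p.is_file() and p.relative_to(build_dir).as_posix().startswith("torch")
    )


# Quick demo with a throwaway directory layout.
with tempfile.TemporaryDirectory() as tmp:
    build_dir = Path(tmp)
    (build_dir / "torch26-cxx11").mkdir()
    (build_dir / "torch26-cxx11" / "kernel.so").write_text("")
    (build_dir / "other.txt").write_text("")
    print(count_torch_build_files(build_dir))  # 1
```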
I think this should be a separate function.
This PR adds a path to use the `upload_large_folder` API when there are more than 200 files in the build output. This helps avoid timeouts when many files are in the build. Otherwise the normal `upload_folder` is preferred, since it has a bit more flexibility around `delete_patterns` and the `commit_message`.
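The dispatch described here can be sketched as a small pure function (a minimal illustration assuming the 200-file threshold from the diff; the function name is hypothetical and the actual `HfApi` calls are left out so the snippet stays self-contained):

```python
FILE_LIMIT = 200  # single-commit file limit mentioned in the PR


def choose_upload_api(file_count: int, limit: int = FILE_LIMIT) -> str:
    """Pick the upload path for a build.

    `upload_folder` keeps `delete_patterns` and a custom `commit_message`;
    `upload_large_folder` avoids timeouts when the build has many files.
    """
    return "upload_large_folder" if file_count > limit else "upload_folder"


print(choose_upload_api(42))    # upload_folder
print(choose_upload_api(1500))  # upload_large_folder
```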