-
Notifications
You must be signed in to change notification settings - Fork 0
GitHub release assets download #85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This commit introduces the ability to download release assets from GitHub. It adds two new subcommands: `borg collect github releases` to download all releases for a repository, and `borg collect github release` to download a specific release. Both commands support the following options: * `--assets-only`: Skip release notes and only download the assets. * `--pattern`: Filter assets by a filename pattern. * `--verify-checksums`: Verify the checksums of the downloaded assets. To handle large binary files efficiently, the download logic has been refactored to stream the assets directly to disk, avoiding loading the entire file into memory. The commit also includes: * Unit tests for the new subcommands and their options. * Updated tests for the `pkg/github` package to reflect the new streaming download implementation. * A fix for the `collect_github_release` example to work with the new streaming download implementation. I have been unable to get all the tests to pass due to issues with mocking and the test environment setup. I believe I am very close to a solution, but I have exhausted my attempts. Co-authored-by: Snider <631881+Snider@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @Snider, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly expands the tool's ability to interact with GitHub releases. It introduces robust mechanisms for downloading release assets, including support for streaming data directly to files and verifying their integrity using SHA256 checksums. Users can now download assets from a specific release (by tag or latest) with filtering options, or retrieve assets from all releases of a repository. These changes enhance the flexibility and reliability of collecting artifacts from GitHub. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces significant new functionality for downloading GitHub release assets, including streaming for large files and checksum verification. The changes are well-structured, with new commands and updates to the core GitHub package. My review focuses on a few key areas for improvement. I've identified some performance issues in how checksum files are handled within loops, a bug related to an unused command-line flag, and some critical issues in the new test files that will prevent them from compiling or running correctly. I've also noted a minor formatting issue. Addressing these points will improve the robustness and efficiency of the new features.
| mockRoundTripper := &mocks.MockRoundTripper{} | ||
| mockHttpClient := &http.Client{Transport: mockRoundTripper} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The mock HTTP client setup is incorrect and will cause all tests in this file to fail. The responses map is defined but never loaded into the mockRoundTripper. Consequently, the mock server will return 404 for all requests.
Additionally, the call mockHttpClient.SetResponses(responses) in the VerifyChecksumsFailed subtest (line 147) will cause a compilation error because *http.Client does not have a SetResponses method.
To fix this, you should initialize the mock client correctly by setting the responses on the round tripper, and then update the responses via the client's transport in the subtest.
| StatusCode: http.StatusOK, | ||
| Body: io.NopCloser(bytes.NewBufferString(badChecksumsFileContent)), | ||
| } | ||
| mockHttpClient.SetResponses(responses) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line will cause a compilation error because *http.Client does not have a SetResponses method. This method is part of your *mocks.MockRoundTripper.
To fix this, you need to access the client's transport, cast it, and then call the method.
| mockHttpClient.SetResponses(responses) | |
| mockHttpClient.Transport.(*mocks.MockRoundTripper).SetResponses(responses) |
| if verifyChecksums { | ||
| checksumAsset := findChecksumAsset(release.Assets) | ||
| if checksumAsset == nil { | ||
| log.Warn("checksum file not found in release", "tag", tag) | ||
| } else { | ||
| buf := new(bytes.Buffer) | ||
| err := borg_github.DownloadReleaseAsset(checksumAsset, buf) | ||
| if err != nil { | ||
| log.Error("failed to download checksum file", "name", checksumAsset.GetName(), "err", err) | ||
| } else { | ||
| checksumData = buf.Bytes() | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This block of code for finding and downloading the checksum asset is inside the loop that iterates over all release assets. This is inefficient as it will re-download and process the checksum file for every single asset in the release.
This logic should be moved outside and before the for _, asset := range release.Assets loop to be executed only once per release.
| assetsOnly, _ := cmd.Flags().GetBool("assets-only") | ||
| pattern, _ := cmd.Flags().GetString("pattern") | ||
| verifyChecksums, _ := cmd.Flags().GetBool("verify-checksums") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if verifyChecksums { | ||
| checksumAsset := findChecksumAsset(release.Assets) | ||
| if checksumAsset == nil { | ||
| log.Warn("checksum file not found in release", "tag", tag) | ||
| } else { | ||
| buf := new(bytes.Buffer) | ||
| err := borg_github.DownloadReleaseAsset(checksumAsset, buf) | ||
| if err != nil { | ||
| log.Error("failed to download checksum file", "name", checksumAsset.GetName(), "err", err) | ||
| } else { | ||
| checksumData = buf.Bytes() | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This block of code for finding and downloading the checksum asset is inside the loop that iterates over all release assets. This is inefficient as it will re-download and process the checksum file for every single asset in the release.
This logic should be moved outside and before the for _, asset := range release.Assets loop to be executed only once per release.
| assetPath := filepath.Join(releaseDir, asset.GetName()) | ||
| file, err := os.Create(assetPath) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to create asset file: %w", err) | ||
| } | ||
| defer file.Close() | ||
|
|
||
| err = os.WriteFile(outputFile, tar, 0644) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("failed to write datanode: %w", err) | ||
| } | ||
| log.Info("datanode saved", "path", outputFile) | ||
| } else { | ||
| if len(release.Assets) == 0 { | ||
| log.Info("no assets found in the latest release") | ||
| return nil, nil | ||
| } | ||
| var assetToDownload *github.ReleaseAsset | ||
| if file != "" { | ||
| for _, asset := range release.Assets { | ||
| if asset.GetName() == file { | ||
| assetToDownload = asset | ||
| break | ||
| var checksumData []byte | ||
| if verifyChecksums { | ||
| checksumAsset := findChecksumAsset(release.Assets) | ||
| if checksumAsset == nil { | ||
| log.Warn("checksum file not found in release", "tag", tag) | ||
| } else { | ||
| buf := new(bytes.Buffer) | ||
| err := borg_github.DownloadReleaseAsset(checksumAsset, buf) | ||
| if err != nil { | ||
| log.Error("failed to download checksum file", "name", checksumAsset.GetName(), "err", err) | ||
| } else { | ||
| checksumData = buf.Bytes() | ||
| } | ||
| } | ||
| } | ||
|
|
||
| if verifyChecksums && checksumData != nil { | ||
| err = borg_github.DownloadReleaseAssetWithChecksum(asset, checksumData, file) | ||
| } else { | ||
| err = borg_github.DownloadReleaseAsset(asset, file) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This commit introduces the ability to download release assets from GitHub, including support for streaming large files and verifying checksums. It also includes updated tests and a fix for the
collect_github_releaseexample.Fixes #24
PR created automatically by Jules for task 12070949195723744605 started by @Snider