Conversation
This change introduces a new failure handling system for collection tasks.

- Created a new package `pkg/failures` to manage failure reporting, including a `Manager` to handle the lifecycle of a failure report, and `Failure` and `FailureReport` structs for storing failure data. The manager creates a `.borg-failures/<timestamp>` directory for each run, containing a `failures.json` report and a `retry.sh` script.
- Added a `borg failures` command with `show` and `clear` subcommands to manage failure reports.
- Added a `borg retry` command to retry failed collections.
- Added `--on-failure` and `--failures-dir` flags to the `collect` command.
- Refactored the `collect github repo` command to make the single-repository cloning logic reusable.
- Updated the `collect github repos` command to use the reusable cloning function and implement failure handling, including the `--on-failure=stop` and `--on-failure=prompt` options.
- Implemented failure categorization to distinguish between retryable and permanent failures.
- Implemented tracking of the number of attempts for each failed item.
- Created a placeholder file for a missing asset to fix the build.

Co-authored-by: Snider <631881+Snider@users.noreply.github.com>
Summary of Changes

Hello @Snider, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the application's resilience and error handling by introducing a robust failure reporting and dead letter queue system. It gives users greater control over how collection failures are managed, offering options to continue, stop, or interactively prompt for retries.
Code Review
This pull request introduces a valuable failure reporting and dead letter queue feature, along with new `failures` and `retry` commands. The refactoring of the repository collection logic is a good step towards better code structure.
My review has identified a critical bug where collecting multiple repositories results in files being overwritten. I've also provided several suggestions to improve the robustness and maintainability of the new feature, such as using more reliable error handling, improving user input prompts, and making the failure reporting and retry mechanisms more generic to support future collection types. Additionally, I've noted an inconsistency where failure reporting is not applied to the single repository collection command.
```go
if outputFile == "" {
	outputFile = "repo." + format
	if compression != "none" {
		outputFile += "." + compression
	}
}
```
The current implementation for generating a default output filename will cause issues when collecting multiple repositories, as each collected repository will overwrite the previous one. The filename `repo.<format>` is not unique; the default filename should be derived from the repository URL to ensure uniqueness.
Note: This change requires importing the `path` and `strings` packages.
```go
if outputFile == "" {
	// Generate a unique filename from the repo URL to avoid overwriting files.
	// e.g., https://github.com/owner/name.git -> name.<format>
	base := path.Base(repoURL)
	repoName := strings.TrimSuffix(base, path.Ext(base))
	outputFile = repoName + "." + format
	if compression != "none" {
		outputFile += "." + compression
	}
}
```
```diff
-	fmt.Fprintln(cmd.OutOrStdout(), "Repository saved to", outputFile)
-	return nil
+	return collectRepo(repoURL, outputFile, format, compression, password, cmd)
```
For consistency across the tool, the `collect github repo` command should also implement failure reporting, similar to how it's done for `collect github repos`. Currently, if this command fails, no failure report is generated, which might be unexpected for users given the new `--on-failure` and `--failures-dir` flags on the parent `collect` command.
```go
fmt.Fprintln(cmd.OutOrStdout(), "Collecting", repo)
err := collectRepo(repo, "", "datanode", "none", "", cmd)
if err != nil {
	retryable := !strings.Contains(err.Error(), "not found")
```
Determining if an error is retryable by checking for the substring "not found" in the error message is brittle and not reliable. A better approach would be to use typed errors. The function that performs the cloning (`GitCloner.CloneGitRepository`) could return a specific error type (e.g., `vcs.ErrRepositoryNotFound`) when a repository is not found. This would allow for a more robust check using `errors.Is`.
```go
var response string
fmt.Scanln(&response)
```
Using `fmt.Scanln` for user input can be fragile. It doesn't handle errors and can leave unread characters in the input buffer if the user enters more than one word, which might affect subsequent operations. For more robust interactive prompts, consider using `bufio.NewScanner(os.Stdin)` to read a full line of input and then process it.
Note: This change requires importing the `bufio` and `os` packages.
```go
var response string
scanner := bufio.NewScanner(os.Stdin)
if scanner.Scan() {
	response = scanner.Text()
}
```

```go
if failuresDir == "" {
	failuresDir = ".borg-failures"
}
```
The default value for the failures directory, `.borg-failures`, is hardcoded here and in several other places (`cmd/collect.go`, `pkg/failures/manager.go`). This makes it difficult to update the default value consistently. It would be better to define this as a public constant in the `pkg/failures` package and reuse it across the application.
```diff
 if failuresDir == "" {
-	failuresDir = ".borg-failures"
+	failuresDir = failures.DefaultFailuresDir // Assuming DefaultFailuresDir is the new constant
 }
```
```go
fmt.Printf("Retrying %s...\n", failure.URL)
retryCmd := exec.Command("borg", "collect", "github", "repo", failure.URL)
```
The retry command is hardcoded to execute `borg collect github repo`. This limits its utility to only one type of collection. To make this feature more scalable, the command for retrying should be constructed dynamically. The `FailureReport` could be extended to include the necessary information to reconstruct the original failed command.
```go
for _, failure := range m.report.Failures {
	retryScript.WriteString(fmt.Sprintf("borg collect github repo %s\n", failure.URL))
}
```
The generated `retry.sh` script hardcodes the `borg collect github repo` command. This couples the failure manager to a specific collection command and limits the utility of the generated script. The script should be generated based on the actual command that failed, making the failure reporting mechanism more generic and scalable.
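A generic script generator might look like this sketch, which assumes each failure carries its original argv (the same hypothetical extension discussed above) and quotes arguments for POSIX shells:

```go
package main

import (
	"fmt"
	"strings"
)

// quote is a minimal POSIX single-quote escaper so URLs or paths with
// special characters survive being pasted into a shell script.
func quote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

func main() {
	// Each entry is the stored argv of a failed invocation (an assumed
	// extension of FailureReport), so the script is not tied to one
	// subcommand.
	failures := [][]string{
		{"collect", "github", "repo", "https://github.com/owner/name.git"},
	}
	var retryScript strings.Builder
	retryScript.WriteString("#!/bin/sh\n")
	for _, argv := range failures {
		quoted := make([]string, len(argv))
		for i, a := range argv {
			quoted[i] = quote(a)
		}
		retryScript.WriteString("borg " + strings.Join(quoted, " ") + "\n")
	}
	fmt.Print(retryScript.String())
}
```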
This submission implements a new failure reporting and dead letter queue feature. It includes a new `failures` package, `failures` and `retry` commands, and integration with the `collect` commands. It also includes refactoring of the single-repository cloning logic and implementation of failure categorization and attempt tracking.

Fixes #55
PR created automatically by Jules for task 8552583492686157106 started by @Snider