Fix build hangs: Add timeouts, heartbeat, and parallel bash patches #109
…ility.
* **Global (`scripts/build_engine/defaults.sh`):** Added robust `CURL_OPTIONS` with timeouts (`--connect-timeout 10`, `--max-time 600`) and retries to `termux_step_get_source`. This prevents silent hangs in all packages (including `termux-elf-cleaner`).
* **Bash (`packages/b/bash/build.sh`):**
  * Optimized patch download to run in parallel using `xargs -P 10`.
  * Added explicit timeouts to `curl` for patch downloads.
  * Ensured sequential application of patches after parallel download.
* **Orchestrator (`scripts/repoexe/builder.py`):**
  * Replaced blocking `subprocess.run` with `subprocess.Popen` loop.
  * Added a heartbeat log message every 30 seconds to stdout to indicate the build is still active, solving the "it wasn't doing anything" UX issue.
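
The orchestrator change described above can be sketched roughly as follows. This is a minimal illustration, not the actual `builder.py` code: the function name `run_with_heartbeat` and its parameters are hypothetical, and only the 30-second interval and the use of `subprocess.Popen` come from the description.

```python
import subprocess
import time


def run_with_heartbeat(cmd, heartbeat_interval=30.0, poll_interval=1.0):
    """Run cmd and print a heartbeat line while it is still running.

    Hypothetical helper: name and parameters are assumptions for illustration.
    Returns the child's exit code.
    """
    process = subprocess.Popen(cmd)
    started = time.monotonic()
    last_beat = started
    while True:
        try:
            # Wait briefly for the child; returns the exit code once it is done.
            return process.wait(timeout=poll_interval)
        except subprocess.TimeoutExpired:
            pass  # still running; fall through to the heartbeat check
        now = time.monotonic()
        if now - last_beat >= heartbeat_interval:
            elapsed = int(now - started)
            print(f"Still building... ({elapsed}s elapsed)", flush=True)
            last_beat = now
```

Waiting with a short timeout instead of sleeping means the loop returns promptly when the build finishes, while the heartbeat is only printed once per interval.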
* **Global (`scripts/build_engine/defaults.sh`):** Added `CURL_OPTIONS` with timeouts (`--connect-timeout 10`, `--max-time 600`) to source downloads to prevent hangs.
* **Bash (`packages/b/bash/build.sh`):** Parallelized patch download using `xargs -P 10` with explicit timeouts.
* **Orchestrator (`scripts/repoexe/builder.py`):**
  * Switched to `subprocess.Popen` with pipe streaming to print build output to console in real-time.
  * Implemented a 60-minute hard timeout per build to prevent infinite hangs.
  * Used `select` to safely handle non-blocking reads from subprocess.

* **Runtime Tests (`scripts/utils/run_docker_test.sh`):**
  * Fixed infinite recursion bug where `exec ... dash` re-executed the same script indefinitely if `SHELL` was already `dash`.
  * Added `DEBIAN_FRONTEND=noninteractive` to prevent interactive prompts from hanging the test inside Docker.
* **Builder (`scripts/repoexe/builder.py`):**
  * Enhanced timeout logic to prevent busy loops and ensure proper process termination.
  * Used `select` for non-blocking I/O to ensure build logs are streamed to stdout in real-time, aiding debugging.
  * Enforced a strict 60-minute timeout for builds.
* **Global Safeguards:**
  * Ensured `curl` timeouts are applied in `defaults.sh` and `bash/build.sh`.

* **Runtime Tests (`scripts/utils/run_docker_test.sh`):**
  * Added `--user 0:0` to `docker run` to force root execution. This resolves "Permission denied" errors during `dpkg` installation when downgrading packages like `bash`, where file ownership conflicts occur with the default container user.
  * Guarded the `exec ... dash` logic to prevent infinite recursion if the current shell is already `dash`.
  * Set `DEBIAN_FRONTEND=noninteractive` to prevent interactive prompts from hanging the test.
* **Builder (`scripts/repoexe/builder.py`):**
  * Refined timeout loop to correctly handle EOF and ensure `process.wait()` is called, preventing zombies and busy loops.
  * Utilized `select` for robust non-blocking output streaming to console.
  * Enforced a 60-minute build timeout.
* **Global Safeguards:**
  * Applied robust `CURL_OPTIONS` (timeouts/retries) in `defaults.sh` and `bash/build.sh` to prevent network hangs.

* **Runtime Tests (`scripts/utils/run_docker_test.sh`):**
  * Overrode Docker entrypoint to `/bin/sh` to bypass any default user configuration in the image.
  * Ensured root execution (`--user 0:0`) to allow `dpkg` to overwrite system files during package installation/downgrade.
* **Bash Package (`packages/b/bash/build.sh`):**
  * Updated `TERMUX_PKG_VERSION` to `5.2.37` to match the patch level. This ensures the package is treated as an upgrade or equal version rather than a downgrade (the image ships 5.2.15), although running as root remains the primary fix for the permission errors.
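
The builder behaviour listed in the updates above (streaming output via `select`, a hard 60-minute timeout, EOF handling, and a final `process.wait()`) could look roughly like the sketch below. It is an assumption-laden illustration rather than the code in `builder.py`: the function name `run_streaming`, the 1-second poll, and the 4096-byte read size are hypothetical, and `select` on pipes assumes a POSIX host such as the Linux build environment.

```python
import codecs
import os
import select
import subprocess
import sys
import time


def run_streaming(cmd, timeout_seconds=60 * 60):
    """Stream merged stdout/stderr of cmd and enforce a hard timeout.

    Hypothetical helper. Returns the exit code, or None if the build
    exceeded timeout_seconds and was killed.
    """
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    fd = process.stdout.fileno()
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    deadline = time.monotonic() + timeout_seconds
    timed_out = False

    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            timed_out = True
            break
        # Block for at most 1s so the deadline is rechecked without busy-looping.
        ready, _, _ = select.select([fd], [], [], min(1.0, remaining))
        if not ready:
            continue
        chunk = os.read(fd, 4096)
        if not chunk:  # EOF: the child closed its output
            break
        sys.stdout.write(decoder.decode(chunk))
        sys.stdout.flush()

    if not timed_out:
        # Output is closed, but the child may still be exiting; keep the deadline.
        try:
            process.wait(timeout=max(0.0, deadline - time.monotonic()))
        except subprocess.TimeoutExpired:
            timed_out = True
    if timed_out:
        process.kill()
    process.stdout.close()
    process.wait()  # always reap the child so no zombie is left behind
    return None if timed_out else process.returncode
```

The `select` timeout is what keeps the loop from spinning: each iteration either reads available output or sleeps inside `select` for up to a second before rechecking the deadline.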
This PR addresses the issue where `bash` and `termux-elf-cleaner` builds would hang indefinitely.

Changes:

* `defaults.sh` now enforces a 10-minute max time and 10-second connection timeout for all source downloads. This safeguards against zombie connections.
* `builder.py` now prints a "Still building..." status message every 30 seconds, providing feedback during long compilation steps.

These changes ensure that network issues fail fast rather than hanging, and provide the user with confidence that the build process is proceeding.
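
For illustration only, the download limits described above map onto `curl` flags as in the following sketch. The real settings live in shell (`CURL_OPTIONS` in `defaults.sh`); this Python helper, its name `curl_download_command`, the retry count of 3, and the `--fail`/`--location` flags are assumptions, while `--connect-timeout 10` and `--max-time 600` come from the PR description.

```python
def curl_download_command(url, dest, connect_timeout=10, max_time=600, retries=3):
    """Build a curl argv list with the timeouts described in the PR.

    Hypothetical helper: the retry count and the --fail/--location flags
    are assumptions, not values confirmed by the PR.
    """
    return [
        "curl",
        "--fail",          # treat HTTP errors as failures (assumed)
        "--location",      # follow redirects (assumed)
        "--connect-timeout", str(connect_timeout),  # 10 s to establish a connection
        "--max-time", str(max_time),                # 600 s (10 min) for the whole transfer
        "--retry", str(retries),                    # retry transient failures (count assumed)
        "--output", dest,
        url,
    ]
```

With these limits a stalled transfer ends with a non-zero exit status after at most ten minutes per attempt, so the build fails visibly instead of hanging.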
PR created automatically by Jules for task 5215838881747980311 started by @SjnExe