Conversation
Pull request overview
This PR aims to make ATOM server logs more visible in CI by reducing Python output buffering and streaming the server log during the accuracy and benchmark runs.
Changes:
- Launch server with `PYTHONUNBUFFERED=1` to improve real-time log flushing.
- Stream `/tmp/atom_server.log` in the background during `accuracy` and `benchmark` modes, with cleanup on exit.
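The `PYTHONUNBUFFERED=1` change matters because Python block-buffers stdout when it is redirected to a file, so log lines may not reach the file until the buffer fills or the process exits. A minimal sketch of the difference (the temp file and tiny inline script are placeholders, not the actual CI paths or server command):

```shell
log=$(mktemp)

# Buffered: "starting" sits in Python's stdout buffer while the process sleeps.
python3 -c 'import time; print("starting"); time.sleep(2)' > "$log" 2>&1 &
sleep 1
buffered_lines=$(wc -l < "$log")
wait

# Unbuffered: the line is flushed to the log file immediately.
: > "$log"
PYTHONUNBUFFERED=1 python3 -c 'import time; print("starting"); time.sleep(2)' > "$log" 2>&1 &
sleep 1
unbuffered_lines=$(wc -l < "$log")
wait

rm -f "$log"
echo "after 1s: buffered=$buffered_lines line(s), unbuffered=$unbuffered_lines line(s)"
```

With buffering, the log is still empty one second in; with `PYTHONUNBUFFERED=1` the first line is already on disk, which is what makes the streamed log useful during startup.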
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.
- Replace `tee` with file redirect + `tail -f` in launch to fix PID capture and eliminate SIGPIPE risk when the `docker exec` session ends
- Add `tail -f` in the launch section for real-time server log streaming
- Restore `$RTL_CMD` that was accidentally dropped
- Add `kill $tail_pid` after the benchmark completes
- Add a 'Dump server log' step (`if: always()`) to both accuracy and benchmark workflows as a safety net for CI timeout scenarios
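The tee-vs-redirect PID issue above can be sketched as follows (a `sleep` stands in for the real server command; with `cmd | tee log &`, `$!` is the PID of the last process in the pipeline, i.e. `tee`, not the server, and killing the reader end of the pipe can SIGPIPE the writer):

```shell
log=$(mktemp)

# Stand-in for the server launch: a plain redirect means $! below is
# the PID of the long-running command itself, not of a pipeline.
sleep 5 > "$log" 2>&1 &
server_pid=$!

# Separate reader: no pipe is attached to the server (no SIGPIPE risk),
# and --pid makes GNU tail exit on its own once the server is gone.
tail -f --pid="$server_pid" "$log" &
tail_pid=$!

kill -0 "$server_pid" && server_alive=yes   # $! really is the server

kill "$server_pid" 2>/dev/null || true
wait "$server_pid" 2>/dev/null || true
wait "$tail_pid" 2>/dev/null || true        # tail exits via --pid
rm -f "$log"
```

Note that `--pid` is a GNU coreutils `tail` option; BSD/macOS `tail` does not support it, but the CI runners here are Linux.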
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
```shell
PYTHONUNBUFFERED=1 $RTL_CMD python -m atom.entrypoints.openai_server --model "$MODEL_PATH" $PROFILER_ARGS "${EXTRA_ARGS[@]}" > "$ATOM_SERVER_LOG" 2>&1 &
atom_server_pid=$!
tail -f "$ATOM_SERVER_LOG" &
```
tail -f "$ATOM_SERVER_LOG" & in the launch path can produce unbounded CI output during the potentially long health/warmup loops (up to ~35 minutes) and isn't cleaned up. Consider removing the background tail from launch, or limiting it (e.g., tail -n <N> -f), and ensure the tail process is terminated (trap/--pid=$atom_server_pid) so it can’t keep running if launch exits early.
Suggested change:

```diff
-PYTHONUNBUFFERED=1 $RTL_CMD python -m atom.entrypoints.openai_server --model "$MODEL_PATH" $PROFILER_ARGS "${EXTRA_ARGS[@]}" > "$ATOM_SERVER_LOG" 2>&1 &
-atom_server_pid=$!
-tail -f "$ATOM_SERVER_LOG" &
+tail_pid=""
+cleanup_atom_server_tail() {
+  if [ -n "${tail_pid:-}" ] && kill -0 "$tail_pid" 2>/dev/null; then
+    kill "$tail_pid" 2>/dev/null || true
+    wait "$tail_pid" 2>/dev/null || true
+  fi
+}
+trap cleanup_atom_server_tail EXIT
+PYTHONUNBUFFERED=1 $RTL_CMD python -m atom.entrypoints.openai_server --model "$MODEL_PATH" $PROFILER_ARGS "${EXTRA_ARGS[@]}" > "$ATOM_SERVER_LOG" 2>&1 &
+atom_server_pid=$!
+tail -n 200 -f --pid="$atom_server_pid" "$ATOM_SERVER_LOG" &
+tail_pid=$!
```
```yaml
- name: Dump server log
  if: always()
  run: |
    docker exec "$CONTAINER_NAME" cat /tmp/atom_server.log 2>/dev/null || true
```
This step prints the entire /tmp/atom_server.log file, which can get very large and overwhelm GitHub Actions logs. Consider printing only the last N lines (and/or only on failure) to keep logs readable and avoid hitting log size limits.
Suggested change:

```diff
-docker exec "$CONTAINER_NAME" cat /tmp/atom_server.log 2>/dev/null || true
+docker exec "$CONTAINER_NAME" tail -n 200 /tmp/atom_server.log 2>/dev/null || true
```
```yaml
- name: Dump server log
  if: always()
  run: |
    docker exec atom-benchmark cat /tmp/atom_server.log 2>/dev/null || true
```
This step cats the full /tmp/atom_server.log, which may be very large and could exceed GitHub Actions log limits. Consider printing a bounded tail (e.g., last N lines) and/or gating it to failure-only to reduce noise.
Suggested change:

```diff
-docker exec atom-benchmark cat /tmp/atom_server.log 2>/dev/null || true
+echo "Last 200 lines of /tmp/atom_server.log:"
+docker exec atom-benchmark tail -n 200 /tmp/atom_server.log 2>/dev/null || true
```
- Server log: `/tmp/atom_server.log` (direct file write, no pipe)
- Client log: `/tmp/atom_client.log` (lm_eval/benchmark output via `tee`)
- Launch: `tail -f` for startup visibility with trap+kill cleanup on all exit paths (crash/timeout/warmup failure/success)
- Accuracy/benchmark: no `tail -f`, only client output visible in step
- Add "Dump server log" and "Dump client log" steps (`if: always()`) to both workflows for post-hoc debugging
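A minimal sketch of that two-log layout (the `sleep` and `echo` are placeholders for the real server and lm_eval/benchmark commands, and the temp files stand in for the `/tmp/atom_*.log` paths):

```shell
server_log=$(mktemp)
client_log=$(mktemp)

# Server: direct file write, no pipe between the server and its log.
sleep 2 > "$server_log" 2>&1 &
server_pid=$!

# Client: tee keeps output visible in the CI step while also
# persisting it for the "Dump client log" step.
echo "benchmark output" | tee "$client_log"

kill "$server_pid" 2>/dev/null || true
wait "$server_pid" 2>/dev/null || true
```

The asymmetry is deliberate: the client tools are short-lived foreground processes, so piping them through `tee` is safe, while the long-lived server avoids the pipe entirely.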
Motivation
Technical Details
Test Plan
Test Result
Submission Checklist