Fix /pdf processing flow: use parsed arxiv_id, restore sync processing, and return paper_id by ponyfly6 · Pull Request #1 · ponyfly6/coding-agent

ponyfly6 · 2026-02-12T03:32:58Z

Tests were failing because _handle_pdf_command sometimes returned None and used args[1] directly for the arXiv id, which mis-parsed arguments when flags were present.
The CLI /pdf flow should update the database immediately and return a paper_id, so that synchronous CLI usage behaves predictably.
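The args[1] mis-parse described above can be illustrated with a minimal sketch. The argument layout (a file path, an optional flag, then the arXiv id), the `--force` flag, and the `parse_arxiv_id` helper are all hypothetical and are not taken from the repository; they only show why a fixed index breaks once a flag is present.

```python
def parse_arxiv_id(args):
    """Return the second non-flag argument, or None if it is absent.

    Hypothetical helper: assumes the first positional is a file path and
    the second positional, if present, is the arXiv id.
    """
    positionals = [a for a in args if not a.startswith("-")]
    return positionals[1] if len(positionals) > 1 else None


# Hypothetical invocation: a flag sits between the path and the id.
args = ["paper.pdf", "--force", "2401.01234"]

naive_id = args[1]                 # picks up the flag, not the id
parsed_id = parse_arxiv_id(args)   # skips flags before indexing

print(naive_id)   # --force
print(parsed_id)  # 2401.01234
```

Reusing the value that the parser already produced, rather than indexing into args again, keeps a single source of truth for the id.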

Use the previously parsed arxiv_id variable (arxiv_id_arg = arxiv_id) instead of reading args[1] directly to avoid mis-parsing flags and positional arguments.
Restore synchronous PDF processing as the default whenever pdf_processing_method is not GeminiAsync. The synchronous path calls tools.extract_text_from_pdf_gemini, saves the blob via tools.save_text_blob, prepares self.pending_pdf_context, updates processed_timestamp and status, and returns the created paper_id.
Keep the async ingestion path available behind pdf_processing_method = "GeminiAsync" and ensure the function returns the paper_id in the async path as well.
Add explicit DB status updates for error cases (e.g., error_extraction, error_blob, error_processing, error_file_not_found) and improve logging/error handling around synchronous processing.
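The control flow in the changes above might look roughly like the following sketch. It is not the code from this PR: the in-memory `db` dict, the `process_pdf` function, the injected `extract_text` and `save_blob` callables (stand-ins for tools.extract_text_from_pdf_gemini and tools.save_text_blob, whose real signatures are not shown here), and the `pending`, `queued`, and `processed` status names are all assumptions. Only the error status names and the "GeminiAsync" value come from the description. Returning the paper_id on error paths is also a choice of this sketch.

```python
import os
import tempfile
from datetime import datetime, timezone


def process_pdf(path, method, db, extract_text, save_blob):
    """Record a paper, process it, and return its paper_id on every path."""
    paper_id = len(db) + 1
    db[paper_id] = {"status": "pending", "processed_timestamp": None}

    if method == "GeminiAsync":
        # Async path: leave the work queued, but still hand back the id.
        db[paper_id]["status"] = "queued"
        return paper_id

    # Synchronous path (the default for any other method value).
    if not os.path.exists(path):
        db[paper_id]["status"] = "error_file_not_found"
        return paper_id

    try:
        text = extract_text(path)
    except Exception:
        db[paper_id]["status"] = "error_extraction"
        return paper_id

    try:
        save_blob(paper_id, text)
    except Exception:
        db[paper_id]["status"] = "error_blob"
        return paper_id

    db[paper_id]["status"] = "processed"
    db[paper_id]["processed_timestamp"] = datetime.now(timezone.utc).isoformat()
    return paper_id


# Usage with stub callables and a temporary file standing in for a PDF.
db = {}
with tempfile.NamedTemporaryFile(suffix=".pdf") as f:
    ok_id = process_pdf(f.name, "Gemini", db, lambda p: "text", lambda i, t: None)
missing_id = process_pdf("/no/such/file.pdf", "Gemini", db, lambda p: "text", lambda i, t: None)
async_id = process_pdf("/no/such/file.pdf", "GeminiAsync", db, lambda p: "text", lambda i, t: None)
```

Because every branch writes a status before returning, a caller can always look up the returned paper_id and see what happened, which is the predictability the synchronous CLI flow is after.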

Ran pytest -q after the changes; the suite reported 27 passed, 1 skipped.
Verified that the previously failing /pdf integration tests now pass under the synchronous path.

Fix /pdf processing flow and return paper id

40ba32b

ponyfly6 added the codex label Feb 12, 2026 — with ChatGPT Codex Connector
