Tailoring & apply-output fixes (filename collisions, fabrication watchlist, cover-letter PDFs, crash recovery) #61
Open
sebastianmukuria wants to merge 7 commits into
- make_filename_prefix() appends a URL hash so two same-title/same-board jobs no longer overwrite each other's tailored resume/cover letter
- build_prompt copies uploads into APPLY_WORKER_DIR/worker-N/current (was a single shared path that workers raced on); reset_worker_dir now runs before build_prompt so it doesn't wipe the just-copied PDFs
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
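As a rough illustration of the URL-hash idea, here is a minimal sketch. The signature, the slug rule, the choice of SHA-256, and the 8-character hash length are all assumptions made for this example; the PR's actual make_filename_prefix() is not shown on this page.

```python
import hashlib
import re


def make_filename_prefix(site: str, title: str, url: str, hash_len: int = 8) -> str:
    """Hypothetical sketch: a filesystem-safe prefix that differs per job URL."""

    def slug(text: str) -> str:
        # Collapse every run of non-alphanumeric characters into one underscore.
        return re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_")

    # A short digest of the URL separates postings that share site and title.
    url_hash = hashlib.sha256(url.encode("utf-8")).hexdigest()[:hash_len]
    return f"{slug(site)}_{slug(title)}_{url_hash}"
```

With this shape, two "Software Engineer" postings from the same board get the same readable prefix but different trailing hashes, so their output files no longer share a path.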
- Split watchlist into EXACT_TERMS (word-boundary) and PREFIX_TERMS ('certif')
- find_watchlist_hits() skips any term the candidate lists in skills_boundary
and uses regex boundaries so 'scala'/'rails' no longer fire on
'scalable'/'guardrails'; c++ and c# are now actually checked
- FABRICATION_WATCHLIST kept as an alias for the tailor.py import
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
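The matching rules described above could look roughly like the sketch below. The term lists, the function signature, and the treatment of skills_boundary as a flat set of strings are assumptions for illustration. It also uses alphanumeric lookarounds rather than a literal \b, because \b does not match after a trailing '+' or '#' followed by a space.

```python
import re

# Hypothetical term lists; the PR's real lists are not shown on this page.
EXACT_TERMS = ["scala", "rails", "c++", "c#"]
PREFIX_TERMS = ["certif"]


def find_watchlist_hits(text: str, skills_boundary: set[str]) -> list[str]:
    """Return watchlist terms found in text that the candidate has not listed."""
    allowed = {skill.lower() for skill in skills_boundary}
    lowered = text.lower()
    hits = []
    for term in EXACT_TERMS:
        if term in allowed:
            continue  # the candidate really has this skill
        # Whole-term match: no letter or digit directly before or after.
        pattern = rf"(?<![a-z0-9]){re.escape(term)}(?![a-z0-9])"
        if re.search(pattern, lowered):
            hits.append(term)
    for term in PREFIX_TERMS:
        if term in allowed:
            continue
        # Prefix match: the term must start a word but may continue.
        if re.search(rf"(?<![a-z0-9]){re.escape(term)}", lowered):
            hits.append(term)
    return hits
```

Under these assumptions, "scalable guardrails" produces no hits, while "Scala and C++" reports both terms unless they appear in skills_boundary.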
convert_to_pdf ran letters through the resume parser, which dropped the body and rendered the salutation as the candidate's name. Add convert_letter_to_pdf + _letter_html (paragraphs under a name header, HTML-escaped) and use it in run_cover_letters.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
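A minimal sketch of the "paragraphs under a name header, HTML-escaped" layout might look like this. The function name letter_html, its parameters, and the exact tags are assumptions; the PR's _letter_html may differ, and the PDF rendering step is left out.

```python
import html
import re


def letter_html(name: str, letter_text: str) -> str:
    """Hypothetical sketch: render a plain-text letter as escaped HTML."""
    # Blank lines separate paragraphs; drop any empty fragments.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", letter_text) if p.strip()]
    # Escape every paragraph so characters like '<' and '&' cannot break the markup.
    body = "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    return f"<h1>{html.escape(name)}</h1>\n{body}"
```

Because the salutation is treated as an ordinary paragraph, it stays in the body instead of being mistaken for the candidate's name.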
- pipeline _run_tailor/_run_cover pass limit=0 (unlimited)
- run_cover_letters builds its LIMIT clause conditionally; a literal LIMIT 0 would have returned zero rows (get_jobs_by_stage already handles limit<=0)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
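The conditional LIMIT clause can be sketched as follows, assuming a SQLite-backed store. The jobs table, its id column, and the helper name fetch_job_ids are hypothetical and exist only for this example.

```python
import sqlite3


def fetch_job_ids(conn: sqlite3.Connection, limit: int = 0) -> list[int]:
    """Hypothetical sketch: treat limit <= 0 as 'no limit'."""
    query = "SELECT id FROM jobs ORDER BY id"
    params: tuple = ()
    if limit > 0:
        # Only append LIMIT when a positive cap was requested;
        # a literal LIMIT 0 would return no rows at all.
        query += " LIMIT ?"
        params = (limit,)
    return [row[0] for row in conn.execute(query, params)]
```

Calling fetch_job_ids(conn, 0) returns every row, whereas running the same query with a literal LIMIT 0 returns an empty result.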
Add reset_stale_locks(), called once at apply startup before any worker spawns. A crash previously left rows in_progress forever (acquire_job skips them, reset_failed won't touch them).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
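One way such a startup reset could work is sketched below. The schema (a jobs table with an apply_status column), the SQLite backend, and the choice to reset to NULL are assumptions made for this example, not details taken from the PR.

```python
import sqlite3


def reset_stale_locks(conn: sqlite3.Connection) -> int:
    """Hypothetical sketch: release rows a crashed run left locked.

    Must run before any worker starts, so every in_progress row is
    known to be stale rather than actively held.
    """
    cur = conn.execute(
        "UPDATE jobs SET apply_status = NULL WHERE apply_status = 'in_progress'"
    )
    conn.commit()
    # rowcount reports how many rows the UPDATE changed.
    return cur.rowcount
```

The return value gives the caller something to log, which makes it visible when a previous run died mid-apply.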
Wrap both per-site execution paths (parallel future.result() and the sequential loop) in try/except so one timing-out site logs a warning and the stage continues instead of aborting the remaining sites. Report an errors count in the stats.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
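The parallel path of that isolation pattern can be sketched as below. The names run_sites and extract, the stats keys, and the worker count are hypothetical; only the try/except around future.result() and the errors counter come from the description above.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed

log = logging.getLogger(__name__)


def run_sites(sites, extract, max_workers=4):
    """Hypothetical sketch: run extract(site) per site, isolating failures."""
    stats = {"ok": 0, "errors": 0}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(extract, site): site for site in sites}
        for future in as_completed(futures):
            site = futures[future]
            try:
                # result() re-raises whatever the worker raised.
                future.result()
                stats["ok"] += 1
            except Exception as exc:
                # Log and keep going so the other sites still complete.
                log.warning("site %s failed: %s", site, exc)
                stats["errors"] += 1
    return stats
```

A sequential loop would wrap each extract(site) call in the same try/except, so both paths report failures through the same errors count.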
Summary
Tailoring and apply-output correctness fixes from a pre-flight review. This is PR 3 of 4 in a series of focused PRs; it is independent and reviewable on its own.
What & why
- Filenames were {site}_{title}, so two "Software Engineer" postings from the same board overwrote each other. Both DB rows then pointed at the same file, so employer A could receive the resume tailored for employer B. Filenames now include a short URL hash. Separately, every apply worker copied its upload to one shared path; uploads now go to a per-worker directory. (scoring/tailor.py, scoring/cover_letter.py, apply/prompt.py, apply/launcher.py)
- [...] skills_boundary. (scoring/validator.py)
- [...] (scoring/pdf.py, scoring/cover_letter.py)
- [...] run no longer silently caps tailoring/cover letters at 20 jobs. (pipeline.py, scoring/cover_letter.py)
- Rows left in_progress by a killed run are now cleared at apply startup. (apply/launcher.py)
- [...] (discovery/smartextract.py)
Tests
Adds tests/test_filenames.py, tests/test_validator_watchlist.py, tests/test_cover_pdf.py, tests/test_stage_limits.py, tests/test_stale_locks.py, and tests/test_smartextract_isolation.py (16 tests, all passing). CHANGELOG updated under [Unreleased].