Discovery & scoring fixes (location filter discarding all jobs, real company name, retryable scoring)#60

Open
sebastianmukuria wants to merge 4 commits into
Pickle-Pixel:mainfrom
sebastianmukuria:fix/discovery-scoring

@sebastianmukuria

Summary

Discovery and scoring correctness fixes from a pre-flight review of the codebase. This is PR 2 of 4 focused PRs from that review; it is independent and reviewable on its own.

What & why

  • The location filter discarded nearly every job by default. The discovery code read config keys (location_accept / location_reject_non_remote) that nothing ever writes, neither the wizard nor the example config; the shipped schema is location.accept_patterns / reject_patterns. With both lists empty, every non-remote location was rejected, so even a remote job listed as "San Francisco, CA" was dropped. New applypilot.locfilter reads both schemas, treats an empty accept list as "keep unless explicitly rejected", and replaces the three duplicated copies in jobspy/workday/smartextract.
  • The real company name was discarded. Scoring, tailoring, cover letters, the apply prompt, and the dashboard all used job['site'] (the job board, e.g. "linkedin") because there was no company column. Added a company column (with forward migration via ensure_columns), populated it from each discovery source, and switched consumers to job.company (falling back to site for legacy rows). (database.py, discovery/*, scoring/scorer.py, scoring/cover_letter.py, scoring/tailor.py, apply/prompt.py, apply/launcher.py)
  • Scoring robustness. LLM errors and parse failures were written as a permanent fit_score=0, and since the pending filter is fit_score IS NULL, those jobs (e.g. ones that hit Gemini free-tier rate limits) were never retried. Failures now leave the score NULL, so they are retried on the next run; scores commit incrementally, so an interrupt doesn't discard the whole batch; and markdown-decorated scores like '**SCORE:** 8' and 'Score: 7/10' parse correctly. (scoring/scorer.py)
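
For reviewers who want a feel for the parsing change, here is a minimal sketch of a tolerant score parser. It is not the code in scoring/scorer.py: the name parse_score, the regular expression, and the assumed 1 to 10 score range are all illustrative assumptions.

```python
import re
from typing import Optional

# Hypothetical pattern (an assumption, not the project's regex): the word
# "score", optionally wrapped in markdown asterisks, a colon, then an integer
# that may be followed by "/10".
_SCORE_RE = re.compile(
    r"\*{0,2}\s*score\s*\*{0,2}\s*:\s*\*{0,2}\s*(\d+)(?:\s*/\s*10)?",
    re.IGNORECASE,
)


def parse_score(text: str) -> Optional[int]:
    """Return the integer score found in an LLM reply, or None if absent."""
    match = _SCORE_RE.search(text)
    if match is None:
        # No SCORE line at all: report a failure, never a fake 0.
        return None
    value = int(match.group(1))
    # Assumed valid range; anything else is treated as a parse failure.
    return value if 1 <= value <= 10 else None
```

Returning None rather than 0 is the point of the sketch: a caller can then leave fit_score NULL so the job stays in the pending set.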

Tests

Adds tests/test_locfilter.py, tests/test_company_column.py, tests/test_scoring.py (17 tests, all passing). CHANGELOG updated under [Unreleased].

sebastianmukuria and others added 4 commits June 9, 2026 21:13
- New applypilot.locfilter reads both the current location.accept_patterns
  schema and the legacy keys; empty accept list now accepts anything not
  explicitly rejected (was: reject every non-remote job)
- Replace the three duplicated copies in jobspy/workday/smartextract

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
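
The filter behaviour described in this commit could look roughly like the sketch below. It is an illustration only: the function names, the case-insensitive substring matching, and the choice to let reject patterns win over accept patterns are assumptions, and only the config key names come from this PR.

```python
from typing import Any, List, Mapping, Tuple


def load_patterns(config: Mapping[str, Any]) -> Tuple[List[str], List[str]]:
    """Read accept/reject patterns from the current schema, then the legacy keys."""
    location = config.get("location") or {}
    accept = location.get("accept_patterns") or config.get("location_accept") or []
    reject = (
        location.get("reject_patterns")
        or config.get("location_reject_non_remote")
        or []
    )
    return [p.lower() for p in accept], [p.lower() for p in reject]


def location_ok(location: str, config: Mapping[str, Any]) -> bool:
    """Keep a job unless rejected; an empty accept list accepts everything else."""
    accept, reject = load_patterns(config)
    loc = (location or "").lower()
    if any(pattern in loc for pattern in reject):
        return False  # assumed precedence: explicit rejects always win
    if not accept:
        return True  # empty accept list: keep unless explicitly rejected
    return any(pattern in loc for pattern in accept)
```

With an empty config, location_ok("San Francisco, CA", {}) keeps the job, which is the default this PR restores.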
- Add company column to schema + _ALL_COLUMNS (migrates existing DBs)
- Populate company from jobspy/workday/smartextract discovery
- Scoring, tailoring, cover letters, apply prompt and dashboard now use
  job.company (falling back to site for legacy rows) instead of the board name

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
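
A rough sketch of the forward migration and the legacy fallback, assuming a SQLite database with a jobs table. The helper names add_company_column and display_company are hypothetical and are not the project's ensure_columns API, whose signature is not shown in this PR.

```python
import sqlite3
from typing import Any, Mapping


def add_company_column(conn: sqlite3.Connection) -> None:
    """Add jobs.company if it is missing; safe to call on every startup."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(jobs)")}
    if "company" not in existing:
        conn.execute("ALTER TABLE jobs ADD COLUMN company TEXT")
        conn.commit()


def display_company(job: Mapping[str, Any]) -> str:
    """Prefer the real company; fall back to the board name for legacy rows."""
    return job.get("company") or job.get("site") or ""
```

Existing rows get NULL in the new column, which is why consumers need the fallback to site.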
- score_job returns None (not 0) on LLM error; parser returns None when no
  integer SCORE is found, so failures stay pending and are retried next run
- Tolerate markdown decoration ('**SCORE:** 8', 'Score: 7/10')
- Commit each score as it lands instead of one batch at the end (Ctrl+C safe)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
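
The retry and incremental-commit behaviour in this commit can be sketched as follows, assuming a SQLite jobs table with url, title and fit_score columns. The function score_pending and the score_fn callback are hypothetical stand-ins, not the actual code in scoring/scorer.py.

```python
import sqlite3
from typing import Callable, Optional


def score_pending(
    conn: sqlite3.Connection,
    score_fn: Callable[[str], Optional[int]],
) -> int:
    """Score every job whose fit_score is NULL; return how many were stored."""
    rows = conn.execute(
        "SELECT url, title FROM jobs WHERE fit_score IS NULL"
    ).fetchall()
    stored = 0
    for url, title in rows:
        try:
            score = score_fn(title)
        except Exception:
            # LLM error (for example a rate limit): treat as "no score".
            score = None
        if score is None:
            continue  # leave fit_score NULL so the next run retries this job
        conn.execute("UPDATE jobs SET fit_score = ? WHERE url = ?", (score, url))
        conn.commit()  # commit per job so Ctrl+C keeps the scores already stored
        stored += 1
    return stored
```

KeyboardInterrupt is not a subclass of Exception, so an interrupt still stops the loop, but every score committed before it survives.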