feat: --deep mode + WARNING floor for AAC/Vorbis transcodes (v1.2.0) #40
Merged
…s (v1.2.0)

The documented blind spot (high-bitrate AAC/Opus/Vorbis transcodes) was smaller than written. A measurement campaign (ml/measure_v4_per_codec.py, confound-controlled by ml/measure_v4_passthrough_control.py) shows the bundled v4 CNN separates these codecs from genuine FLAC on full-range audio (ROC-AUC 0.94-0.99; AAC-256 0.945). Two things kept that from reaching a verdict: Rule 12's capped +30 lands exactly on the AUTHENTIC/WARNING boundary (30), and the fast path returns before Rule 12 on exactly the silent files these fakes hide in. Both are fixed here, opt-in so the default scan stays fast.

- Rule 12 high-confidence WARNING floor: when the CNN is confident (p >= 0.90) on a full-range file the heuristics left silent, lift the verdict to WARNING (never SUSPICIOUS). Calibrated (ml/calibrate_r12_threshold.py, n=240): ~72% AAC-256 / ~95% Vorbis recall for ~4% authentic cost, all WARNING, zero false SUSPICIOUS.
- New --deep flag: runs Rule 12 on every file, bypassing the authentic fast path so the floor can fire. Default behaviour and speed unchanged.
- Threaded deep through main -> FLACAnalyzer -> new_calculate_score -> _apply_scoring_rules.
- Docs: corrected the "AAC/Opus/Vorbis near-undetectable" claim (README FAQ, technical-details), documented --deep (README, user-guide), and added the full R&D narrative + the fast-path lesson to ml/README.md.
- Tests: tests/test_rule12_warning_floor.py (4) + a deep-mode bypass test.
- ml/ measurement + probe scripts committed (regenerable CSV/npz outputs ignored).

Proven on the real pipeline (R12_DEEP=1, full-range 36/codec): AAC-256 WARNING 1->22, Vorbis 1->29, authentic cost +2 WARNING / 0 SUSPICIOUS.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resolve the CodeQL alerts on the new measurement scripts: wrap the manifest read in a `with` block (file-not-always-closed) and add an explanatory comment to the best-effort temp-cleanup `except OSError: pass` (empty-except).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
What & why
The README's documented blind spot (high-bitrate AAC / Opus / Vorbis transcodes) turned out to be smaller than we'd written. A measurement campaign showed the bundled v4 CNN (Rule 12) does separate these codecs from genuine FLAC on full-range audio (ROC-AUC 0.94–0.99, AAC-256 0.945). The result is confound-controlled: a lossless ffmpeg round-trip scores AUC 0.500, so the model is responding to the codec, not to the processing pipeline.
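To make the AUC figures above concrete, here is a minimal sketch of the pairwise definition of ROC-AUC: the probability that a randomly chosen fake outscores a randomly chosen genuine file. The function name `roc_auc` and all scores are made up for illustration; this is not the code in `ml/measure_v4_per_codec.py`.

```python
def roc_auc(pos_scores, neg_scores):
    """Pairwise ROC-AUC: P(positive outscores negative), ties count half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Illustrative scores only, NOT measurements from this PR.
genuine = [0.05, 0.10, 0.20, 0.30]
roundtrip = [0.05, 0.10, 0.20, 0.30]  # lossless round-trip: scores unchanged
transcode = [0.40, 0.85, 0.95, 0.99]  # lossy transcode: scores shifted up

control_auc = roc_auc(roundtrip, genuine)  # chance level: no separation
codec_auc = roc_auc(transcode, genuine)    # perfect separation in this toy data
```

An AUC of 0.5 on the control means the scores carry no information about whether a file went through the pipeline, which is what rules the pipeline out as a confound.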
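Two things stopped the CNN's separation from ever reaching a verdict:

- Rule 12's capped +30 lands exactly on the AUTHENTIC/WARNING boundary (30).
- The authentic fast path returns before Rule 12 runs, on exactly the silent files these fakes hide in.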
Both are fixed here, opt-in, so the default scan stays as fast as before.
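The interaction between the fast path, `--deep`, and the WARNING floor can be sketched roughly as follows. This is a simplified illustration, not the project's scoring code: `sketch_verdict`, `is_full_range`, and `cnn_probability` are hypothetical names, and Rule 12's ordinary score contribution is left out. Only the 0.90 threshold and the verdict names come from this PR.

```python
from typing import Callable

WARNING_FLOOR_P = 0.90  # mirrors ml_classifier._WARNING_FLOOR_P in this PR


def sketch_verdict(
    heuristic_verdict: str,
    is_full_range: bool,
    cnn_probability: Callable[[], float],  # hypothetical: runs decode + CNN
    deep: bool = False,
) -> str:
    # Files the heuristics already flagged keep their verdict in this sketch
    # (the real Rule 12 score contribution is omitted).
    if heuristic_verdict != "AUTHENTIC":
        return heuristic_verdict
    # Authentic fast path: without --deep, return before the CNN ever runs.
    if not deep:
        return "AUTHENTIC"
    # Deep mode: run the CNN and apply the high-confidence floor.
    p = cnn_probability()
    if is_full_range and p >= WARNING_FLOOR_P:
        return "WARNING"  # lifted to WARNING only, never SUSPICIOUS
    return "AUTHENTIC"


fast = sketch_verdict("AUTHENTIC", True, lambda: 0.95)
deep_hit = sketch_verdict("AUTHENTIC", True, lambda: 0.95, deep=True)
deep_miss = sketch_verdict("AUTHENTIC", True, lambda: 0.89, deep=True)
```

Passing the CNN as a callable keeps the expensive decode and inference off the default path, which is the property the PR relies on to leave default speed unchanged.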
Changes
- `--deep` flag: runs Rule 12 on every file, bypassing the authentic fast path. Slower (a decode + CNN pass per file); the only way to surface silent AAC/Vorbis fakes.
- Rule 12 high-confidence WARNING floor (`ml_classifier._WARNING_FLOOR_P = 0.90`): when the CNN is confident on a full-range file the heuristics left silent, lift the verdict to WARNING (never SUSPICIOUS: the model says "look here", it does not call the file a fake).
- Threaded `deep` through `main → FLACAnalyzer → new_calculate_score → _apply_scoring_rules`.
- Docs: corrected the "AAC/Opus/Vorbis near-undetectable" claim (README FAQ, technical-details), documented `--deep` (README, user-guide), and added the full R&D narrative + the fast-path lesson to `ml/README.md`.
- Tests: `tests/test_rule12_warning_floor.py` (4) + a deep-mode fast-path bypass test.

Calibration & proof
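Operating point (`ml/calibrate_r12_threshold.py`, n=240, full-range):

- AAC-256 recall: ~72%
- Vorbis recall: ~95%
- Authentic cost: ~4%, all WARNING, zero false SUSPICIOUS

Proven on the real pipeline (`R12_DEEP=1`, full-range, 36 sources/codec, WARNING before → after):

- AAC-256: 1 → 22
- Vorbis: 1 → 29

Authentic cost: +2 WARNING, 0 SUSPICIOUS.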
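For readers who want to see what an operating point at p >= 0.90 means numerically, a minimal sketch follows. `rate_at_threshold` is a hypothetical helper and the probabilities are invented for illustration; the actual calibration lives in `ml/calibrate_r12_threshold.py` and may work differently.

```python
def rate_at_threshold(probabilities, threshold=0.90):
    """Fraction of files whose CNN probability reaches the floor threshold."""
    if not probabilities:
        raise ValueError("need at least one probability")
    flagged = sum(1 for p in probabilities if p >= threshold)
    return flagged / len(probabilities)


# Illustrative probabilities only, NOT data from this PR.
fake_probs = [0.97, 0.93, 0.91, 0.60, 0.99]       # transcoded files
authentic_probs = [0.02, 0.10, 0.95, 0.30, 0.05]  # genuine FLAC files

recall = rate_at_threshold(fake_probs)               # share of fakes lifted to WARNING
authentic_cost = rate_at_threshold(authentic_probs)  # share of genuine files lifted
```

Applied to fakes, the rate is recall; applied to genuine files, it is the authentic cost. Raising the threshold lowers both, which is the trade-off the calibration settles at 0.90.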
Limits (kept honest)
… `--deep` does not solve.

Checks
`black` / `isort` / `flake8 src tests` / `mypy src` all clean; rule-12, deep-mode, scoring and verdict tests pass. Backward compatible: without `--deep`, behaviour and speed are unchanged.

🤖 Generated with Claude Code