feat: --deep mode + WARNING floor for AAC/Vorbis transcodes (v1.2.0)#40

Merged
Guillain-RDCDE merged 2 commits into main from feat/deep-mode-aac-vorbis-warning-floor
Jun 4, 2026

@Guillain-RDCDE
Owner

What & why

The README's documented blind spot (high-bitrate AAC / Opus / Vorbis transcodes) turned out to be smaller than we'd written. A measurement campaign showed that the bundled v4 CNN (Rule 12) does separate these codecs from genuine FLAC on full-range audio (ROC-AUC 0.94–0.99; AAC-256 0.945). The result is confound-controlled: a lossless ffmpeg round-trip scores AUC 0.500, so the model is responding to the codec, not to the processing pipeline.
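For readers less familiar with the metric, here is a minimal sketch of what a ROC-AUC of 0.500 on the control means. It is not the code in ml/measure_v4_passthrough_control.py; the function name and all scores below are hypothetical and chosen only to illustrate the idea.

```python
def roc_auc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win, which matches the usual rank-based definition.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical CNN outputs, for illustration only (not measured data).
genuine = [0.05, 0.10, 0.20, 0.40]
transcoded = [0.30, 0.70, 0.85, 0.95]
roundtrip = [0.05, 0.10, 0.20, 0.40]  # lossless round-trip: scored like genuine

print(roc_auc(transcoded, genuine))  # 0.9375: the two groups separate well
print(roc_auc(roundtrip, genuine))   # 0.5: indistinguishable from genuine
```

An AUC of 0.5 on the round-trip control is what one would expect if the decode/re-encode pipeline itself leaves no trace the model can pick up.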

Two things stopped that ability from ever reaching a verdict:

  1. Off by one point. Rule 12 is capped at +30, and a full-range AAC/Vorbis transcode earns ~0 from the heuristics — so +30 lands exactly on 30, the top of AUTHENTIC (WARNING starts at 31). A maximally-confident detection couldn't even reach WARNING.
  2. The fast path skips Rule 12. To keep big scans fast, the calculator short-circuits on files the heuristics clear (score < 10, no MP3) and returns before Rule 12 — exactly the silent-heuristic profile of these fakes.

Both are fixed here, opt-in, so the default scan stays as fast as before.
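The two failure modes above can be sketched as follows. This is an illustration of the pre-fix behaviour under stated assumptions, not the project's scoring code: the constants come from the numbers quoted in this description, while `band`, `score_file` and their parameters are hypothetical names.

```python
RULE12_CAP = 30        # PR text: Rule 12 contributes at most +30
AUTHENTIC_MAX = 30     # PR text: 30 is the top of AUTHENTIC; WARNING starts at 31
FAST_PATH_BELOW = 10   # PR text: fast path applies when the heuristic score is < 10


def band(score):
    """Collapse a score into the two bands this description pins down."""
    return "AUTHENTIC" if score <= AUTHENTIC_MAX else "WARNING or above"


def score_file(heuristic_score, has_mp3_signature, rule12_points, fast_path=True):
    """Hypothetical stand-in for the scoring flow before this PR."""
    if fast_path and heuristic_score < FAST_PATH_BELOW and not has_mp3_signature:
        return heuristic_score  # returns before Rule 12 is ever consulted
    return heuristic_score + min(rule12_points, RULE12_CAP)


# A full-range AAC/Vorbis transcode: heuristics silent, CNN maximally confident.
silent_fake = dict(heuristic_score=0, has_mp3_signature=False, rule12_points=30)

print(band(score_file(**silent_fake)))                   # AUTHENTIC (score 0)
print(band(score_file(**silent_fake, fast_path=False)))  # AUTHENTIC (score 30)
```

Either way the file stays in AUTHENTIC: the fast path hides the CNN entirely, and even when Rule 12 runs, the cap stops one point short of WARNING.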

Changes

  • --deep flag — runs Rule 12 on every file, bypassing the authentic fast path. Slower (a decode + CNN pass per file); the only way to surface silent AAC/Vorbis fakes.
  • High-confidence WARNING floor (ml_classifier._WARNING_FLOOR_P = 0.90) — when the CNN is confident on a full-range file the heuristics left silent, lift the verdict to WARNING (never SUSPICIOUS — the model says “look here”, it does not call the file a fake).
  • Threaded deep through main → FLACAnalyzer → new_calculate_score → _apply_scoring_rules.
  • Docs (newbies + pros): corrected the “near-undetectable” claim (README FAQ, technical-details), documented --deep (README, user-guide), and added the full R&D narrative + the fast-path lesson to ml/README.md.
  • Tests: tests/test_rule12_warning_floor.py (4) + a deep-mode fast-path bypass test.
  • ml/ probe + measurement scripts committed (regenerable CSV/npz outputs gitignored).
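A rough sketch of how the flag and the floor might fit together is shown below. Only `--deep`, `_WARNING_FLOOR_P = 0.90` and the WARNING boundary at 31 come from this description; the function names, parameters, and the choice to express the floor as a minimum score are assumptions made for illustration, and the real implementation may lift the verdict differently.

```python
import argparse

_WARNING_FLOOR_P = 0.90  # named in this PR (ml_classifier._WARNING_FLOOR_P)
WARNING_MIN = 31         # PR text: WARNING starts at 31


def apply_warning_floor(score, p_transcode, heuristics_silent, full_range):
    """Hypothetical floor: raise a silent full-range file to the bottom of WARNING.

    The score is only ever raised to WARNING_MIN, so the floor by itself
    cannot push a file past WARNING.
    """
    if p_transcode >= _WARNING_FLOOR_P and heuristics_silent and full_range:
        return max(score, WARNING_MIN)
    return score


def build_parser():
    """Hypothetical CLI wiring for the opt-in flag."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--deep",
        action="store_true",
        help="run Rule 12 on every file, bypassing the authentic fast path",
    )
    return parser


args = build_parser().parse_args(["--deep"])
print(args.deep)                                  # True
print(apply_warning_floor(30, 0.95, True, True))  # 31
print(apply_warning_floor(30, 0.85, True, True))  # 30
```

Because the flag defaults to off, a scan run without `--deep` would never reach the floor on fast-path files, which is consistent with the backward-compatibility claim in this PR.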

Calibration & proof

Operating point (ml/calibrate_r12_threshold.py, n=240, full-range):

| p ≥ | authentic FP | aac_256 recall | vorbis recall |
|------|--------------|----------------|---------------|
| 0.90 | 3.8 % | 71.7 % | 94.6 % |
| 0.95 | 2.9 % | 60.0 % | 84.6 % |
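Each row of an operating-point table like this one reduces to the same computation: the fraction of files whose CNN probability meets the threshold, taken separately over authentic files (false-positive rate) and over transcodes (recall). The sketch below assumes that reading; it is not ml/calibrate_r12_threshold.py, and the probabilities are hypothetical.

```python
def rate_at_or_above(probabilities, threshold):
    """Fraction of files whose CNN probability meets the threshold."""
    return sum(p >= threshold for p in probabilities) / len(probabilities)


# Hypothetical CNN probabilities, for illustration only (not measured data).
authentic_p = [0.02, 0.10, 0.35, 0.60, 0.92]
transcode_p = [0.50, 0.91, 0.93, 0.97, 0.99]

for threshold in (0.90, 0.95):
    fp_rate = rate_at_or_above(authentic_p, threshold)  # authentic files flagged
    recall = rate_at_or_above(transcode_p, threshold)   # transcodes caught
    print(f"p >= {threshold:.2f}: FP {fp_rate:.1%}, recall {recall:.1%}")
```

Raising the threshold lowers both numbers at once, which is the specificity/recall trade the table quantifies.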

Proven on the real pipeline (R12_DEEP=1, full-range, 36 sources/codec, WARNING before → after):

| codec | WARNING before | WARNING after |
|-----------|----------------|---------------|
| aac_256 | 1 | 22 |
| vorbis_q5 | 1 | 29 |
| opus_128 | 1 | 21 |
Authentic cost: +2 WARNING, 0 SUSPICIOUS.

Limits (kept honest)

  • Full-range only. Band-limited material (baroque, 1920s, solo acoustic) remains a fundamental signal limit --deep does not solve.
  • The WARNING floor is a real specificity↔recall trade (the authentic FP floor is ~3 % even at p ≥ 0.95) — which is why it stops at WARNING, and why it's opt-in.

Checks

black / isort / flake8 src tests / mypy src all clean; rule-12, deep-mode, scoring and verdict tests pass. Backward compatible — without --deep, behaviour and speed are unchanged.

🤖 Generated with Claude Code

…s (v1.2.0)

The documented blind spot — high-bitrate AAC/Opus/Vorbis transcodes — was
smaller than written. A measurement campaign (ml/measure_v4_per_codec.py,
confound-controlled by ml/measure_v4_passthrough_control.py) shows the bundled
v4 CNN separates these codecs from genuine FLAC on full-range audio
(ROC-AUC 0.94-0.99; AAC-256 0.945). Two things kept that from reaching a verdict:
Rule 12's capped +30 lands exactly on the AUTHENTIC/WARNING boundary (30), and
the fast path returns before Rule 12 on exactly the silent files these fakes hide
in. Both are fixed here, opt-in so the default scan stays fast.

- Rule 12 high-confidence WARNING floor: when the CNN is confident (p >= 0.90) on
  a full-range file the heuristics left silent, lift the verdict to WARNING (never
  SUSPICIOUS). Calibrated (ml/calibrate_r12_threshold.py, n=240): ~72% AAC-256 /
  ~95% Vorbis recall for ~4% authentic cost, all WARNING, zero false SUSPICIOUS.
- New --deep flag: runs Rule 12 on every file, bypassing the authentic fast path
  so the floor can fire. Default behaviour and speed unchanged.
- Threaded deep through main -> FLACAnalyzer -> new_calculate_score ->
  _apply_scoring_rules.
- Docs: corrected the "AAC/Opus/Vorbis near-undetectable" claim (README FAQ,
  technical-details), documented --deep (README, user-guide), and added the full
  R&D narrative + the fast-path lesson to ml/README.md.
- Tests: tests/test_rule12_warning_floor.py (4) + a deep-mode bypass test.
- ml/ measurement + probe scripts committed (regenerable CSV/npz outputs ignored).

Proven on the real pipeline (R12_DEEP=1, full-range 36/codec): AAC-256 WARNING
1->22, Vorbis 1->29, authentic cost +2 WARNING / 0 SUSPICIOUS.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread ml/aac_stereo_probe_features.py Fixed
Comment thread ml/calibrate_r12_threshold.py Fixed
Comment thread ml/measure_r12_verdict_fullrange.py Fixed
Comment thread ml/measure_v4_passthrough_control.py Fixed
Comment thread ml/measure_v4_per_codec.py Fixed
Resolve the CodeQL alerts on the new measurement scripts: wrap the manifest
read in a `with` block (file-not-always-closed) and add an explanatory comment
to the best-effort temp-cleanup `except OSError: pass` (empty-except).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
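The two patterns named in that commit message typically look like the sketch below. It is an illustration, not the code in the ml/ scripts: the function names are hypothetical, and treating the manifest as JSON is an assumption.

```python
import json
import os
import tempfile


def load_manifest(path):
    """Read a manifest inside a `with` block so the file is always closed."""
    # Assumption: the manifest is JSON; the real scripts may use another format.
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def remove_temp_file(path):
    """Best-effort removal of a temporary file."""
    try:
        os.remove(path)
    except OSError:
        # Cleanup is best-effort: a file that is already gone or still locked
        # must not fail the measurement run, so the error is ignored on purpose.
        pass


# Usage: write a throwaway manifest, read it back, then clean up twice.
fd, tmp_path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "w", encoding="utf-8") as out:
    json.dump({"files": ["a.flac"]}, out)

manifest = load_manifest(tmp_path)
remove_temp_file(tmp_path)
remove_temp_file(tmp_path)  # second call hits the OSError branch and is ignored
print(manifest["files"])
```

The comment inside the `except` branch is the part that addresses an empty-except alert: it records that swallowing the error is deliberate.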
@Guillain-RDCDE Guillain-RDCDE merged commit 94a364d into main Jun 4, 2026
18 checks passed
@Guillain-RDCDE Guillain-RDCDE deleted the feat/deep-mode-aac-vorbis-warning-floor branch June 4, 2026 17:57