fix(formats): ALAC files with cover art were misrouted as lossy (v0.16.1)#29

Merged
Guillain-RDCDE merged 1 commit into main from fix/v0.16.1-alac-routing
Jun 2, 2026

@Guillain-RDCDE
Owner

v0.16.1: ALAC routing fix (ffprobe cover-art quirk)

A field-validation pass on a real 72k-file library revealed a bug that the synthetic fixtures could not surface: ~10 genuine ALAC albums were being silently rejected as lossy.

Cause

On ALAC .m4a files with embedded cover art, ffprobe -of csv=p=0 returns the codec as alac, (an extra empty field plus a Windows \r). probe_codec only called .strip(), which removes whitespace but not the trailing comma, so "alac," ∉ LOSSLESS_CODECS → is_analysable_lossless returned False → the file was routed to the "replace with a real FLAC" reject path instead of being analysed.
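
The failure mode described above can be reproduced in a few lines. This is only an illustrative sketch: the contents of LOSSLESS_CODECS shown here are an assumption, not the project's actual set.

```python
# Assumed contents for illustration; the real LOSSLESS_CODECS may differ.
LOSSLESS_CODECS = {"flac", "alac", "ape"}

# What ffprobe emitted for an ALAC file with embedded cover art (per this PR).
raw = "alac,\r\n"

# str.strip() with no argument removes surrounding whitespace only,
# so the CR/LF goes away but the trailing comma stays.
stripped = raw.strip()

# The membership test then fails even though the codec really is ALAC.
is_lossless = stripped in LOSSLESS_CODECS
print(repr(stripped), is_lossless)
```

The comma survives because it is not whitespace, which is why the file looked like an unknown (and therefore lossy) codec.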

Fix

  • probe_codec now normalises its output: first line, first comma-separated token, lower-cased.
  • Regression test (test_probe_codec_strips_trailing_comma_and_cr; monkeypatched "alac,\r\n" → "alac" → routes to analysis).
  • ml/field_validation.py kept as an R&D artifact (library path via argv / FLAC_LIBRARY env).
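
The normalisation rule in the first bullet (first line, first comma-separated token, lower-cased) could look roughly like the sketch below. The helper name normalise_codec is hypothetical; the actual body of probe_codec is not shown in this PR and may be structured differently.

```python
def normalise_codec(raw: str) -> str:
    """Hypothetical helper: reduce raw ffprobe CSV output to a bare codec name.

    Keeps the first line, then the first comma-separated token, lower-cased.
    """
    # splitlines() handles "\n", "\r\n" and a lone "\r" alike.
    lines = raw.splitlines()
    first_line = lines[0] if lines else ""
    # Drop anything after the first comma (e.g. the empty trailing field).
    token = first_line.split(",", 1)[0]
    return token.strip().lower()


# The input that triggered the bug, plus an ordinary well-formed one.
print(normalise_codec("alac,\r\n"))
print(normalise_codec("FLAC\n"))
```

Taking only the first line also guards against output that reports more than one stream, although whether that case occurs depends on the ffprobe arguments used.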

Validation after the fix (real data)

  • Clean routing: 77 AAC rejected, 20 ALAC + 15 APE analysed, 0 mismatches.
  • 20 real ALAC files → all AUTHENTIC.
  • 3/3 MP3→ALAC fakes flagged.
  • 0 crashes across 112 m4a/ape + 120 FLAC files (including 2 corrupted FLACs handled cleanly).

Gated checks green locally: black, isort, flake8=0, mypy clean, tests OK.

🤖 Generated with Claude Code

…6.1)

Field validation on a real 72k-file library found a bug synthetic fixtures missed:
~10 genuine ALAC albums were silently routed to the reject list.

Cause: on ALAC .m4a embedding cover art, `ffprobe -of csv=p=0` returns the codec
as `alac,` (trailing empty field + Windows CR). probe_codec only .strip()-ped
whitespace, so "alac," wasn't in LOSSLESS_CODECS -> is_analysable_lossless False
-> the file was rejected as "replace with a real FLAC" instead of analysed.

- probe_codec normalises: first line, first comma-separated token, lower-cased.
- Regression test (monkeypatched "alac,\r\n" -> "alac", routes to analysis).
- ml/field_validation.py: the one-off harness kept as an R&D artifact
  (library path via argv / FLAC_LIBRARY env).

Validation after fix: routing clean (77 AAC rejected, 20 ALAC + 15 APE analysed,
0 mismatch); 20 real ALAC all AUTHENTIC; 3/3 MP3->ALAC fakes flagged; 0 crashes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@Guillain-RDCDE Guillain-RDCDE merged commit 2413883 into main Jun 2, 2026
15 checks passed
@Guillain-RDCDE Guillain-RDCDE deleted the fix/v0.16.1-alac-routing branch June 2, 2026 17:40