fix(formats): ALAC files with cover art were misrouted as lossy (v0.16.1) #29
Merged
…6.1)

Field validation on a real 72k-file library found a bug synthetic fixtures missed: ~10 genuine ALAC albums were silently routed to the reject list.

Cause: on ALAC .m4a embedding cover art, `ffprobe -of csv=p=0` returns the codec as `alac,` (trailing empty field + Windows CR). probe_codec only .strip()-ped whitespace, so "alac," wasn't in LOSSLESS_CODECS -> is_analysable_lossless False -> the file was rejected as "replace with a real FLAC" instead of analysed.

- probe_codec normalises: first line, first comma-separated token, lower-cased.
- Regression test (monkeypatched "alac,\r\n" -> "alac", routes to analysis).
- ml/field_validation.py: the one-off harness kept as an R&D artifact (library path via argv / FLAC_LIBRARY env).

Validation after fix: routing clean (77 AAC rejected, 20 ALAC + 15 APE analysed, 0 mismatch); 20 real ALAC all AUTHENTIC; 3/3 MP3->ALAC fakes flagged; 0 crashes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v0.16.1: ALAC routing fix (ffprobe cover-art quirk)
A field-validation pass on a real 72k-file library revealed a bug that synthetic fixtures could not catch: ~10 genuine ALAC albums were silently rejected as lossy.
Cause
On ALAC `.m4a` files with embedded cover art, `ffprobe -of csv=p=0` returns the codec as `alac,` (an extra empty field plus a Windows `\r`). `probe_codec` only applied `.strip()` to whitespace, so `"alac,"` ∉ `LOSSLESS_CODECS` → `is_analysable_lossless` returned False → the file was routed to the "replace with a real FLAC" reject path instead of being analysed.
Fix
- `probe_codec` normalises: first line, first comma-separated token, lower-cased.
- Regression test (`test_probe_codec_strips_trailing_comma_and_cr`, monkeypatch `"alac,\r\n"` → `"alac"` → routes to analysis).
- `ml/field_validation.py` kept as an R&D artifact (library path via argv / `FLAC_LIBRARY` env).
Validation after fix (real library)
Gated checks green locally: black, isort, flake8=0, mypy clean, tests OK.
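For readers who want to see the normalisation concretely, here is a minimal sketch of the "first line, first comma-separated token, lower-cased" rule described in this PR. It is not the actual `probe_codec` code: `normalise_codec` and `is_lossless_codec` are hypothetical helper names, and the contents of `LOSSLESS_CODECS` are assumed for illustration.

```python
# Assumed contents; the real LOSSLESS_CODECS set in the project may differ.
LOSSLESS_CODECS = {"flac", "alac", "ape"}


def normalise_codec(raw: str) -> str:
    """Reduce raw `ffprobe -of csv=p=0` output to a bare codec name.

    Hypothetical helper: takes the first line, then the first
    comma-separated token, and lower-cases it.
    """
    lines = raw.strip().splitlines()  # splitlines() also handles "\r\n"
    if not lines:
        return ""  # ffprobe printed nothing usable
    return lines[0].split(",")[0].strip().lower()


def is_lossless_codec(raw: str) -> bool:
    """Hypothetical routing check applied to the normalised codec name."""
    return normalise_codec(raw) in LOSSLESS_CODECS


# The problematic output from an ALAC .m4a with embedded cover art.
raw_output = "alac,\r\n"

# Before the fix, only whitespace was stripped, leaving "alac,".
assert raw_output.strip() not in LOSSLESS_CODECS

# With normalisation, the same output is recognised as lossless.
assert is_lossless_codec(raw_output)
```

Splitting on the comma after isolating the first line keeps the helper tolerant of both the trailing empty field and any extra lines ffprobe might emit.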
🤖 Generated with Claude Code