
Add album_artist, year, rating, file_path column to data layer across all providers #306

Open
rendyhd wants to merge 18 commits into NeptuneHub:main from rendyhd:feature/album-artist-support


@rendyhd
Contributor

rendyhd commented Jan 31, 2026

Capture the original album artist before _select_best_artist() overwrites it with the track-level artist. This preserves the album-level artist (e.g. for compilation albums) in a new album_artist column in the score table, propagated through all media server modules (Jellyfin, Emby, Navidrome, Lyrion), the analysis pipeline, similarity search, song alchemy, path manager, and API responses.

Also fix a pre-existing bug in Emby's standalone track path where _select_best_artist() return tuple was not unpacked.
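
The class of bug described for the Emby path can be illustrated with a minimal sketch. The function body below is a hypothetical stand-in, not the project's actual `_select_best_artist()`; only the name and the fact that it returns a tuple come from the description above. The `(artist_name, artist_id)` return shape and the dictionary keys are assumptions made for illustration.

```python
from typing import Optional, Tuple


def select_best_artist(item: dict) -> Tuple[str, Optional[str]]:
    """Hypothetical stand-in: prefer the track-level artist, else the album artist."""
    artists = item.get("Artists") or []
    if artists:
        return artists[0], item.get("ArtistId")
    return item.get("AlbumArtist") or "Unknown Artist", None


def build_track(item: dict) -> dict:
    # Capture the album-level artist first, before the track-level choice is made.
    album_artist = item.get("AlbumArtist")
    # Unpack the tuple. Assigning the whole return value to a single name
    # would store a tuple in the artist field instead of a string.
    artist_name, artist_id = select_best_artist(item)
    return {"artist": artist_name, "artist_id": artist_id, "album_artist": album_artist}


track = build_track(
    {"Artists": ["Nina Simone"], "ArtistId": "a1", "AlbumArtist": "Various Artists"}
)
```

In this sketch the compilation's album artist ("Various Artists") survives alongside the track artist, which is the behaviour the new `album_artist` column is meant to preserve.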

Also adds:

  • Year

  • Track Rating (for Navidrome and Lyrion)

  • File Path (note on Navidrome)*

  • Updated chat (instant playlist) with year and rating info


*Navidrome by default reports an "internal path" (artist/album/track), not your real file path. If you want to match files across services, it's advised to turn on Report Real Path:

  1. Log into the Navidrome web interface
  2. Go to Players in the right sidebar
  3. Click on the AudioMuse player entry (it appears after AudioMuse first connects)
  4. Toggle "Report Real Path" to enabled

You can also make this the default by setting ND_DEFAULTREPORTREALPATH=true

Capture the original album artist before _select_best_artist() overwrites
it with the track-level artist. This preserves the album-level artist
(e.g. for compilation albums) in a new `album_artist` column in the
score table, propagated through all media server modules (Jellyfin,
Emby, Navidrome, Lyrion, MPD), the analysis pipeline, similarity search,
song alchemy, path manager, and API responses.

Also fix a pre-existing bug in Emby's standalone track path where
_select_best_artist() return tuple was not unpacked.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@NeptuneHub
Owner

NeptuneHub commented Feb 1, 2026

Starting to collect additional information could be useful as a basis for future functionality. Did you have something in mind?
I still think we are missing a similar-album feature, but I don't yet have an idea of how to implement it without being repetitive.

About the media servers to test, just one correction: you only need to integrate Jellyfin, Emby, Navidrome, and Lyrion.

IMPORTANT: Navidrome is the configuration for all music servers that support the OpenSubsonic standard. Covering all of them is not always possible (some have small differences), but I usually test this kind of change on Navidrome AND Lightweight Music Server.

For MPD, I tried to add an integration but was never able to complete it, basically because I couldn't download songs directly from MPD for the analysis (if you have any idea for integrating MPD or other music servers in the future, that would be nice). So you can skip it because it is incomplete.

@rendyhd
Contributor Author

rendyhd commented Feb 1, 2026

My goal is to make a Curated Playlist Builder, and this would be great metadata to have in there.
The local file path is for future expansion, and I figured that while I'm expanding the metadata, why not include it.

I was thinking, when you have local filepath you can:

  • Run analysis directly on local files and generate .m3u based on the existing features
  • Open the door to integrating with providers that don't allow downloads (e.g. Plex)
  • Connect to multiple providers, using filepath as key identifier
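
A rough sketch of the last point, using the file path as the key that links the same track across two providers. Everything here is an assumption for illustration: the `item_id` and `file_path` row keys, the normalization rules, and the premise that both servers report the same real path (for Navidrome that would require "Report Real Path").

```python
import posixpath


def path_key(file_path: str) -> str:
    """Normalize a path so the same file reported by two servers compares equal."""
    # Treat Windows separators like POSIX ones, collapse "." and "..", ignore case.
    return posixpath.normpath(file_path.replace("\\", "/")).casefold()


def match_by_path(rows_a: list, rows_b: list) -> dict:
    """Map item ids of provider A to item ids of provider B via file path."""
    index_b = {
        path_key(r["file_path"]): r["item_id"] for r in rows_b if r.get("file_path")
    }
    return {
        r["item_id"]: index_b[path_key(r["file_path"])]
        for r in rows_a
        if r.get("file_path") and path_key(r["file_path"]) in index_b
    }


jellyfin_rows = [{"item_id": "j1", "file_path": "/music/Artist/Album/01 Song.flac"}]
navidrome_rows = [
    {"item_id": "n9", "file_path": "/music/Artist/Album/./01 song.flac"},
    {"item_id": "n10", "file_path": None},  # no path reported, so it is skipped
]
matches = match_by_path(jellyfin_rows, navidrome_rows)
```

Case folding would merge distinct files on a case-sensitive filesystem, and differing mount prefixes between containers would need an extra prefix-stripping step, so this is only a starting point.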

Regarding albums, I think there are two paths:

  • Expand album
  • Find similar album

Expand album is very much in line with what I want for the second part of my playlist system, the Expand playlist. You basically expand from N tracks, taking not just the last song as a data point but the whole group. This would be the same for playlist and album expansion.
A potential endpoint could be "keep playing" instead of track radio, taking the last N tracks, for use cases where people queue a couple of tracks to fine-tune a mood.
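
One way to picture "expanding from the group" is to average the seed tracks' feature vectors and rank the rest of the library against that centroid. This is a sketch under assumptions, not how AudioMuse implements similarity: the two-dimensional vectors, the cosine metric, and the `expand` helper are all hypothetical.

```python
import math


def centroid(vectors):
    """Element-wise mean of equally sized feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def expand(seed_ids, library, k=2):
    """Return the k tracks closest to the centroid of the seed group."""
    target = centroid([library[i] for i in seed_ids])
    candidates = [
        (cosine(target, vec), tid)
        for tid, vec in library.items()
        if tid not in seed_ids
    ]
    return [tid for _, tid in sorted(candidates, reverse=True)[:k]]


library = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [0.8, 0.3],
}
result = expand(["a", "b"], library, k=1)  # "d" is nearest to the group of a and b
```

A "keep playing" endpoint could call the same helper with the last N played tracks as the seed group.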

Similar album is a different approach that matches groups with groups. I wonder how much that would get used. I'd say it's fun from an exploration perspective if you have a large library, but not the best way to stay in a certain mood.

@NeptuneHub
Owner

NeptuneHub commented Feb 1, 2026

Local file paths need to be thought through carefully:

  • on one side, they can speed up the analysis and open the door to other functionality;
  • on the other side, you need to think about how you then use the analysis against each individual music server.

On this last point, the idea of an .m3u output for music servers that support it could be nice. That way you don't have to use the API at all (and you don't have to match a file path with the ID on the music server). It could perhaps be implemented as a "local-path" music server. The result would be that you analyze locally, your output is a local file, and you can then import it wherever you want.
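
A minimal sketch of what such an .m3u export could look like, assuming each track is a dict with `title`, `author` and `file_path` keys. The `write_m3u` helper is hypothetical and not part of the PR.

```python
import tempfile
from pathlib import Path


def write_m3u(tracks, out_path):
    """Write an extended M3U playlist that references local file paths."""
    lines = ["#EXTM3U"]
    for t in tracks:
        # -1 marks an unknown duration in the #EXTINF line.
        lines.append(f"#EXTINF:-1,{t['author']} - {t['title']}")
        lines.append(t["file_path"])
    Path(out_path).write_text("\n".join(lines) + "\n", encoding="utf-8")


tracks = [
    {"title": "Song", "author": "Artist", "file_path": "/music/Artist/Album/01 Song.flac"}
]
out = Path(tempfile.mkdtemp()) / "playlist.m3u"
write_m3u(tracks, out)
```

Because the playlist only contains paths, it can be imported without matching against any server's item IDs, provided the importing server sees the files under the same paths.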

claude and others added 9 commits February 1, 2026 12:28

Two Docker Compose files for end-to-end testing of the album_artist
column across all providers (Jellyfin, Emby, Navidrome, Lyrion — MPD excluded):
- Providers stack with shared test_music mount
- Per-provider NVIDIA AudioMuse instances with isolated Redis/Postgres
- Bash validation script that queries each Postgres for album_artist data
- Step-by-step test guide covering provider setup, API keys, and checklist

https://claude.ai/code/session_01AU49aWqCYybatiX1yhK6UD

Build the image from the repo Dockerfile with the nvidia/cuda base
instead of pulling from the registry. The flask-jellyfin service
owns the build; all other services reuse audiomuse-ai:test-nvidia
with pull_policy: never.

https://claude.ai/code/session_01AU49aWqCYybatiX1yhK6UD

Replaces named Docker volumes with host bind mounts so provider
config and data persist in testing/providers/{jellyfin,emby,
navidrome,lyrion}/. Added testing/.gitignore to exclude the
providers/ directory and .env.test from version control.

https://claude.ai/code/session_01AU49aWqCYybatiX1yhK6UD
@rendyhd
Contributor Author

rendyhd commented Feb 1, 2026

Successfully tested. I'm going to look into the other metadata fields before pushing the PR.

  • Added album_artist in the score table, type text, nullable.
  • Ran all providers and 4 AudioMuse instances; after an API fix, all 4 now have 100% of album_artist filled.
  • I've also added a folder called "testing". It has 2 Docker Compose stacks (one for providers, one for AudioMuse instances), a common .env, and a guide on how to set it up.

@rendyhd
Contributor Author

rendyhd commented Feb 2, 2026

I've added and tested the following fields:

  • Year (works on all 4)

  • File Path (works on all 4, however, Navidrome needs a setting changed)*

  • Track Rating (only added for Navidrome and Lyrion, as Jellyfin and Emby don't support it)

  • Album Artist was already working on all 4

*Navidrome by default reports an "internal path" (artist/album/track), not your real file path. If you want to match files across services, it's advised to turn on Report Real Path:

  1. Log into the Navidrome web interface
  2. Go to Players in the right sidebar
  3. Click on the AudioMuse player entry (it appears after AudioMuse first connects)
  4. Toggle "Report Real Path" to enabled

You can also change this as a default setting by setting ND_DEFAULTREPORTREALPATH=true

…emain consistent between version (and not having to re-enable settings)
rendyhd marked this pull request as ready for review February 2, 2026 12:24
rendyhd changed the title from "Add album_artist column to data layer across all providers" to "Add album_artist, year, rating, file_path column to data layer across all providers" Feb 2, 2026
@NeptuneHub
Owner

Thanks for your effort. To recap, you added these 4 fields, all in the score table:

  • Year (works on all 4) => numeric?
  • File Path (works on all 4, however, Navidrome needs a setting changed) => String?
  • Track Rating (only added for Navidrome and Lyrion, as Jellyfin and Emby don't support it) => Numeric?
  • Album Artist was already working on all 4 => String?

And for now you only use album artist, as a fallback when album is not present in the different "song search" forms? All the other fields are only for future implementation, right?

In your tests, on all the media servers, did you try:

  • Legacy database, already with multiple albums in it => is there a migration to add the missing columns? Do you populate them at the first analysis for already-analyzed songs too, WITHOUT having to reanalyze with the ML model (I mean just populating the missing fields)?
  • New deployment, first run => does it just create the new table and start populating?
  • For music servers that don't support one or more of these fields, what is the fallback? Maybe a default value that would keep future implementations from breaking on unsupported music servers (like "unknown" for strings and 0 for numbers)?
  • Which edge cases did you test? => please share the list of all the tests you ran, automated or manual, and the results.

I'll do my own tests of course, but editing the media server part needs extra attention.

@arsaboo

arsaboo commented Feb 3, 2026

@rendyhd This is great. Having the ids along with the ratings is huge....

@rendyhd
Contributor Author

rendyhd commented Feb 3, 2026

@NeptuneHub

  • Yes, I've added all 4 to score. I considered a separate table, but I don't expect any performance issues.
    Year = INT
    File Path = TEXT
    Rating = INT
    Album Artist = TEXT

  • Album artist, rating, and year are included in the system prompt for the instant playlist

In your tests, on all the media servers, did you try:

  • Legacy database, already with multiple albums in it => is there a migration to add the missing columns? Do you populate them at the first analysis for already-analyzed songs too, WITHOUT having to reanalyze with the ML model (I mean just populating the missing fields)? - Yes, same as album name.
  • New deployment, first run => does it just create the new table and start populating? - Yes.
  • For music servers that don't support one or more of these fields, what is the fallback? Maybe a default value that would keep future implementations from breaking on unsupported music servers (like "unknown" for strings and 0 for numbers)? - In the table it's just NULL; it doesn't get used elsewhere.
  • Which edge cases did you test? => please share the list of all the tests you ran, automated or manual, and the results. - Tested loading with all servers, with and without ratings.
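
The migration and NULL-fallback behaviour described in these answers can be sketched as follows. This is not the PR's code: it uses an in-memory SQLite database so the example is self-contained, whereas the test stacks in this PR use Postgres (where `ALTER TABLE ... ADD COLUMN IF NOT EXISTS` could replace the explicit column check). The `item_id` column and the `add_missing_columns` helper are hypothetical.

```python
import sqlite3

# Column types as listed earlier in this comment (INT / TEXT), all nullable.
NEW_COLUMNS = {
    "album_artist": "TEXT",
    "year": "INTEGER",
    "rating": "INTEGER",
    "file_path": "TEXT",
}


def add_missing_columns(conn, table="score"):
    """Add any of NEW_COLUMNS that the table lacks; safe to run repeatedly."""
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    for name, sql_type in NEW_COLUMNS.items():
        if name not in existing:
            # No DEFAULT clause: rows analyzed earlier keep NULL until refreshed.
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {sql_type}")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE score (item_id TEXT PRIMARY KEY, title TEXT, author TEXT)")
conn.execute("INSERT INTO score VALUES ('t1', 'Song', 'Artist')")
add_missing_columns(conn)
add_missing_columns(conn)  # second run is a no-op

# Metadata-only backfill: fill the new fields without re-running the ML analysis.
conn.execute(
    "UPDATE score SET album_artist = ?, year = ? WHERE item_id = ?",
    ("Various Artists", 1999, "t1"),
)
row = conn.execute(
    "SELECT album_artist, year, rating, file_path FROM score"
).fetchone()
```

Fields a server cannot supply simply stay NULL, so any later feature that reads them would need to treat NULL as "unknown" rather than as 0 or an empty string.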


  • I just added additional logic for when someone has a full date instead of a year in the year tag. This hasn't been an issue yet, because the providers have always given me a YYYY value. I fixed it with the future local path implementation in mind, knowing that the mp3 tag can sometimes hold a full date depending on where the metadata was loaded from.

  • I just changed the rating from 0-100 to 0-5 since most users will be more used to the 5 star rating.
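
Both normalizations can be sketched in a few lines. The helper names, the accepted year range, and the half-up rounding rule are assumptions for illustration; the thread does not state how the PR implements them.

```python
import re
from typing import Optional


def parse_year(raw) -> Optional[int]:
    """Extract a four-digit year from values such as 1999, '1999' or '1999-05-17'."""
    if raw is None:
        return None
    # A run of exactly four digits that is not part of a longer number.
    match = re.search(r"(?<!\d)(\d{4})(?!\d)", str(raw))
    if not match:
        return None
    year = int(match.group(1))
    # Assumed plausibility window; anything outside is treated as missing.
    return year if 1000 <= year <= 2100 else None


def rating_to_stars(raw) -> Optional[int]:
    """Map a 0-100 rating onto 0-5 stars; None stays None (unrated)."""
    if raw is None:
        return None
    value = max(0, min(100, int(raw)))
    # Integer half-up rounding of value / 20.
    return (value + 10) // 20
```

For example, `parse_year("1999-05-17")` and `parse_year("17/05/1999")` both yield 1999, and `rating_to_stars(80)` yields 4. Keeping unrated tracks as `None` rather than 0 matches the NULL fallback described above.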

During my test just now I found that album name isn't populated for Navidrome and Lyrion. I haven't seen anything that could cause that regression. I can test tomorrow with the current release.

@rendyhd
Contributor Author

rendyhd commented Feb 4, 2026

Checked after the last commit; album name is complete now too (only a single song has a rating, so 0.3% is correct):
Complete Completion Overview:

| Field        | Jellyfin | Emby   | Navidrome | Lyrion |
|--------------|----------|--------|-----------|--------|
| Total Rows   | 308      | 308    | 308       | 308    |
| title        | 100.0%   | 100.0% | 100.0%    | 100.0% |
| author       | 100.0%   | 100.0% | 100.0%    | 100.0% |
| album        | 100.0%   | 100.0% | 100.0%    | 100.0% |
| album_artist | 100.0%   | 100.0% | 100.0%    | 100.0% |
| tempo        | 100.0%   | 100.0% | 100.0%    | 100.0% |
| key          | 100.0%   | 100.0% | 100.0%    | 100.0% |
| scale        | 100.0%   | 100.0% | 100.0%    | 100.0% |
| mood_vector  | 100.0%   | 100.0% | 100.0%    | 100.0% |
| energy       | 100.0%   | 100.0% | 100.0%    | 100.0% |
| year         | 100.0%   | 100.0% | 100.0%    | 100.0% |
| rating       | 0.0%     | 0.0%   | 0.3%      | 0.3%   |
| file_path    | 100.0%   | 100.0% | 100.0%    | 100.0% |
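
An overview like this can be produced by comparing `COUNT(column)`, which skips NULLs, with `COUNT(*)`. The sketch below is an assumption about how such a check might look, not the validation script from this PR; it uses in-memory SQLite with 308 rows and a single rated track to reproduce the 0.3% figure (1 / 308 ≈ 0.32%).

```python
import sqlite3


def completion(conn, table, columns):
    """Percentage of rows with a non-NULL value, per column."""
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    result = {}
    for col in columns:
        # COUNT(col) ignores NULLs, COUNT(*) does not.
        filled = conn.execute(f"SELECT COUNT({col}) FROM {table}").fetchone()[0]
        result[col] = round(100.0 * filled / total, 1) if total else 0.0
    return result


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE score (title TEXT, rating INTEGER)")
rows = [(f"track {i}", 5 if i == 0 else None) for i in range(308)]
conn.executemany("INSERT INTO score VALUES (?, ?)", rows)
stats = completion(conn, "score", ["title", "rating"])
```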
