Musicbrainz matching theory clarification requested

Could someone clarify how Musicbrainz is being utilized for music/audio file matching?

I installed Plex on my media library server for primarily video watching purposes and figured that my audio library could just tag along. (I tend to use Foobar for playback.) Years later, after using Plexamp for a while, I’m only just noticing that it is missing or misidentifying a lot of music content - all purchased, meticulously tagged and organized content.

I’m looking at over 4TB of audio data spread out over 76,000 files. I’ve taken a quick look at Musicbrainz Picard but my gut reaction is that it is far, far too aggressive in what and how it will change things. I’m willing to write code to update ID3 tags or create .plexmatch files if I have to but ultimately, I don’t know which tag(s) Plex is actually using to perform the matching.

One significant difficulty I see in embracing the methodology defined on the matching support page (https://support.plex.tv/articles/correcting-your-music-content-matches/) is that only super simple and small libraries can actually conform to it. The recommended folder organization does not consider the possibility of multiple versions of the same album. One could have different versions for:

  • remasters
  • physical media types
  • resolutions
  • channel counts
  • other

…each would need to be stored in a separate folder. Some form of unique identity from Musicbrainz would have to be chosen to keep them separated and it looks like there are a number of options. But which?

For instance, I think I have 7 different and distinct versions of Dark Side of the Moon. Skimming through Musicbrainz, I’m immediately drawn to Alan Parsons’ Quadraphonic mix because it’s not an official release and therefore doesn’t have a UPC/barcode. Clicking on it brings one to this page, https://musicbrainz.org/release/eb8cf4db-003f-49d4-8dbd-fbaa41da8d3d, whose URL suggests that maybe this “release” GUID embedded in the URL is the key.

  • Does Plex directly use the release GUID inferred in that URL?
    • If so, what is the proper ID3 tag name that should be added to each of the files to store this GUID?
    • If I go through the effort of adding the right GUID to 76000+ files, am I guaranteed that Plex won’t get distracted by other ID3 tags that happen to be present?
      • What I really want to know is, what is the minimal, most authoritative matching tag [set] Plex pays attention to.
    • If it doesn’t use the “release” GUID, what does it use?
  • Does the .plexmatch feature extend to Music libraries?
    • Frankly, it would be a lot easier to control matching at a folder level by dropping the right release GUID into a .plexmatch file than it would be to edit/modify 4TB of data.
  • Are there any plans to distinguish different versions of an album in any of the Plex GUIs?
    • Right now, all interfaces give only the name of the album.
    • If you had a single Hybrid SACD with both stereo and multichannel DSD streams, for example, and you ripped everything, you’d end up with a redbook CD rip, a stereo DSD file set, and a multichannel (usually 5.1) DSD file set. You have to put them in 3 different subfolders.
      • When you look at this album (group!) in any Plex interface, it’ll appear three times and all with the same name.
      • Click on one. Which is playing?
        • As far as I can tell, only the website can tell you, but only if you click on (…)->“Get Info”.
        • It matters because…
          • not all clients can downmix multichannel data
          • not all clients have the bandwidth for the higher-fidelity data
          • the content isn’t actually the same between the variants

As a bonus question, I’ve yet to actually find a combination of hardware and Plex app that supports multichannel music files:

  • Are these actually supported?
  • If not, do Music libraries support .plexignore files?

MusicBrainz Release ID ID3 tag.

Terrific questions. For a complex music library, there’s a large time investment until you automate your workflow. You can use Musicbrainz, but also have a look at MP3tag which contains very powerful scripting and community support for it.

You asked about the minimum tags to help match a track. I haven’t tested it, but this set seems likely:

ARTIST=Arcade Fire
ALBUM=Everything Now
TRACKNUMBER=2
MUSICBRAINZ_ALBUMID=b5aeb5e8-8953-4e3e-9b47-5cef3240b03c
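
If you end up scripting this, the tag set above can be sketched as a small helper that pulls the release GUID out of a MusicBrainz release URL, checks that it is a well-formed UUID, and builds the four key/value pairs. This is only a sketch: the function names are made up for illustration, it does not write anything to a file (that part depends on whichever tagging library you choose), and whether Plex actually matches on these tags is still untested.

```python
import re
import uuid

# Matches ".../release/<36 chars>" but not ".../release-group/...".
RELEASE_URL = re.compile(r"musicbrainz\.org/release/([0-9a-fA-F-]{36})")


def release_mbid_from_url(url: str) -> str:
    """Pull the release GUID out of a MusicBrainz release URL and validate it."""
    match = RELEASE_URL.search(url)
    if match is None:
        raise ValueError(f"no release MBID found in {url!r}")
    # uuid.UUID raises ValueError if the text is not a well-formed UUID.
    return str(uuid.UUID(match.group(1)))


def minimal_tags(artist: str, album: str, track: int, release_url: str) -> dict[str, str]:
    """Build the candidate tag set (hypothetical helper, untested against Plex)."""
    return {
        "ARTIST": artist,
        "ALBUM": album,
        "TRACKNUMBER": str(track),
        "MUSICBRAINZ_ALBUMID": release_mbid_from_url(release_url),
    }


tags = minimal_tags(
    "Arcade Fire",
    "Everything Now",
    2,
    "https://musicbrainz.org/release/b5aeb5e8-8953-4e3e-9b47-5cef3240b03c",
)
for key, value in tags.items():
    print(f"{key}={value}")
```

Validating the GUID up front is cheap insurance when the same value is about to be stamped onto thousands of files.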

Q: Plexmatch - A: It’s only for TV shows afaik.

Q: Multiple versions ripped from an SACD for example. - A: You might have success sonic matching, but it’s a community driven database, and you may be the first one with those versions and need to add them. If they aren’t in the Musicbrainz DB, then Plex falls back to your tags and cover art.

Q: How to distinguish - In a Plex Player you’ll see an album cover with a Pencil in the lower left, used to modify the poster, custom tags, or labels. I use custom cover art, but with a large collection I’d want to add tags and / or labels.

Q: Multichannel playback. The proper option afaik is to use Plexamp and multichannel FLAC. This is well supported when output to a receiver that supports multichannel FLAC.

(I was frustrated by this too before covid because I’m listening to AC3 5.1 Movies all the time. At my desktop, I can take advantage of AC3 or DTS.
So I converted my DSOTM Quad to 5.1 AC3 in .mka container.
I converted my Sam Cooke 5.1 into 5.1 AC3 in .m4a container and embedded cover art.
Both play perfectly in Plex HTPC passing the digital through to my receiver. The only problems: no gapless playback in the Pink Floyd and some quality loss. Neither of these albums had Musicbrainz tags btw.)

I gotta pull the trigger on a nice receiver or streamer for those FLACs so that one day I’ll be as nice as Panda :slight_smile:

Thanks for the fast responses!

I kinda figured there wouldn’t be support for .plexmatch files. There should be. I know it’s fundamentally redundant to embedding the same data in ID3 tags but updating tags on terabytes of data would be the crazy person choice if one could instead simply drop a new flat text file in every album folder.

Manual tweaking of posters and such is too large a task and like someone else said in the thread SwiftPanda referenced, any significant Plex update or media move and all manual tweaks will be lost. It would be great if Plex would stamp the basic stats on or below the thumbnail (e.g., channel count, bitrate, bitdepth, file type).

Here’s a good example of a Hybrid SACD with 3 ripping outputs: Dire Straits’ Brothers in Arms (20th Anniversary Edition): https://musicbrainz.org/release/40e601d4-a7a8-3997-9ecf-8a1a9e5918ca
Musicbrainz clearly understands that there are 3 layers (more like 2 layers with 3 data streams, but I digress). It looks like there’s only one MBID, though, and nothing jumps out at me as offering a programmatic way to distinguish one data stream from the other. But Musicbrainz does know they’re all there. I must be missing something… So if I click on the collapsible section names, it looks like Musicbrainz categorizes the different data streams as discs based on how it updates the URL. One piece of plastic yields 3 discs, virtually.

(This is where I take issue with the Plex-recommended organizational scheme: subfolders for multiple albums or data streams are a basic necessity.)

I dread finding out how many things aren’t in Musicbrainz. Take this example:
https://musicbrainz.org/release-group/43aec0b7-6db0-4dfc-ba2b-a85a3a26cc02
I have a 3-SACD+2-BD version of this that looks like it’d take a full day just to piece together a submission. I haven’t ripped it yet so I don’t know how Plex will handle it.

It sounds like I need to put either Picard or MP3tag in a headlock such that I can get one of them to only add to each track the Musicbrainz-specific GUIDs for album id, disc, maybe artist, and track number for anything where I found more on a disc than Musicbrainz has catalogued. Or write some code to do it…

Thanks for the confirmation that multi-channel audio is supported. I’ll keep trying. I have an Oppo universal player that is able to browse my server directly, either over samba or nfs (I forget which or if both), and play virtually anything directly with its own codecs (no ffmpeg transcoding required). The only things it can’t seem to handle are 7.1 Atmos FLAC files, but everything shy of that works, so that’s the standard I’m comparing Plex against.

I hadn’t thought to place separated-out tracks back into a container that would trick Plex into streaming multi-channel audio. Clever approach!

Thanks again for the insight!

Thanks for the links to a couple of SACDs that we can examine.
Let’s take the first Dire Straits one. When you look at the MB web page, you land on the Overview tab but there are undoubtedly more gory details we don’t see there. I was able to locate the pertinent data buried in the XML or JSON info you can find on the Details tab. I prefer JSON myself, and I use the free jq command line utility on my Mac to manipulate JSON files. Here was my workflow to sort this out:

brew install jq
curl "https://musicbrainz.org/ws/2/release/40e601d4-a7a8-3997-9ecf-8a1a9e5918ca?inc=aliases%2Bartist-credits%2Blabels%2Bdiscids%2Brecordings&fmt=json" | jq -r '.' > sacd.json

Then I opened the sacd.json file in my favorite editor, CotEditor.
It’s on the App Store, but you could probably use Notepad on Windowz.

After a bit of a slog I found that each virtual disc has its own ID, like this:

      "format": "Hybrid SACD (CD layer)",
      "format-id": "1a168dbd-d7ce-381d-a38a-3948de2cb4d6",
      "position": 1,


      "format": "Hybrid SACD (SACD layer, 2 channels)",
      "position": 2,
      "format-id": "6c521ee3-f723-41de-bc6b-ec3621b9ffa9"


      "format-id": "fa46fab8-4c29-495a-b2fa-7bc14257dcca",
      "position": 3,
      "format": "Hybrid SACD (SACD layer, multichannel)",

So the interesting part I guess would be to drop those tracks into Musicbrainz and see how it names the format-id I found in JSON. It might call it MUSICBRAINZ_FORMATID or something.
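
To skip the slog through the editor, the per-disc entries quoted above can be listed with a few lines of Python. This is a sketch under assumptions: the sample below is trimmed to the three fields shown in this post and nested under a "media" array, which is where the MusicBrainz release JSON keeps them as far as I can tell, and list_media is a made-up helper name. One caveat worth checking: format-id appears to name the format type (e.g. "Hybrid SACD (CD layer)") rather than this particular disc, so position is probably the field that tells the discs of one release apart.

```python
import json

# Trimmed sample shaped like the "media" array in sacd.json; only the fields
# quoted in this post are included, everything else is omitted.
SAMPLE = """
{"media": [
  {"format": "Hybrid SACD (SACD layer, multichannel)", "position": 3,
   "format-id": "fa46fab8-4c29-495a-b2fa-7bc14257dcca"},
  {"format": "Hybrid SACD (CD layer)", "position": 1,
   "format-id": "1a168dbd-d7ce-381d-a38a-3948de2cb4d6"},
  {"format": "Hybrid SACD (SACD layer, 2 channels)", "position": 2,
   "format-id": "6c521ee3-f723-41de-bc6b-ec3621b9ffa9"}
]}
"""


def list_media(release: dict) -> list[tuple[int, str, str]]:
    """Return (position, format, format-id) for each medium, ordered by position."""
    rows = [
        (m["position"], m.get("format") or "unknown", m.get("format-id") or "")
        for m in release.get("media", [])
    ]
    return sorted(rows)


media = list_media(json.loads(SAMPLE))
for position, fmt, format_id in media:
    print(f"disc {position}: {fmt} [{format_id}]")
```

To run it against the real download, swap json.loads(SAMPLE) for json.load on the opened sacd.json file.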

Folder naming. Some slight changes you make to the folder name would not confuse the PMS scanner because you will have set the correct tags from MB. (Fingers crossed).

I think you have a lot of work like a lot of us, but the investment in time pays off with a gorgeous collection.

Oh my…what a mess.

I copied my Dire Straits SACD files to a test folder and let Picard perform a lookup. I’m not 100% certain I did the lookup correctly because it came back with a list of three albums on the right. It wasn’t clear to me whether it thought each one of those albums corresponded to each layer of the SACD, or whether those were choices I had for narrowing down the match. The third option in the list was actually the correct one. I selected it and asked it to re-save the files, letting it do what it may.

lol.

It only modified the first two tracks from the CD layer. By saving, it added a bunch of Musicbrainz metadata, and changed the Title ID3 tags to Money for Nothing (5.1 Mix) and So Far Away (5.1 Mix), respectively.

Why change only two tracks?
Why insert unnecessary characters into a title tag?
Why completely misidentify the layer?

And perhaps most puzzling, why weren’t any of the GUIDs you posted for format-id included in any of the added tags. Even if the two CD tracks were misidentified as being from the 5.1 mix, I still would’ve expected the 5.1’s format-id to show up somewhere in the new tags.

For reference, here’s the result from track 1:

Musicbrainz/Picard needs more investigation…I’ve not convinced myself yet that I can select a handful of specific tags from the database, add them to each track, and know that Picard, Plex, or anything else will instantly know what the files are. Ain’t no way this program will be allowed to touch the real archive. sheesh!

Ok, back to Plex.

I tried searching for the format-id GUIDs in the “Get XML” option under the “Get Info” option for a couple of tracks from the album. No luck but I didn’t expect a quick success because of the unlikelihood Plex had the right album match in the first place.

Next I tried a many-to-many search between the GUIDs shown on one of the “Get XML” pages and the JSON data for the album. No matches, but then I realized I had the same problem - Plex likely didn’t have the right match.

I needed to eliminate the mismatch problem so I picked on something new that wouldn’t have had time to generate any new releases/variants. I picked on Cage the Elephant’s Neon Pill: https://musicbrainz.org/ws/2/release/64f5c07d-6a0d-423d-9b00-ebfb52d01d01?inc=aliases%2Bartist-credits%2Blabels%2Bdiscids%2Brecordings&fmt=json
Now it’s basically impossible not to get the right match - there’s only one CD to choose from in Musicbrainz. I tried the many-to-many GUID matching game again between the JSON data and Plex’s “Get XML” page for one of the tracks and still found no success. If Plex is tracking the Musicbrainz IDs, it must not expose them anywhere. Maybe I’d find the cross-reference in the Plex database. I guess that would make sense because it decouples them from Musicbrainz enough to switch to something else in the future, if need be.
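
For anyone who wants to repeat the many-to-many comparison without eyeballing it, here is a rough sketch: pull every UUID-shaped string out of both text dumps and intersect the two sets. The function names are made up, and the two input strings are placeholders invented for illustration (not real Plex or MusicBrainz output); paste the saved “Get XML” text and the JSON in their place.

```python
import re

# Standard 8-4-4-4-12 hex layout used by MusicBrainz IDs.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def guids_in(text: str) -> set[str]:
    """Collect every UUID-shaped substring, lowercased so case cannot hide a match."""
    return {m.lower() for m in UUID_RE.findall(text)}


def shared_guids(plex_xml: str, mb_json: str) -> set[str]:
    """Return the GUIDs that appear in both dumps."""
    return guids_in(plex_xml) & guids_in(mb_json)


# Placeholder inputs, invented for this example.
plex_xml = '<Track guid="placeholder-not-a-uuid" ratingKey="101"/>'
mb_json = '{"id": "64f5c07d-6a0d-423d-9b00-ebfb52d01d01"}'

print(shared_guids(plex_xml, mb_json))  # empty set: nothing in common
```

An empty result only says the IDs are not exposed in that text; it cannot tell you whether Plex stores them somewhere else.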

Right now, release id (per SwiftPanda’s reference) for simple albums and perhaps format-id from your research seem to be the most promising leads. I guess I need to resort to trial and error and just try some stuff to see if Plex can do the math correctly.

Just from a fundamentals standpoint, why can’t Plex just read the ID3 tags and call it a day? (no, I’ve tried to set the “Prefer local metadata” setting but it just doesn’t work. It’s a placebo setting, right?)

MP3tag fares much better than Picard but it is struggling with the disc numbering on the Dire Straits SACD. It doesn’t indicate which tracks are in which folders on the “Adjust tab information” screen, which is part of the problem. It is more efficient about pulling the metadata down and honest at indicating what it’ll change, though.

My initial try, tweaking nothing, resulted in it switching the two SACD layers. In a second attempt, I used the Discnumber/Save options on the main interface to set the disc tags according to Musicbrainz, but it still got them switched. On a third try, I set the SACD layers’ disc tags with dBpoweramp: MP3tag’s main interface showed things correctly, the Adjust screen may or may not have shown things correctly, and the resulting tag save still switched the SACD disc numbers.

All of the new Musicbrainz tags are the same between the three sets of files but what’s weird is that it only gave the CD set the MUSICBRAINZ_TRACKID tag - the SACD (*.dsf) files didn’t get them. It’s not quite definitive but Disc may be the necessary piece along with MUSICBRAINZ_ALBUMID. MUSICBRAINZ_TRACKID is probably needed, too.

I’m not sure placing these revised files into Plex will prove anything one way or the other yet. I may need to select a simpler problem. I wish the Plex developers would just state what tags they’re using to match…why make customers reverse engineer everything all the time???

For what it’s worth, here’s a comparison of the 3 sets of ID3 tags (CD, SACD stereo, SACD 5.1) in order, post-MP3tag, with the disc numbers corrected:

only two tracks :slight_smile:
I’m short on time, but I’ll try to follow your work over the next couple days.
You might have luck asking for clarification on the musicbrainz forum how
SACDs with multiple layers are typically handled. It may be as you found,
just saying Disc 1, Disc 2, or Disc 3. I thought the MUSICBRAINZ_TRACKID
would be enough to identify exactly where a track originated, but you said
it wasn’t on every track. lol?

You’re doing a great job. I admit I never got MP3tag into my workflow,
because I didn’t have enough physical discs that needed attention.
They have a great forum, many of whom are Plex users.

The way SACD (and Bluray and DVD-A) releases are catalogued on MB, there is no satisfying way of handling different versions/mixes of tracks.
Ultimately, they are just tracks in a common album.
If you’re lucky, the stereo and surround mixes are on different discs – but that is absolutely not guaranteed.
Hence why you often find the track title appended with “…5.1 mix” etc.

Plex as well has no concept of Editions of music albums.
Which it would need if you wanted to browse your albums normally, or wanted to play e.g. only surround or hires editions.

The only workaround I have found is to treat the 5.1 mix and the stereo mix as separate albums.
If the CD layer of an SACD has an identical track list, I personally see no point in retaining it. The server can reduce the hires version in realtime for a remote client.

Thanks for the insight, Otto.

MB’s apparent method of using discs to differentiate data streams, sides, layers, or what-have-you is reasonable. Personally, I just use subfolders, so the only practical difference is that their system is ordinal and mine is nominal, categorically speaking. If adding someone else’s arbitrary ordering to my unordered content would magically allow Plex to properly differentiate the files, I’d adapt to it. But if Plex doesn’t match to that level of specificity, the effort would be a waste.

I have two main reasons for (in most cases) retaining all layers/streams of all things ripped (e.g., keeping the CD layer when the stereo SACD layer has identical but superior content):

  • Resolution differences versus where/how/why I’m playing the content
    • If I’m at home and listening on some sort of hi-fi equipment, I’ll usually opt for the high-resolution data.
    • In the case of SACDs, the high-resolution content is saved in DSD form (*.dsf so tagging is possible)
      • This format can be finicky, to say the least, when transcoding to PCM.
      • Not all clients support DSD/DSF playback.
    • If I’m in the car and using Plexamp, I’ll want the CD version because it’s lower bandwidth.
      • Even more importantly, I don’t want to be distracted or confused by any surround-sound options because they definitely won’t sound right. Downmixing 5.1 or 7.1 to stereo always sounds goofy, usually being faint/quieter compared to a normal stereo or mono mix.
  • Not all seemingly identical layers are actually identical
    • I’ve seen a good number of cases where one of the layers (usually the CD layer or disc) contains bonus tracks, so throwing away the CD layer would be disadvantageous.
      • DVD-A, DVD-V, DualDiscs come to mind.
        • Now, in DVD-V and DualDisc cases where they offer two different higher-resolutions of AC3 content (e.g., 640 kbps and 448 kbps), I do ignore the lesser one (hence the “most cases” exception from before).
    • Disk space is cheap enough and the time required to grab the additional data is small enough where it’s still convenient to just make a comprehensive rip on the chance that the content could indeed be unique or useful
      • …only to find out years later that Plex and MB aren’t as detailed as you’d like them to be. :wink:

While those cover cases where one product contains multiple unique or potentially unique discs, layers, data streams, etc., there is also the more common scenario of simply owning multiple copies of the same album. Remasters, for instance, can sound noticeably different or contain many different tracks. How can I request for Plex to upgrade the level of detail it uses for matching content?

Leaving things as they stand today, it sounds like the only feasible approach to separating multi-channel and high-resolution content from standard resolution stereo content is to not point Plex at the main library at all: Keep it hidden and create separate hierarchies of symbolically linked folders and have Plex create separate Music Libraries based on those symlink libraries. How can I request for Plex to differentiate between standard and high-resolution content, as well as channel count?
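
The symlink idea could be sketched roughly like this. It is only an outline under assumptions: the classification rule (a folder containing any .dsf file counts as "dsd", everything else as "pcm") is a stand-in for whatever categories you actually want, the function names are made up, the target root is assumed to live outside the source tree, and I have not verified how Plex treats symlinked folders on any particular platform.

```python
import os
from pathlib import Path


def classify(album_dir: Path) -> str:
    """Hypothetical rule: any .dsf file makes the folder "dsd", otherwise "pcm"."""
    has_dsf = any(
        p.suffix.lower() == ".dsf" for p in album_dir.iterdir() if p.is_file()
    )
    return "dsd" if has_dsf else "pcm"


def build_symlink_libraries(source_root: Path, target_root: Path) -> dict[str, list[Path]]:
    """Link each leaf album folder into target_root/<category>/<same relative path>."""
    created: dict[str, list[Path]] = {}
    for dirpath, dirnames, filenames in os.walk(source_root):
        if dirnames or not filenames:
            continue  # only leaf folders that actually contain files
        album = Path(dirpath)
        category = classify(album)
        link = target_root / category / album.relative_to(source_root)
        link.parent.mkdir(parents=True, exist_ok=True)
        # Skip links left over from an earlier run instead of failing on them.
        if not (link.is_symlink() or link.exists()):
            link.symlink_to(album.resolve(), target_is_directory=True)
        created.setdefault(category, []).append(link)
    return created


# Example call (paths are placeholders):
# build_symlink_libraries(Path("/archive/music"), Path("/plex/music-views"))
```

Each category folder under the target root could then be added as its own Music library, leaving the real archive untouched.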

Let’s be honest - by the time I’d finish the project, I’d be well over 100,000 files and 5TB: Instead of meticulously inching through and rewriting that many files to update ID3 tags, which will be very prone to error, How can I request for Plex to support .plexmatch files containing Musicbrainz IDs?

Regarding the Dire Straits SACD…

  • MP3Tag and Picard are forever at odds as to which SACD layer is disc 2 and which is disc 3, which is hilarious.
  • Otto’s description is confirmed: placing the Musicbrainz-updated files into Plex and refreshing yielded no useful changes. I’m sure some stuff got updated in the database but nothing user-facing changed.

Interested parties watch the forum for posts that are Tagged relating to their area of development. Whoever watches Metadata would see this thread for example.

Also Plex provides a Feature Request sub forum.

It might be as simple as making a Collection for each category you’d like to see. I also see that I have no entries for Album Format, but that’s a tag I can sort by in Plex clients.

In the days of Atmos and hires trends, differentiating seems obviously important. I can sort videos by resolution but not audio? That’s strange, but I guess the world was trained on garbage MP3s for so long that zero ■■■■■ are given.

This is a good topic for nailing down when a user doesn’t want to Scan/Tag/rewrite 100K tracks. I think it would get a decent number of votes.

But before there was a Musicbrainz catalog, record companies identified their releases by the ISRC that’s unique. Is that number present on your rips? Does it work for matching?

I don’t think I’ve ever seen the tag Album Format before. You’re saying that Plex knows what it is and that some of its client apps can do something with it? Intriguing…

Aren’t ISRC codes limited to CDs? I ripped all of my CDs with dBpoweramp (since at least 2010) but looking at a few I’m only seeing things like Catalog and AccurateRipDiscID, the latter being more prevalent. Older rips have neither, though.

I’ll poke around for the Feature Request and see what I can do to drum up some interest.

Thanks!

Ok, one feature request submitted - thanks, nibbles!
