Following on from the issues 'Show "Sunny (2024)" getting incorrect metadata and has multiple tmdb ids' and 'Correct Movie Match but incorrect Tags and Crew (Plex Agent)', I decided to see if I could validate my library and check for duplicate Plex guids for tmdb, tvdb and imdb across movies, shows and episodes.
There may be better ways to do this, but this worked for me. I used the metadata_items table from the Plex database to get the ids in my library for plex://movie, plex://episode and plex://show, and copied them to one text file per type: plex-media-ids-movies.txt, plex-media-ids-series.txt and plex-media-ids-episodes.txt. I have 4287 movies and 998 shows with 38671 episodes.
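For anyone wanting to reproduce the id lists, the following is only a rough sketch of how that step could look with the sqlite3 command-line tool. It assumes you work on a copy of the library database, that the id column of metadata_items is the number used in /library/metadata/<id>, and that the guid column holds the plex://movie/..., plex://show/... and plex://episode/... values. The function name export_plex_ids and the database path are placeholders, and depending on your setup you may need the SQLite binary that ships with Plex instead of the stock sqlite3.

```shell
#!/bin/bash
# Sketch only: write the ids of all items whose guid starts with plex://<kind>/
# to a text file, one id per line.
# Assumptions: a COPY of the Plex library database, the stock sqlite3 CLI,
# and the metadata_items columns "id" and "guid".
export_plex_ids() {
    local db="$1" kind="$2" out="$3"
    sqlite3 "$db" \
        "SELECT id FROM metadata_items WHERE guid LIKE 'plex://${kind}/%' ORDER BY id;" \
        > "$out"
}

# Example calls (the database path is a placeholder):
# export_plex_ids /home/plex/library-copy.db movie   /home/plex/plex-media-ids-movies.txt
# export_plex_ids /home/plex/library-copy.db show    /home/plex/plex-media-ids-series.txt
# export_plex_ids /home/plex/library-copy.db episode /home/plex/plex-media-ids-episodes.txt
```

Working on a copy keeps the live database untouched while Plex is running.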
Using these text files as input, I then used this simple script (plex-export-xml.sh) to export the XML data for every media item.
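#!/bin/bash
# Reads one metadata id per line from stdin and saves the XML for each id
# as <id>.xml in the current directory.
while IFS= read -r plexmedata; do
curl "http://192.168.1.3:32400/library/metadata/$plexmedata" > "$plexmedata.xml"
done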
./plex-export-xml.sh < /home/plex/plex-media-ids-movies.txt
./plex-export-xml.sh < /home/plex/plex-media-ids-series.txt
./plex-export-xml.sh < /home/plex/plex-media-ids-episodes.txt
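A slightly more defensive variant of the export could look like the sketch below. It is not the script used above: the function name export_plex_xml and the PLEX_URL and PLEX_TOKEN variables are placeholders, and it assumes that, if your server requires authentication, the token can be passed as the X-Plex-Token query parameter. It writes each file into a target directory, skips blank lines and reports ids that could not be fetched.

```shell
#!/bin/bash
# Sketch only: fetch one XML file per id into a given directory.
# PLEX_URL and PLEX_TOKEN are placeholder variables, not part of the original script.
PLEX_URL="${PLEX_URL:-http://192.168.1.3:32400}"

export_plex_xml() {
    local ids_file="$1" out_dir="$2" id
    mkdir -p "$out_dir"
    while IFS= read -r id; do
        # Skip blank lines in the id list.
        [ -n "$id" ] || continue
        # -f: treat HTTP errors as failures, -sS: no progress meter but show errors.
        # The token is only appended when PLEX_TOKEN is set and non-empty.
        if ! curl -fsS \
                "$PLEX_URL/library/metadata/$id${PLEX_TOKEN:+?X-Plex-Token=$PLEX_TOKEN}" \
                -o "$out_dir/$id.xml"; then
            echo "failed: $id" >&2
        fi
    done < "$ids_file"
}

# Example call (paths as used elsewhere in this post):
# export_plex_xml /home/plex/plex-media-ids-movies.txt /home/plex/plex-media-xmls/movies
```

Writing straight into one directory per type means the later find commands can point at those directories without moving files around.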
I end up with one XML file per item, which I can then search for double entries, basically like this:
find /home/plex/plex-media-xmls/movies/ -type f -exec grep -wc "tmdb:" {} + | grep :2
find /home/plex/plex-media-xmls/series/ -type f -exec grep -wc "tmdb:" {} + | grep :2
find /home/plex/plex-media-xmls/episodes/ -type f -exec grep -wc "tmdb:" {} + | grep :2
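One caveat with the commands above: grep -c counts matching lines rather than occurrences, and the final grep :2 would also match counts such as 20 or a file name that happens to contain ":2". A possible stricter variant is sketched below. The function name find_duplicate_guids is a placeholder, and it assumes the ids appear as <Guid id="provider://..."/> tags, as in the results that follow.

```shell
#!/bin/bash
# Sketch only: list XML files in which a provider's Guid tag occurs at least twice.
# Counts occurrences (grep -o), not matching lines, so it also works when
# several Guid tags sit on the same line.
find_duplicate_guids() {
    local dir="$1" provider="$2" file count
    find "$dir" -type f -name '*.xml' | sort | while IFS= read -r file; do
        # One output line per occurrence; tr strips padding some wc versions add.
        count=$(grep -o "<Guid id=\"${provider}://" "$file" | wc -l | tr -d ' ')
        if [ "$count" -ge 2 ]; then
            echo "$file: $count"
        fi
    done
}

# Example: check all three providers for the movie exports.
# for provider in tmdb tvdb imdb; do
#     find_duplicate_guids /home/plex/plex-media-xmls/movies "$provider"
# done
```

This also reports items with three or more ids for the same provider, which a plain match on ":2" would miss.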
The results are better than I feared. Two movies have duplicate ids for tmdb:
Alvin and the Chipmunks (2007) - The Movie Database (TMDB)
<Guid id="tmdb://398593"/>
<Guid id="tmdb://6477"/>
Batman: The Dark Knight Returns, Part 2 (2013) - The Movie Database (TMDB)
<Guid id="tmdb://142061"/>
<Guid id="tmdb://472027"/>
Only one show has duplicate ids for tmdb:
Sunny (TV Series 2024- ) - The Movie Database (TMDB)
<Guid id="tmdb://105298"/>
<Guid id="tmdb://157226"/>
Thankfully, no episodes have any duplicates.
Now, if we extend this to tvdb duplicates, we find a few more:
Movies:
In This Corner of the World → In This Corner of the World (2016) — The Movie Database (TMDB)
<Guid id="tvdb://309901"/>
<Guid id="tvdb://9534"/>
Batman: The Dark Knight Returns Part 2 → Batman: The Dark Knight Returns, Part 2 (2013) — The Movie Database (TMDB)
<Guid id="tvdb://2113"/>
<Guid id="tvdb://292129"/>
Star Wars: Episode I - The Phantom Menace → Star Wars: Episode I - The Phantom Menace (1999) — The Movie Database (TMDB)
<Guid id="tvdb://134347"/>
<Guid id="tvdb://334"/>
Three Identical Strangers → Three Identical Strangers (2018) — The Movie Database (TMDB)
<Guid id="tvdb://1536"/>
<Guid id="tvdb://343246"/>
Exorcist III → The Exorcist III (1990) — The Movie Database (TMDB)
<Guid id="tvdb://35907"/>
<Guid id="tvdb://3931"/>
Tagging @drzoidberg33 @OttoKerner @SwiftPanda16 as they commented on the previous threads regarding this topic. I’m hoping Plex can take this data and fix the issues identified above. Maybe, if they have a larger db with even more movies and shows, they can run the same method to detect more possible errors.