TV Series name matching research

@cjezell said:
Here is how the files are named:

“\ds1512\video\Television\Star Wars The Clone Wars\Season 1\Star Wars The Clone Wars - S01E10 - Lair of Grievouos.mp4”

In the case of MP4/M4V files: if they have embedded, non-compliant names in the Title fields, they will not match (or will misbehave badly) when Local Media Assets is at the top of your Agents list in the TV Shows tab. Drag Local Media Assets to the bottom of the Agents list, but do leave it checked. Just demote it.
https://support.plex.tv/hc/en-us/articles/200241558-Agents

This may not be your issue, but it’s worth mentioning and is something you can check.

@JuiceWSA

I had forgotten to write that in my post… I was updating to include that as you posted. Good catch and thank you.

Would you please show the files and structure as you have them (the block)?
As in a screenshot of the folder?

Also, is there any metadata tagging inside those MP4 files?
Not unless HandBrake has changed something in the way it rips DVDs and Blu-rays. My workflow is to rename the file after HandBrake finishes, then move it into the appropriate folder on my NAS and let Plex scan for changes. That’s all I’ve ever done.

@cjezell

If you use Linux / OS X, then the output of ls -la and pwd from a terminal window is great.
In Windows, I guess a screenshot in Explorer is good enough.

OK, I’m at my wits’ end with this. Season 1 of Heroes Reborn is doing the SAME THING. Episodes 10 & 11 are missing. I am including screenshots of what I see in PMS, Windows Explorer and the Synology File Center. I will also attach the logs AND my database download. I don’t know what else to do at this point. I am beyond frustrated.

Possible database corruption issue? I’m going to post the interesting part of the scanner log section concerning episodes 10 and 11.

Jan 24, 2017 20:08:25.872 [0xf3f54780] DEBUG - Path matched, we’re reusing media item 17499
Jan 24, 2017 20:08:25.874 [0xf3f54780] DEBUG - * Scanning Heroes Reborn Season 1 Episode 10
Jan 24, 2017 20:08:25.874 [0xf3f54780] DEBUG - Looking for path match for [/volume1/video/Television/Heroes Reborn/Season 1/Heroes Reborn.S01E10.1153 to Odessa.mp4]
Jan 24, 2017 20:08:25.876 [0xf3f54780] DEBUG - Skipping hash check, no size match for 1257065546 bytes.
Jan 24, 2017 20:08:25.876 [0xf3f54780] DEBUG - No match for hash.
Jan 24, 2017 20:08:25.879 [0xf3f54780] DEBUG - Checking descendants of Heroes Reborn
Jan 24, 2017 20:08:25.882 [0xf3f54780] DEBUG - → Searching down into Heroes Reborn/Season 0
Jan 24, 2017 20:08:25.882 [0xf3f54780] DEBUG - Checking descendants of Heroes Reborn/Season 0
Jan 24, 2017 20:08:25.888 [0xf3f54780] DEBUG - → FOUND metadata item (show)
Jan 24, 2017 20:08:25.888 [0xf3f54780] DEBUG - → We found a local media item with rooted metadata in Heroes Reborn/Season 0
Jan 24, 2017 20:08:25.888 [0xf3f54780] DEBUG - Found existing show 2212
Jan 24, 2017 20:08:25.893 [0xf3f54780] DEBUG - Downloading document http://127.0.0.1:32400/library/changestamp
Jan 24, 2017 20:08:25.894 [0xf3f54780] DEBUG - HTTP requesting GET http://127.0.0.1:32400/library/changestamp
Jan 24, 2017 20:08:25.900 [0xf3f54780] DEBUG - HTTP 200 response from GET http://127.0.0.1:32400/library/changestamp
Jan 24, 2017 20:08:25.904 [0xf3f54780] ERROR - SQLITE3:0x845bb84, 11, database corruption at line 62365 of [fc49f556e4]
Jan 24, 2017 20:08:25.904 [0xf3f54780] ERROR - SQLITE3:0x845bb84, 11, statement aborts at 115: [insert into metadata_items (library_section_id,parent_id,metadata_type,guid,hash,media_item_count,title,title_sort,original_title,studio,rating,audience_rating,rating_count,tagline,su
Jan 24, 2017 20:08:25.906 [0xf3f54780] ERROR - Exception inside transaction (inside=1) (…/Library/MetadataItem.cpp:869): sqlite3_statement_backend::loadOne: database disk image is malformed
Jan 24, 2017 20:08:25.906 [0xf3f54780] ERROR - Exception inside transaction (inside=1) (…/Library/Episode.cpp:187): sqlite3_statement_backend::loadOne: database disk image is malformed
Jan 24, 2017 20:08:25.907 [0xf3f54780] ERROR - Exception inside transaction (inside=1) (…/Library/MetadataItem.cpp:2682): sqlite3_statement_backend::loadOne: database disk image is malformed
Jan 24, 2017 20:08:25.907 [0xf3f54780] ERROR - Exception assimilating media item in Heroes Reborn/Season 1: sqlite3_statement_backend::loadOne: database disk image is malformed
Jan 24, 2017 20:08:25.907 [0xf3f54780] DEBUG - * Scanning Heroes Reborn Season 1 Episode 11
Jan 24, 2017 20:08:25.908 [0xf3f54780] DEBUG - Looking for path match for [/volume1/video/Television/Heroes Reborn/Season 1/Heroes Reborn.S01E11.Send in the Clones.mp4]
Jan 24, 2017 20:08:25.909 [0xf3f54780] DEBUG - Skipping hash check, no size match for 1430653318 bytes.
Jan 24, 2017 20:08:25.909 [0xf3f54780] DEBUG - No match for hash.
Jan 24, 2017 20:08:25.913 [0xf3f54780] DEBUG - Checking descendants of Heroes Reborn
Jan 24, 2017 20:08:25.915 [0xf3f54780] DEBUG - → Searching down into Heroes Reborn/Season 0
Jan 24, 2017 20:08:25.915 [0xf3f54780] DEBUG - Checking descendants of Heroes Reborn/Season 0
Jan 24, 2017 20:08:25.921 [0xf3f54780] DEBUG - → FOUND metadata item (show)
Jan 24, 2017 20:08:25.921 [0xf3f54780] DEBUG - → We found a local media item with rooted metadata in Heroes Reborn/Season 0
Jan 24, 2017 20:08:25.921 [0xf3f54780] DEBUG - Found existing show 2212
Jan 24, 2017 20:08:25.925 [0xf3f54780] DEBUG - Downloading document http://127.0.0.1:32400/library/changestamp
Jan 24, 2017 20:08:25.926 [0xf3f54780] DEBUG - HTTP requesting GET http://127.0.0.1:32400/library/changestamp
Jan 24, 2017 20:08:25.931 [0xf3f54780] DEBUG - HTTP 200 response from GET http://127.0.0.1:32400/library/changestamp
Jan 24, 2017 20:08:25.934 [0xf3f54780] ERROR - SQLITE3:0x845bb84, 11, database corruption at line 73878 of [fc49f556e4]
Jan 24, 2017 20:08:25.934 [0xf3f54780] ERROR - SQLITE3:0x845bb84, 11, statement aborts at 115: [insert into metadata_items (library_section_id,parent_id,metadata_type,guid,hash,media_item_count,title,title_sort,original_title,studio,rating,audience_rating,rating_count,tagline,su
Jan 24, 2017 20:08:25.935 [0xf3f54780] ERROR - Exception inside transaction (inside=1) (…/Library/MetadataItem.cpp:869): sqlite3_statement_backend::loadOne: database disk image is malformed
Jan 24, 2017 20:08:25.935 [0xf3f54780] ERROR - Exception inside transaction (inside=1) (…/Library/Episode.cpp:187): sqlite3_statement_backend::loadOne: database disk image is malformed
Jan 24, 2017 20:08:25.935 [0xf3f54780] ERROR - Exception inside transaction (inside=1) (…/Library/MetadataItem.cpp:2682): sqlite3_statement_backend::loadOne: database disk image is malformed
Jan 24, 2017 20:08:25.935 [0xf3f54780] ERROR - Exception assimilating media item in Heroes Reborn/Season 1: sqlite3_statement_backend::loadOne: database disk image is malformed

Is there a database repair utility in PMS?

Here is the procedure: https://support.plex.tv/hc/en-us/articles/201100678-Repair-a-Corrupt-Database
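Before running the repair, it can help to confirm the corruption outside of PMS. Below is a minimal sketch in Python, assuming PMS is stopped and you are working on a copy of the library database. The function name `integrity_report` is made up for illustration, and you have to supply the path to your own copy. It only runs SQLite's built-in `PRAGMA integrity_check`; it does not repair anything.

```python
import sqlite3
from pathlib import Path

def integrity_report(db_path):
    """Run SQLite's integrity check on db_path and return the result lines."""
    # Open read-only so the check cannot modify the file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError as exc:
        # Badly damaged files can fail before the check produces any rows.
        return [f"error: {exc}"]
    finally:
        conn.close()
    return [row[0] for row in rows]
```

A healthy database returns a single line, `ok`. Anything else (or an error such as "database disk image is malformed") suggests following the repair procedure linked above.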

One thing I noticed is that when I put the file in the folder, Plex does not add it automatically; it only looks for the file when I force a scan.

I have created a parallel thread to this thread.

The purpose of the new thread is to address specific technical issues and problems that need work, as we discover them, while we continue primary research & data collection here.

It is here. https://forums.plex.tv/discussion/255585/tv-series-name-matching-research-technical-issues

I ask that we address & resolve technical issues, as they relate to this, in that thread, and keep our discovery process / new naming issue additions here.

Hi there, there’s an issue I posted about a year ago with no results, and if you’re still looking for data (I’m aware of how old this thread is) then here it is :).

The episode numbers of multi-part episodes in the “DVD Order” entries of some shows in TheTVDB have numbers after their decimal points. One example is here in the 3rd season of Seinfeld, episodes 15.1 and 15.2:
https://www.thetvdb.com/index.php?lid=7&seasonid=16114&seriesid=79169&tab=season&order=dvd

There seems to be no way of forcing Plex to scrape that data at all. Efforts are documented here:

If Plex could be updated to be able to read such numbers, it would be super appreciated :). The three shows I’ve come across where DVD order is the preferred order AND they have decimals like this are Seinfeld, Animaniacs, and PAW Patrol. For the sake of completeness I’ll also mention that I asked TheTVDB if they could re-number the episodes without decimals and this was their response:

“This is a problem you will have to take up with PLEX using decimals to combine episodes in the DVD order has long been a published part of our API”

@svensayshi

Using the different formats, SxE and SxxExx, please write up and show how those would work. If there are restrictions on which delimiters are allowed, please state so… I will submit it to Engineering as an RFE.

The simple solution would be for Plex to recognize a decimal point followed by a single digit as a legitimate part of an episode number, so that (as in the Seinfeld example) if someone enters “S03E15.1” or “3x15.1”, Plex simply attempts to match that number with TheTVDB (decimal included) as per the link:
https://www.thetvdb.com/index.php?lid=7&seasonid=16114&seriesid=79169&tab=season&order=dvd

If it’s relevant, this should only ever be needed for DVD order.

And, for the case where multiple episodes are contained in one file, as long as Plex can read “decimal+digit” as part of an episode number, one could use “S03E15.1-E15.2” to indicate a single file containing both episodes. Or the range “S01E01.1-E01.3” for example with Animaniacs, where three episodes are often contained in one file:
https://www.thetvdb.com/index.php?lid=7&seasonid=4929&seriesid=72879&tab=season&order=dvd
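The naming proposed above can be sketched as a regular expression. This is only an illustration of the suggestion, not how the Plex scanner works; the pattern and the function name `parse_dvd_episode` are hypothetical.

```python
import re

# Hypothetical pattern for the proposed naming: "SxxEyy" or "NxYY", with an
# optional ".d" part number and an optional "-Eyy.d" range end.
EPISODE = re.compile(
    r"(?:S(?P<s1>\d{1,2})E|(?<!\d)(?P<s2>\d{1,2})x)"  # season, either style
    r"(?P<first>\d{1,3}(?:\.\d(?!\d))?)"              # episode, optional .digit
    r"(?:-E?(?P<last>\d{1,3}(?:\.\d(?!\d))?))?",      # optional range end
    re.IGNORECASE,
)

def parse_dvd_episode(name):
    """Return (season, first_episode, last_episode), or None if nothing matches.

    Episode numbers are returned as strings so the decimal part survives.
    """
    m = EPISODE.search(name)
    if m is None:
        return None
    season = int(m.group("s1") or m.group("s2"))
    first = m.group("first")
    last = m.group("last") or first
    return season, first, last

print(parse_dvd_episode("Seinfeld - S03E15.1-E15.2.mkv"))
print(parse_dvd_episode("Animaniacs - S01E01.1-E01.3.mkv"))
```

With this sketch, a single-episode name such as “3x15.1” yields the same first and last episode, while a range keeps both ends so each part could be looked up separately.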

Please let me know if there’s any other information I can provide. Thank you for responding!

@svensayshi

How should Plex parse NCIS.S01E01.Pilot.mkv and how should it parse NCIS.S01E01.1.Probie.mkv ?

Do you see the conflict? The parser is an automaton. It has no intuition and cannot ‘reason out’ the name as humans do.

In that scenario I’d suggest that if it finds a period, it checks the next single character for a 0-9 digit and, if found, includes it in the episode number (possibly for DVD orders only, unless that’s too complex or not an option). While this might lead to dropped matches if the episode name begins with a digit, that’s not the approved file naming convention anyway and people could easily correct the problem by renaming their files, which is more than I can say for the problem we’re trying to solve here :).
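That heuristic, and the trade-off it carries, can be illustrated with a short sketch. The pattern is hypothetical and not taken from the Plex scanner. The extra rule that the digit must not be followed by another digit is an added assumption that goes slightly beyond the suggestion; it keeps a title like “1153 to Odessa” from being misread.

```python
import re

# Hypothetical: episode number, then optionally ".d" where d is a single
# digit that is NOT followed by another digit.
PATTERN = re.compile(r"S(\d{1,2})E(\d{1,3}(?:\.\d(?!\d))?)", re.IGNORECASE)

def episode_token(name):
    """Return the episode token (e.g. '01' or '01.1'), or None if no match."""
    m = PATTERN.search(name)
    return m.group(2) if m else None

for name in (
    "NCIS.S01E01.Pilot.mkv",
    "NCIS.S01E01.1.Probie.mkv",
    "Heroes Reborn.S01E10.1153 to Odessa.mp4",
    "Show.S01E05.7 Days Later.mkv",  # made-up title that starts with one digit
):
    print(name, "->", episode_token(name))
```

The first three names come out as 01, 01.1 and 10. The last, made-up name comes out as 05.7 even though the 7 belongs to the title, which is exactly the dropped-match case described above.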

What needs to happen is a sequence of regular expressions (regex) which lead to a definitive result. Working out the state change rules is the trick: at what point does state 1 no longer apply and state 2 engage? If you’re familiar with FSMs, it’s the same principle.

Yeah I understand what you mean. Maybe a symbol before the episode number to flag the state change. For example S03E’15.1’ with apostrophes before and after the episode number. So if ever the parser comes across SXXE followed by an apostrophe, it can capture the full text until the next apostrophe and use that as the episode number to scrape TheTVDB. This would cause no conflicts with already existing filename conventions.
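A rough sketch of the apostrophe idea, assuming straight apostrophes in the filename and a hypothetical function name `parse`. The quoted form is tried first and the plain SxxEyy form is the fallback, so existing names keep their current meaning.

```python
import re

# Hypothetical syntax: S03E'15.1', where the quoted text is taken verbatim.
QUOTED = re.compile(r"S(\d{1,2})E'([^']+)'", re.IGNORECASE)
# Plain form, as used by existing filenames.
PLAIN = re.compile(r"S(\d{1,2})E(\d{1,3})", re.IGNORECASE)

def parse(name):
    """Return (season, episode_text), preferring the quoted form."""
    m = QUOTED.search(name) or PLAIN.search(name)
    if m is None:
        return None
    return int(m.group(1)), m.group(2)

print(parse("Seinfeld - S03E'15.1'.mkv"))
print(parse("NCIS.S01E01.1.Probie.mkv"))
```

Here the quoted name yields the full “15.1”, while the unquoted NCIS name still yields “01”, so nothing changes for files that do not opt in.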

It currently breaks roughly ^.*[\.\-\ ]S[0-9]{1,2}E[0-9]{1,3}[\.\-\ ].*$

You can see what happens when “15.1 - (literal string)” is encountered: it breaks down. Now try “Star Trek 1x24.1 something”?
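To make the failure concrete, here is the approximate pattern quoted above run against a few names. The capture groups and the function name `current_parse` are added for illustration; the real scanner may well differ.

```python
import re

# The approximate pattern from the post, with capture groups added.
CURRENT = re.compile(r"^.*[.\- ]S([0-9]{1,2})E([0-9]{1,3})[.\- ].*$")

def current_parse(name):
    """Return (season, episode) as the approximate pattern sees them, or None."""
    m = CURRENT.match(name)
    return (int(m.group(1)), int(m.group(2))) if m else None

for name in (
    "NCIS.S01E01.Pilot.mkv",
    "Seinfeld.S03E15.1.Title.mkv",
    "Seinfeld.S03E15.2.Title.mkv",
    "Star Trek 1x24.1 something",
):
    print(name, "->", current_parse(name))
```

Both Seinfeld names come back as season 3, episode 15: the period after “15” is taken as the delimiter and the decimal part is silently swallowed by the trailing wildcard. The “1x24.1” name does not match this particular pattern at all.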

I foresee regex parsing breaking down, and I foresee a full FSM taking over. That’s a big step and would likely slow it down if not implemented perfectly.

Yeah I can generally grasp the problem there although the specifics went over my head. If there is a solution that wouldn’t bog down the code terribly and that you and your team would be willing to implement, it would make us Seinfeld watchers (among others) very happy.

I can tell you now, it won’t be done for just one series/interest. There is a need.

I know there is a DVD-order agent somewhere, but I don’t know where to find it. It might be in the Plug-ins / Channels forum. The combination of FileBot naming in DVD order and that agent (when added to PMS and selected) does work. I worked with those who like Firefly to get PMS to work with that agent as well as to get the agent updated.

Perhaps that is the path of least resistance to try first?