Script to regenerate video previews multi threaded?

This is fun!

That pull request compresses the JPEGs twice, and Pillow’s optimize isn’t amazing. Two suggestions to consider:

  • Use ffmpeg to extract lossless BMP or PNG files, and then use MozJPEG to compress them to jpeg.

  • Or just lower the quality ffmpeg uses during extraction, and use MozJPEG (in lossless mode) to optimize them.
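The first option above can be sketched in Python as two subprocess command builders — one lossless extraction pass with ffmpeg, one single lossy compression pass with MozJPEG’s `cjpeg`. The helper names and the 5-second interval are illustrative, not from the script:

```python
import os

def extract_cmd(video, out_dir, interval=5):
    """Build an ffmpeg command that extracts one lossless PNG every
    `interval` seconds, so no lossy JPEG is generated at this stage."""
    pattern = os.path.join(out_dir, "img-%06d.png")
    return ["ffmpeg", "-i", video,
            "-vf", f"fps=1/{interval}",
            "-s", "320x240",
            pattern]

def mozjpeg_cmd(png, jpg, quality=75):
    """Build a MozJPEG cjpeg command for the single lossy step.
    MozJPEG's cjpeg can read PNG input directly."""
    return ["cjpeg", "-quality", str(quality), "-outfile", jpg, png]
```

Each list can then be handed to `subprocess.run(cmd, check=True)`, so the only lossy compression in the whole pipeline is the final `cjpeg` pass.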

Ooooooooooooooooooooh :heart_eyes:

New Issue: Files with a “+” character in the path (as the recommended naming convention for multi-part TV episodes) fail.

Who recommended that?
Ref: https://support.plex.tv/articles/naming-and-organizing-your-tv-show-files/#toc-4

1 Like

Correction: That is how Sonarr defaults to naming multi-part episodes, which I inaccurately thought also aligned with Plex’s recommendations. I’ll figure out how to fix that outside of Plex and the script.

1 Like

You’re right though - if failures occur due to any special characters, filename handling can probably be improved.

1 Like

thanks!! left some comments on the PR

I have played around a little and given this some thought. I agree that multiple compression passes introduce unnecessary generational loss and waste CPU cycles. There is also the reality that we are dealing with single 320x240 frames: you can’t take advantage of the compression techniques seen in video, and it is a small number of pixels. Any gains will be comparatively small, especially weighed against the time it would take to code all of the various optimizations. For those of us with massive libraries the gains do extrapolate out, but I’m personally still not very motivated to spend hours and hours coding to save a megabyte per movie. As shown below, changing the quality factor from the original setting of 3 down to 5 saved 2.5M for a 1:49 movie. A typical one-hour TV episode, with about 40 minutes of runtime, might save 1M by drastically lowering the quality.

Thus, I propose that the quickest, easiest solution is to make ffmpeg’s quality factor a variable and allow the end user to set it as they choose, with a recommended range such as 2 - 6. That is a two-line change and addresses all concerns.
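The proposal amounts to something like the sketch below. The variable name `THUMBNAIL_QUALITY` and the env-var configuration style are assumptions; the script’s actual config section may differ:

```python
import os

# Hypothetical config variable; suggested range is 2 (best) to 6 (smallest).
THUMBNAIL_QUALITY = int(os.environ.get("THUMBNAIL_QUALITY", "4"))

def ffmpeg_args(video, out_pattern, interval=5):
    """Build the ffmpeg argument list for thumbnail extraction.
    -qscale:v is ffmpeg's JPEG quality factor: lower numbers mean
    higher quality and larger files."""
    return ["ffmpeg", "-i", video,
            "-vf", f"fps=1/{interval}",
            "-qscale:v", str(THUMBNAIL_QUALITY),
            out_pattern]
```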

Using a BIF frame interval of 5 seconds, an average movie resulted in the following bif file size for the corresponding quality factors:

q2 - 12M
q3 - 9.3M
q4 - 7.7M
q5 - 6.8M
q6 - 6.1M

(Side-by-side screenshots of the same frame, img-000065, at each quality factor from q2 through q6 were attached here.)

Just like video encoding settings, the “correct” answer is in the eye of the beholder.

1 Like

:+1: Fastest and easiest. I agree it’s not a place where fidelity matters much; some clients display it as a postage stamp.

FPS is the other primary knob for file size. On Roku the minimum interval between frames seems to be 10 seconds anyway. Apple TV’s minimum is smaller (but 10 feels OK). I don’t know what Android’s minimum is.

Smaller BIF files do load faster on remote clients.

Agree. Frame interval is where the real space savings can be made. I’ll let @stevezau chime in before I make any more commits.

2 Likes

Thanks @jasonsansone, I’ve merged the PR

3 Likes

Thank you for merging the PR. I am submitting a new PR which should:

  • Display unconfigured-variable errors on screen in addition to logging them.
  • Fix a regression in HDR detection.
  • Convert the ffmpeg quality setting to a user-configurable variable.

Turns out the problem wasn’t special characters; the issue was the multi-part episode. The script actually handles special characters, including the plus sign ("+"), just fine.

1 Like

I’ve never looked, how does that work?

Does each file get an index-sd.bif? Or each episode+part tuple?

@dane22 is the expert, and odds are high I will misspeak. However, it appears to me that every episode with metadata from tvdb receives its own db entry with corresponding thumbnails and metadata, even though those db entries may reference one media file. There is only one bundle hash and only one index-sd.bif per file. If you run the script with only one thread you won’t encounter issues, because the index will be created on the first episode’s pass; on the next episode the script will detect that the index exists and skip it. Problems arise with multithreading because the script works on the same media file, with the same working temp folder, simultaneously. I added a commit which I believe fixes the issue. I tested the current version on a few thousand files of various codecs, resolutions, etc. without errors.
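The race described above can be avoided by collapsing the work list to unique media files before handing it to the thread pool. This is a minimal sketch of that idea, not the script’s actual fix; `process_media_parts` and `worker` are hypothetical names:

```python
from concurrent.futures import ThreadPoolExecutor

def process_media_parts(parts, worker, threads=4):
    """Multi-part episodes share one media file (and one bundle hash),
    so deduplicate the work list before dispatching; otherwise two
    threads can operate on the same file's temp folder at once."""
    seen = set()
    unique = []
    for path in parts:
        if path not in seen:
            seen.add(path)
            unique.append(path)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() preserves input order, one result per unique file
        return list(pool.map(worker, unique))
```

Deduplicating up front keeps the per-file work single-threaded with respect to itself while still letting different files run in parallel.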

1 Like

No, that answers my question!


This isn’t new, but I didn’t notice it before. This looks like a foot-gun to me.

        shutil.rmtree(TMP_FOLDER)   

Consider the user who sets TMP_FOLDER = '/tmp'.

Yes, it bit me the first time I used the script; I unknowingly nuked a folder. The new default is a dedicated subfolder in tmpfs. Considering how small the memory overhead is, I would suggest making the working directory a variable that isn’t in the recommended user-configurable section, to avoid foot-gunning.

I’m suggesting that the script shouldn’t delete any directories it didn’t create - is that what you’re saying?

(Aside: /dev/shm is such a weird Linuxism.)

I didn’t write the original code, but I understand the logic. It creates a working folder and then cleans up after itself. It also cleans up on start, because a crashed or killed script won’t have cleaned up, and a dirty working directory may contain bundle folders from failed prior executions which can interfere. I think a simpler, safer fix is to let the user designate a base folder but always create a sub-folder in that path, as opposed to letting them define the entire working directory.

1 Like

Correct

This is where we come to the fact that I haven’t tested this, since the info in the database is stored under the part item ( indexes="sd" ).

So I actually have no idea whether PMS will create one big index-sd.bif file for all the parts, or whether only the last part survives.

And since you, @jasonsansone, apparently have the full setup to test that, do let me know if it’s not a one-size-fits-all bif file; if it isn’t, I’ll set up a test case, duplicate it, and report it as a bug.

1 Like

I haven’t fired up SQLite3 to poke around the database, so everything here is anecdotal or externally observed. From my experience, Plex always treats one file in the database as one episode until it’s watched, regardless of whether the file spans multiple episodes. For example, when watching a cartoon which may contain multiple segments in one file (e.g. Paw Patrol), Plex will mark all episodes as played once the file is completely watched, but only indicates progress on the first episode while the file is partially watched. The first episode may be completed and the user viewing the second, but Plex has no way of knowing where that transition actually occurs inside the media file.

Regarding indexing, ffmpeg is fed the entire media file and processes it at full length; it’s one bif for the file. The episode parts are irrelevant, and only an “issue” when this script processed the file more than once. There is no issue in Plex: the bundle is complete, and indexing works in Plex for the full length of the media file. I don’t know whether Plex uses a single pointer tied to the file or references the index in the db more than once, but everything is fine with the new commit I made to the script.

1 Like