Plex crashing multiple times per night during peak usage

I’ve been running my Plex server without issue for many years. I’m running it on enterprise-level hardware (EPYC 7443P, NVIDIA A4000 for hardware acceleration, and a RAID10 pool of Optane 905p drives for the Plex appdata). All of a sudden, over the past week or so, my Plex server has been crashing during peak usage (9-10pm). Sometimes it recovers on its own after a few minutes; sometimes I have to restart the Docker container to bring it back. Please see the attached logs.

Server Version#: 1.40.2.8395

Plex Crash Uploader.1.log (376 Bytes)
Plex Crash Uploader.log (534 Bytes)
Plex Media Server.3.log (4.2 MB)

Debug level logs are needed.

  1. Configure Plex for debug-level (not verbose) logging.
    Settings → Server_Name → General

  2. Restart Plex Media Server (for Docker, see the note after this list).

  3. After the next incident:

  • Restart PMS
  • Wait 2-3 minutes for it to fully start up.
  • Pull log files (Settings → Troubleshooting).
  • Attach the entire ZIP file to the thread.
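
Since you’re running PMS in a Docker container, the restart in step 2 is typically a one-liner (container name assumed; substitute your own):

$ docker restart plex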

Will do. Thank you!

There are a lot of errors in the Plex Media Server.3.log file you attached. The slow query warnings are also of interest. The debug level will provide additional details.

When you restart PMS, it rotates the log files: Plex Media Server.log → .1.log, .1.log → .2.log, etc. (.5.log → bit bucket).

So, after a restart, Plex Media Server.log will have the startup sequence, to show if there are any issues when PMS starts. The bottom of Plex Media Server.1.log will show what was happening when it crashed.
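
After the next crash-and-restart, the quickest place to look is therefore the tail of the rotated log. A sketch, assuming the default in-container path (adjust if your appdata is mounted elsewhere):

$ tail -n 200 "/config/Library/Application Support/Plex Media Server/Logs/Plex Media Server.1.log"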

Verbose level logs are rarely needed. Enabling verbose level adds a lot of information to the log files. It makes them more difficult to read. It also causes them to wrap much faster, possibly losing desired information.

I was concerned about the slow query warnings as well. I’ve run the DBRepair utility multiple times over the past few weeks, but that does not seem to have resolved the issue. Hopefully I’ll get more detailed info with the debug logs from the next crash.

Have you tried conducting an I/O test on your drive?

On the drive that is housing my Plex appdata, you mean?

If I may add?

  1. Perform a read test of the device holding the Plex data (not the OS):
    dd if=/dev/sdXX of=/dev/null bs=4M count=20000 status=progress
    (edit ‘XX’ appropriately)

  2. DBRepair will put the DB in perfect contiguous and sorted order.
    If there are still a lot of “SLOW QUERY” statements, then:
    – The device’s performance might be the issue.
    – The number of “metadata_items” stored in com.plexapp.plugins.library.db could indeed be extreme.

  3. If the SLOW QUERY happens while this type of audio is transcoding… then there are other issues:
    – Choice one is to reload the codecs.
    – Choice two is crappy media.
    – Choice three is the hardware.

  4. This output tells me: Damaged Codecs or Bad Media. (A way to tell the two apart follows the log excerpt below.)

Apr 25, 2024 04:16:03.548 [140710532324152] ERROR - [Req#925342/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] Error while decoding stream #0:1: Invalid data found when processing input
Apr 25, 2024 04:16:03.548 [140711167306552] ERROR - [Req#925343/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] [aac @ 0x7f0e87313ac0] Number of bands (67) exceeds limit (42).
Apr 25, 2024 04:16:03.548 [140711078337336] ERROR - [Req#925344/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] Error while decoding stream #0:1: Invalid data found when processing input
Apr 25, 2024 04:16:03.549 [140710532324152] ERROR - [Req#925345/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] [aac @ 0x7f0e87313ac0] Pulse data corrupt or invalid.
Apr 25, 2024 04:16:03.549 [140711167306552] ERROR - [Req#925346/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] Error while decoding stream #0:1: Invalid data found when processing input
Apr 25, 2024 04:16:03.549 [140711078337336] ERROR - [Req#925347/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] [aac @ 0x7f0e87313ac0] Pulse data corrupt or invalid.
Apr 25, 2024 04:16:03.549 [140710532324152] ERROR - [Req#925348/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] Error while decoding stream #0:1: Invalid data found when processing input
Apr 25, 2024 04:16:03.549 [140711167306552] ERROR - [Req#925349/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] [aac @ 0x7f0e87313ac0] Number of bands (61) exceeds limit (42).
Apr 25, 2024 04:16:03.550 [140711078337336] ERROR - [Req#92534a/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] Error while decoding stream #0:1: Invalid data found when processing input
Apr 25, 2024 04:16:03.550 [140710532324152] ERROR - [Req#92534b/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] [aac @ 0x7f0e87313ac0] TNS filter order 18 is greater than maximum 12.
Apr 25, 2024 04:16:03.550 [140711167306552] ERROR - [Req#92534c/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] Error while decoding stream #0:1: Invalid data found when processing input
Apr 25, 2024 04:16:03.550 [140711078337336] ERROR - [Req#92534d/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] [aac @ 0x7f0e87313ac0] Number of bands (15) exceeds limit (12).
Apr 25, 2024 04:16:03.551 [140710532324152] ERROR - [Req#92534e/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] Error while decoding stream #0:1: Invalid data found when processing input
Apr 25, 2024 04:16:03.551 [140711167306552] ERROR - [Req#92534f/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] [aac @ 0x7f0e87313ac0] Number of bands (52) exceeds limit (42).
Apr 25, 2024 04:16:03.551 [140711078337336] ERROR - [Req#925350/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] Error while decoding stream #0:1: Invalid data found when processing input
Apr 25, 2024 04:16:03.551 [140710532324152] ERROR - [Req#925351/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] [aac @ 0x7f0e87313ac0] Number of bands (65) exceeds limit (42).
Apr 25, 2024 04:16:03.551 [140711167306552] ERROR - [Req#925352/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] Error while decoding stream #0:1: Invalid data found when processing input
Apr 25, 2024 04:16:03.552 [140711078337336] ERROR - [Req#925353/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] [aac @ 0x7f0e87313ac0] TNS filter order 26 is greater than maximum 12.
Apr 25, 2024 04:16:03.552 [140710532324152] ERROR - [Req#925354/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] Error while decoding stream #0:1: Invalid data found when processing input
Apr 25, 2024 04:16:03.553 [140711167306552] ERROR - [Req#925355/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] [aac @ 0x7f0e87313ac0] Number of bands (46) exceeds limit (42).
Apr 25, 2024 04:16:03.553 [140711078337336] ERROR - [Req#925356/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] Error while decoding stream #0:1: Invalid data found when processing input
Apr 25, 2024 04:16:03.553 [140710532324152] ERROR - [Req#925357/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] [aac @ 0x7f0e87313ac0] Number of bands (44) exceeds limit (42).
Apr 25, 2024 04:16:03.554 [140711167306552] ERROR - [Req#925358/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] Error while decoding stream #0:1: Invalid data found when processing input
Apr 25, 2024 04:16:03.554 [140711078337336] ERROR - [Req#925359/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] [aac @ 0x7f0e87313ac0] Pulse data corrupt or invalid.
Apr 25, 2024 04:16:03.554 [140710532324152] ERROR - [Req#92535a/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] Error while decoding stream #0:1: Invalid data found when processing input
Apr 25, 2024 04:16:03.554 [140711167306552] ERROR - [Req#92535b/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] [aac @ 0x7f0e87313ac0] Number of bands (16) exceeds limit (12).
Apr 25, 2024 04:16:03.554 [140711078337336] ERROR - [Req#92535c/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] Error while decoding stream #0:1: Invalid data found when processing input
Apr 25, 2024 04:16:03.555 [140710532324152] ERROR - [Req#92535d/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] [aac @ 0x7f0e87313ac0] Number of bands (51) exceeds limit (42).
Apr 25, 2024 04:16:03.555 [140711167306552] ERROR - [Req#92535e/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] Error while decoding stream #0:1: Invalid data found when processing input
Apr 25, 2024 04:16:03.555 [140711078337336] ERROR - [Req#92535f/Transcode/5a03277d-c6cf-445d-9193-c7dd3b01b5f5/6975d362-89ab-4f97-b03c-38db28adf486] [aac @ 0x7f0e87313ac0] Number of bands (45) exceeds limit (42).
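
One way to separate bad media from damaged codecs is to decode the suspect audio stream outside Plex. A minimal sketch with a system ffmpeg (the file path is a placeholder; stream #0:1 is the audio stream named in the log):

$ ffmpeg -v error -i /path/to/suspect-file.mkv -map 0:1 -f null -

If this also spews decode errors, the media itself is damaged; if it decodes cleanly, suspect the Codecs directory.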

As I said in my OP, I’m running my Plex appdata on a RAID10 zpool of 4 x Intel Optane 905p NVMe drives. Here is the performance of each of them individually. Reads should be roughly 4 times the speed of one of these tests, given it’s a RAID10 pool.

# dd if=/dev/nvme1n1 of=/dev/null bs=4M count=20000 status=progress
81713430528 bytes (82 GB, 76 GiB) copied, 37 s, 2.2 GB/s
20000+0 records in
20000+0 records out
83886080000 bytes (84 GB, 78 GiB) copied, 37.985 s, 2.2 GB/s
# dd if=/dev/nvme2n1 of=/dev/null bs=4M count=20000 status=progress
81763762176 bytes (82 GB, 76 GiB) copied, 37 s, 2.2 GB/s
20000+0 records in
20000+0 records out
83886080000 bytes (84 GB, 78 GiB) copied, 37.9895 s, 2.2 GB/s
# dd if=/dev/nvme3n1 of=/dev/null bs=4M count=20000 status=progress
81705041920 bytes (82 GB, 76 GiB) copied, 37 s, 2.2 GB/s
20000+0 records in
20000+0 records out
83886080000 bytes (84 GB, 78 GiB) copied, 37.9899 s, 2.2 GB/s
# dd if=/dev/nvme6n1 of=/dev/null bs=4M count=20000 status=progress
81939922944 bytes (82 GB, 76 GiB) copied, 37 s, 2.2 GB/s
20000+0 records in
20000+0 records out
83886080000 bytes (84 GB, 78 GiB) copied, 37.8899 s, 2.2 GB/s
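
ZFS doesn’t expose a raw block device for the pool itself, so to test the pool rather than the member drives I’d read a large file from the dataset instead (the file name is a placeholder, and I’m assuming the dataset is mounted at /optane/appdata; note the ARC will cache it, so only the first run is meaningful):

$ dd if=/optane/appdata/some-large-file of=/dev/null bs=4M status=progress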

How do I determine how many metadata items are in com.plexapp.plugins.library.db, and what count is considered extreme?

How do I reload the codecs? Simply delete the Codecs directory and restart Plex? I can’t see why it would be my media. Most of my media is high quality; the majority are remuxes.

  1. To obtain the count of total indexed media items:
    – cd into the ‘Databases’ directory
    sqlite3 com.plexapp.plugins.library.db
    select count(*) from media_items;
    select count(*) from metadata_items;

Looks like this:

[chuck@glockner Plex Media Server.2000]$ cd Plug-in\ Support/Databases/
[chuck@glockner Databases.2001]$ sqlite3 com.plexapp.plugins.library.db
SQLite version 3.37.2 2022-01-06 13:25:41
Enter ".help" for usage hints.
sqlite> select count(*) from media_items;
177768
sqlite> select count(*) from metadata_items;
162917
sqlite> 

Add the two results together for the total number of records; media_items represents actual pieces of media.
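
If you’d rather skip the interactive shell, the same counts can be pulled in one line from the Databases directory:

$ sqlite3 com.plexapp.plugins.library.db 'select count(*) from media_items; select count(*) from metadata_items;'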

To reload the codecs:

  1. Stop Plex
  2. Get into the ‘Codecs’ directory under “Plex Media Server”
  3. Delete all files & directories EXCEPT the .device_id file. This is your license. (See the sketch after this list.)
  4. Start Plex
  5. Each codec will download fresh the first time it’s needed.
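
In a shell, step 3 might look like this (the path is an assumption for a Docker install; adjust to your mount):

$ cd "/config/Library/Application Support/Plex Media Server/Codecs"
$ find . -mindepth 1 ! -name '.device_id' -delete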

May I see your logs from DBRepair? They will contain statistics.

# sqlite3 com.plexapp.plugins.library.db
SQLite version 3.40.1 2022-12-28 14:03:47
Enter ".help" for usage hints.
sqlite> select count(*) from media_items;
146814
sqlite> select count(*) from metadata_items;
99206

This is the only log I see. Doesn’t contain much.
DBRepair.log (1.3 KB)

Do you delete it all the time?

It’s supposed to accumulate information.

This is telling me you’ve got a fair amount of media, but not an excessive amount (judging by the time the checks required).

2024-04-21 20.06.14 - Auto    - START
2024-04-21 20.06.32 - Check   - Check com.plexapp.plugins.library.db - PASS
2024-04-21 20.06.38 - Check   - Check com.plexapp.plugins.library.blobs.db - PASS
2024-04-21 20.06.38 - Check   - PASS

If you’re using ZFS, there is a blocksize / recordsize for your filesystem.
You can use DBRepair to make the DB use that same recordsize.
This will improve ZFS performance.

I don’t delete it all the time, but I think I did clean it up last week. How do I use DBRepair to match the record size of my pool?

In the README file, I document the DBREPAIR_PAGESIZE variable.

Simply,

  1. Get the recordsize for the dataset/filesystem
  2. export DBREPAIR_PAGESIZE=xxxxxx (must be a power of 2 up to 65536)
  3. Invoke DBRepair
  4. Run the “auto” optimize.
    – It will optimize again, creating the new DB file with the given page size.

From the README –

[2024-01-14 17.25.48] Exporting current databases using timestamp: 2024-01-14_17.25.35
[2024-01-14 17.25.48] Exporting Main DB
[2024-01-14 17.25.59] Exporting Blobs DB
[2024-01-14 17.26.00] Successfully exported the main and blobs databases.  Proceeding to import into new databases.
[2024-01-14 17.26.00] Importing Main DB.
[2024-01-14 17.26.00] Setting Plex SQLite page size (65536)
[2024-01-14 17.26.29] Importing Blobs DB.
[2024-01-14 17.26.29] Setting Plex SQLite page size (65536)
[2024-01-14 17.26.30] Successfully imported databases.
[2024-01-14 17.26.30] Verifying databases integrity after importing.
[2024-01-14 17.27.43] Verification complete.  PMS main database is OK.
[2024-01-14 17.27.43] Verification complete.  PMS blobs database is OK.
[2024-01-14 17.27.43] Saving current databases with '-BACKUP-2024-01-14_17.25.35'
[2024-01-14 17.27.43] Making repaired databases active
[2024-01-14 17.27.43] Repair complete. Please check your library settings and contents for completeness.
[2024-01-14 17.27.43] Recommend:  Scan Files and Refresh all metadata for each library section.
[2024-01-14 17.27.43]

These DBs now have a page size (record size) of 65536.
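
You can verify the result afterward with a quick pragma from the Databases directory:

$ sqlite3 com.plexapp.plugins.library.db 'PRAGMA page_size;'
65536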

Does the export DBREPAIR_PAGESIZE command work inside Docker?

My record size is 128k, so you’re saying I can only go up to 64k?

# zfs get recordsize optane/appdata
NAME            PROPERTY    VALUE    SOURCE
optane/appdata  recordsize  128K     default

Yes:

  1. You can set environment variables when the container starts.
  2. You can use the ‘export’ command inside the container.
  3. You can set the variable right before invocation, e.g.
$ export DBREPAIR_PAGESIZE=xxxxx
$ ./DBRepair.sh stop auto start exit
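
From the host, options 2 and 3 can be combined into a single command (the container name and script location are assumptions; adjust to wherever you keep DBRepair.sh):

$ docker exec -it plex bash -c 'export DBREPAIR_PAGESIZE=65536 && /path/to/DBRepair.sh stop auto start exit'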

Ok great, will do this shortly. So if my recordsize is 128K, I should use a value of 65536?