I’m reaching out regarding a recurring issue with my Plex Media Server and would appreciate some assistance in troubleshooting and resolving it.
Issue Description:
In the Plex Media Server logs, I’m consistently seeing warnings indicating that a transaction is being held for too long. An example of the warning message is as follows:
May 21, 2024 19:05:39.317 [22405911681848] WARN - Held transaction for too long (/home/runner/actions-runner/_work/plex-media-server/plex-media-server/Statistics/StatisticsManager.cpp:294): 0.290000 seconds
This continually increases and after a few days, the transaction hold time escalates to several seconds.
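For context, here’s roughly how I’ve been pulling the hold times out of the logs to watch the trend (the log path is the default Linux location and may differ on other installs):

# Extract the timestamp and reported hold time from each warning (path is an example).
cd "/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Logs"
grep -h "Held transaction for too long" "Plex Media Server"*.log \
  | awk '{print $1, $2, $3, $4, $(NF-1)}'   # date, time, seconds held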
Despite optimizing the Plex database and ensuring adequate server resources, including ample RAM and processing power, I’m still encountering this warning. I suspect it may be related to the size of my media library and the ZFS configuration.
The configuration you have is not at all efficient.
Record size = 16 KB is 8x SLOWER than the ZFS default. The ZFS developers selected 128 KB for a reason. The ‘cost’ (overhead) of a ZFS file-system read is fixed, so by forcing it to make 8x as many reads for the same amount of data, you’re forcing 8x the overhead.
– This is like getting a gallon of water a tablespoon at a time versus a cup at a time.
Database Cache Size is where you’re trying to compensate, by loading 9 GB of the DB into memory. At your quantities, the entire DB is already loaded into memory; you can still only access it 16 KB at a time. That’s the problem.
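You can confirm what the dataset is currently set to with something like the following (“tank/plex” is a placeholder; substitute your own pool/dataset name):

# Show the record size and related properties of the Plex dataset (example name).
zfs get recordsize,compression,primarycache tank/plex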
I don’t know how easy it will be for you to reconfigure, but you need to (a rough command sketch follows these steps):
Create a new dataset at the default 128KB record size
Transfer all of the Plex data to that new dataset.
Start Plex,
– RESET the Database Cache Size back to default.
– SAVE
– Restart Plex one more time.
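Roughly, the migration looks like this. It’s only a sketch: the pool/dataset names, paths, and service name are placeholders, so adapt them to your layout and stop Plex before copying.

# Hedged sketch of the dataset migration; names and paths are examples only.
zfs create -o recordsize=128K tank/plex-new      # new dataset at the ZFS default record size
systemctl stop plexmediaserver                   # stop Plex before moving its data
rsync -aHAX /tank/plex/ /tank/plex-new/          # copy everything, preserving permissions and xattrs
# Repoint Plex (or its container bind mount) at the new dataset, then:
systemctl start plexmediaserver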
What this does for you:
Disk I/O efficiency will be increased 8-fold (you’re no longer reading partial records 16 KB at a time). You’ll get far better use of the RAM cache and the physical disk cache (physical disk reads are expensive in terms of time).
ZFS is Copy-On-Write based. It’s already optimized for 128 KB.
Setting the Plex database page size to 65536 makes all physical I/O to the DB a multiple of the page size (exactly half of a 128 KB record, and 65536 is the SQLite maximum page size), which is a huge performance boost on ZFS.
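A quick way to check that the page size actually took (run it with Plex stopped; the path below is the typical Linux location and may differ on your install):

# Read the SQLite page size of the main Plex library database (path is an example).
cd "/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Plug-in Support/Databases"
sqlite3 com.plexapp.plugins.library.db "PRAGMA page_size;"   # should print 65536 after the optimize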
I still see the “Held transaction for too long” warning in the logs. I haven’t had more than one user on. I’ll attach the logs again. Everything seems to load quickly; not sure why this keeps increasing.
root@OfficePlex:~# lsblk
NAME          MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
loop0           7:0    0  63.4M  1 loop /lib
loop1           7:1    0 338.8M  1 loop /usr
loop2           7:2    0  1000G  0 loop /var/lib/docker/btrfs
                                        /var/lib/docker
sda             8:0    1  57.7G  0 disk
└─sda1          8:1    1  57.7G  0 part /boot
sdb             8:16   1 119.5G  0 disk
└─sdb1          8:17   1 119.5G  0 part
sdc             8:32   0   1.8T  0 disk
└─sdc1          8:33   0   1.8T  0 part
sdd             8:48   0   1.8T  0 disk
└─sdd1          8:49   0   1.8T  0 part
md1p1           9:1    0 119.5G  0 md   /mnt/disk1
nvme0n1       259:0    0   1.8T  0 disk
└─nvme0n1p1   259:1    0   1.8T  0 part
nvme2n1       259:2    0   1.8T  0 disk
nvme1n1       259:3    0   1.8T  0 disk
└─nvme1n1p1   259:4    0   1.8T  0 part
root@OfficePlex:~# lshw -short -C disk
H/W path         Device        Class  Description
/0/100/6/0/0     hwmon1        disk   NVMe disk
/0/100/6/0/2     /dev/ng0n1    disk   NVMe disk
/0/100/6/0/1     /dev/nvme0n1  disk   2TB NVMe disk
/0/100/17/0      /dev/sdc      disk   2TB Samsung SSD 870
/0/100/17/1      /dev/sdd      disk   2TB Samsung SSD 870
/0/100/1b/0/0    hwmon3        disk   NVMe disk
/0/100/1b/0/2    /dev/ng1n1    disk   NVMe disk
/0/100/1b/0/1    /dev/nvme1n1  disk   2TB NVMe disk
/0/100/1d.4/0/0  hwmon2        disk   NVMe disk
/0/100/1d.4/0/2  /dev/ng2n1    disk   NVMe disk
/0/100/1d.4/0/1  /dev/nvme2n1  disk   2TB NVMe disk
/0/8/0.0.0       /dev/sda      disk   61GB Cruzer Fit
/0/8/0.0.0/0     /dev/sda      disk   61GB
/0/9/0.0.0       /dev/sdb      disk   128GB Flash Drive FIT
/0/9/0.0.0/0     /dev/sdb      disk   128GB
I don’t use ZFS for anything with live data (PMS & databases count as live data to me)
At this point, because ZFS compresses on the fly (one record at a time), the time required to (de)compress each record adds up every time it reads from or writes to the DBs.
Therefore, I recommend an experiment.
Disable compression for the Plex metadata dataset (a command sketch follows these steps).
Go back into DBRepair and, with the page size again set to 65536, optimize the database again so it’s no longer stored compressed.
Now run it…
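For the compression part, something like this would do it (“tank/plex” is a placeholder; substitute your dataset). Note that turning compression off only affects newly written records, which is why the DBRepair optimize pass afterwards matters.

# Disable compression on the Plex dataset going forward (example dataset name).
zfs set compression=off tank/plex
zfs get compression,compressratio tank/plex   # existing data stays compressed until rewritten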
I would like to see a full set of logs next (after it runs a while / gets to the point of actual failure).
Failure will be when it shows “SLOW QUERY” in the server logs, followed by a timeout (HTTP 408).
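Something like this will show whether you’ve reached that point (the log location below is the usual Linux default; adjust as needed):

# Look for slow-query warnings in the server logs; the 408 timeouts show up nearby.
cd "/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Logs"
grep -h "SLOW QUERY" "Plex Media Server"*.log | tail -n 20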
I removed compression on the dataset and ran DBRepair again. It was still showing the same signs.
I have created another instance of Plex under a new account, on a 128 KB-recordsize dataset and with page size 65536 right from the start. It’s not showing the same behavior of the StatisticsManager held time progressively increasing.
What would happen if I swap Preferences.xml to the new Plex container? And how risky is transferring watch history?
You created a whole new instance, reloading the media & metadata, and it’s not misbehaving?
If this is true, you can move watch history in either of two ways.
If you move the Preferences.xml, you’d be moving a few of the preferences and, most importantly, the server ID information. The new server would completely assume the identity the old one had.
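If you do go that route, it’s just a file copy with both servers stopped. A minimal sketch, assuming default Linux paths (the new instance’s location is a placeholder):

# Hypothetical sketch; stop both servers first and adjust paths/ownership to your setup.
OLD="/var/lib/plexmediaserver/Library/Application Support/Plex Media Server"
NEW="/path/to/new-instance/Library/Application Support/Plex Media Server"   # placeholder
cp -p "$OLD/Preferences.xml" "$NEW/Preferences.xml"
chown plex:plex "$NEW/Preferences.xml"   # match whatever user the new instance runs as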
Is there a performance issue with your system? Your thread is positioned as attempting to eliminate warning messages from your logs. But what is the actual problem you’re trying to solve? Slow searches? Slow library browsing? Something else?
I only ask because the warning messages you posted all pertain to statistics gathering, which should not impact real-time performance of the system.
Hi-
Yes, this is exactly what I was thinking. Could I use your script to move the watch history? If it’s too risky, I’m fine losing it. I would rather have this DB running perfectly.
Awesome. That’s the kind of thing you probably want to mention in your initial post to ensure the proper issue is being troubleshot (troubleshooted?). Correlation is not necessarily causation and all that.
Plex spouts out all kinds of warnings (and errors) in the logs which mean absolutely nothing, other than the developers are working on some new functionality.
By the way, have you not seen any performance increase since allowing DBRepair.sh to re-index your database? That generally cures a large number of browsing and searching ills. I’d be very surprised if you were an exception, as there have been much larger databases than yours “fixed” by this simple action. Except in the cases where there was severe database damage.
But, mom always told me I was special. Maybe you are, too.