I’ve been struggling to fix this issue for the past few months. I finally got some time to try and dig into the issue, and I’ve been unable to alleviate it myself.
PMS is running on a VMware virtual machine - Ubuntu 16.04.3 LTS
The virtualization host is running an i7-3820 quad-core CPU @ 3.60GHz. I have 4 virtual sockets assigned to the VM with 12GB of memory, and plenty of HDD space for temp transcode files on the VM itself.
Virtual machine files reside on a Synology NAS with 4Gbps connectivity locally.
This issue started about 2 months ago, out of the blue. I’ve monitored the VM, storage and network during this and nothing appears to be out of the ordinary from a resources standpoint.
When I start playback of a video locally or remotely, everything goes great. While monitoring through PlexPy, I can see transcoding occurring at a speed between 10-30, with intermittent throttling changing the speed to 0.0 (which, from what I have read, is normal). Transcoding percentage is always well ahead of playback percentage as well.
After approximately 5-10 minutes, PlexPy will then show the video as “Direct Play” for Stream, Video, as well as Audio, and the video will suddenly stop about 30sec to 1 minute later and the state of the video in PlexPy will change to “buffering.”
The video never starts playing again and I have to close the player within PlexWeb and start the video over again in the menus. It does accurately track where the video stopped so I am able to resume playback.
This is occurring for all users. The issue does not exist when using Plex Media Player with Direct Play.
Any help in figuring this out would be greatly appreciated!
Your resultant transcoder temp, while being in the VM, is also over the network. Is local_lock=posix enabled? Also, inotify will not work over the network.
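If it helps, here is a minimal sketch for checking this from inside the guest. It parses text in the /proc/mounts format and lists network mounts along with any local_lock option. The function name network_mounts and the set of filesystem types are my own choices for illustration. Note the limitation: a virtual disk stored on the NAS still looks like a local disk to the guest, so this only catches mounts made inside the VM.

```python
# Filesystem types treated as "network" for this check (an assumption;
# extend the set if you use something else).
NETWORK_FS = {"nfs", "nfs4", "cifs", "smb3"}


def network_mounts(mounts_text):
    """Return (mount_point, fs_type, local_lock) for each network mount.

    mounts_text is expected in /proc/mounts format:
    device mount_point fs_type options dump pass
    local_lock is None when the option is not present.
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type not in NETWORK_FS:
            continue
        local_lock = None
        for opt in options.split(","):
            if opt.startswith("local_lock="):
                local_lock = opt.split("=", 1)[1]
        found.append((mount_point, fs_type, local_lock))
    return found


# On the server itself you would feed it the real table, for example:
#   with open("/proc/mounts") as fh:
#       print(network_mounts(fh.read()))
```

Anything it reports for the transcoder temp directory or the Plex data directory would be worth a closer look.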
Is your virtual network adapter set to VMXNET3 or e1000e? VMXNET3 has been known to cause unpredictable errors with Ubuntu.
That having been said, I do not see a hard error or drop.
Next time you observe the error, may I have the entire log file set?
The file system on my NAS with all my media is mounted to the VM over CIFS, not NFS. I realize you may be referring to the VM itself, but I’m not entirely sure where I would look up this setting from an ESXi to VM standpoint.
I am using a VMXNET3 network adapter with a single 10Gbps virtual link to the DVS. I suppose I could try to create a bond between several e1000e adapters if this ends up being the issue.
I’ve had the VM configured this way since I first created it a few years ago…I hope this configuration didn’t just recently become an issue…
I just updated to 1.10.0.4516 and the issue still persists.
I’m able to re-create the issue EVERY time I stream a video that requires transcoding. Full logs attached.
I raised the VMXNET3 adapter issue because, when presented to an Ubuntu guest, issues arise. I have an announcement in Linux Tips (top of the Linux forum) which recommends against using the adapter in Ubuntu for this reason.
Is the e1000e adapter artificially limited to 1 Gb in software?
If the VM data is over the LAN, this is the source of your timeout / stall errors, as reflected here. Where’s the root fs (/var/lib/plexmediaserver)?
1. Database reports as unacceptably slow
Nov 27, 2017 21:05:06.033 [0x7efc351d8700] WARN - SLOW QUERY: It took 6050.000000 ms to retrieve 58 items.
Nov 27, 2017 21:20:35.846 [0x7efc213f7700] WARN - Got a request to stop a transcode session without a session GUID (or with an invalid one).
Nov 27, 2017 21:29:33.918 [0x7efc327fe700] WARN - Got a request to stop a transcode session without a ses
2. Network stall / failure of kernel to communicate with adapter
Nov 27, 2017 21:37:50.408 [0x7efc367ff700] ERROR - EventSource: Retrying in 15 seconds.
Regarding 10 Gbps: Do you really need roughly 1 GB/sec of network speed into PMS? That’s a significant number of simultaneous playback streams, which your log file activity doesn’t show is needed. I recommend trying the other adapter type and retesting.
root fs resides with the VM files on the NAS, so yes, from my ESXi host, it would definitely be going over the LAN to run the VM.
I’ll try switching over to a few bonded e1000 adapters and see if that changes anything. I have had up to 8 simultaneous streams going before, but that isn’t very common.
Let’s assume you have 10 simultaneous instances, each needing 80 Mbps (HEVC HDR-type video). That would be 800 Mbps, which fits safely within a single 1 GbE connection.
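To put numbers on that, here is the same arithmetic as a small sketch. The function names are made up for illustration, and the figures are the ones from this thread (10 streams at 80 Mbps, a 1 GbE link, and the 10 Gbps virtual link). It ignores protocol overhead, so treat the results as rough.

```python
def aggregate_mbps(streams, mbps_per_stream):
    """Total bandwidth needed if every stream runs at the same bitrate."""
    return streams * mbps_per_stream


def link_utilisation(streams, mbps_per_stream, link_mbps):
    """Fraction of the link consumed by the streams (1.0 means saturated)."""
    return aggregate_mbps(streams, mbps_per_stream) / link_mbps


def gbps_to_gigabytes_per_sec(gbps):
    """Convert a link speed in gigabits/s to gigabytes/s (8 bits per byte)."""
    return gbps / 8


print(aggregate_mbps(10, 80))          # 800 (Mbps)
print(link_utilisation(10, 80, 1000))  # 0.8 of a 1 GbE link
print(gbps_to_gigabytes_per_sec(10))   # 1.25 GB/s for the 10 Gbps link
```

Even the 8 simultaneous streams mentioned earlier would need 640 Mbps at that bitrate, so a single 1 GbE adapter should not be the bottleneck.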
Well, I just switched over to an e1000 adapter, and it doesn’t appear to have fixed anything. Same issue. I did get a “shaka1001 (Network)” error, which I failed to mention I received before. Apologies.
Okay, if that’s the case, I didn’t actually receive the shaka error until I upgraded to PMS 1.10.0.4516. Previous versions did not give me this error.
Issue has remained identical throughout however.
Other than messing around with the network adapters earlier and upgrading/downgrading PMS, I’ve put everything back exactly as it was before I started all this troubleshooting today. The only exception is that instead of using the VMXNET3 adapter, I’m now using dual bonded e1000 adapters.
Although, it’s possible I simply didn’t see the error message to begin with…
The issue itself is still identical (5-10min of playback), now I just get that error through PlexWeb once the playback stops.
In the meantime, is there anything else you recommend I do?
Sure, I can stream from my iPhone over cellular. I’m not sure if it’s already within the logs I sent you, but there are a couple of outside users that have been streaming from my PMS throughout the day as well and have had the same issue.
Did you address your database optimization issue? (manually optimize it)
I am asking this again because if the VM’s root fs is on the network, you have network latency regardless of protocol used.
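One way to see that latency from inside the guest is to time small synchronous writes in the directory in question. This is only a rough sketch: the function name fsync_latencies_ms, the sample count, and the 4 KiB write size are arbitrary choices of mine, and it measures write-plus-fsync time, not database query time.

```python
import os
import tempfile
import time


def fsync_latencies_ms(directory, samples=20, size=4096):
    """Time `samples` small write+fsync cycles in `directory`.

    Returns a list of per-write latencies in milliseconds. The temporary
    file is removed afterwards.
    """
    payload = b"\0" * size
    results = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(samples):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the data to the backing storage
            results.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
        os.remove(path)
    return results


# Example: point it at a directory you can write to.
latencies = fsync_latencies_ms(tempfile.gettempdir(), samples=5)
print("worst fsync latency: %.2f ms" % max(latencies))
```

Run it against a directory on the same disk as the Plex data while a transcode is in progress. Occasional spikes of hundreds of milliseconds would fit the slow database warnings.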
“EventSource” errors are always indicative of network connectivity problems.
This is a warning of pending issues: Nov 27, 2017 14:56:51.238 [0x7f5cd0bf6700] WARN - SLOW QUERY: It took 220.000000 ms to retrieve 27 items.
This is either network overload / communications problem / CPU overload and impending disaster:
Nov 27, 2017 19:46:59.456 [0x7efc1e3f1700] WARN - SLOW QUERY: It took 1340.000000 ms to retrieve 1 item
Nov 27, 2017 21:48:17.675 [0x7efc15bff700] WARN - SLOW QUERY: It took 6140.000000 ms to retrieve 58 items.
58 items should be retrievable in less than 20 ms, not 6.1 seconds.
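If you want to pull these out of the logs yourself, a short sketch like the one below would do it. The regular expression is based only on the SLOW QUERY lines quoted in this thread, and the 1000 ms threshold and the function name slow_queries are my own assumptions, so adjust them to taste.

```python
import re

# Matches the message format seen in the log excerpts quoted in this thread.
SLOW_QUERY = re.compile(
    r"SLOW QUERY: It took (?P<ms>\d+(?:\.\d+)?) ms "
    r"to retrieve (?P<items>\d+) items?"
)


def slow_queries(log_text, threshold_ms=1000.0):
    """Return (milliseconds, item_count) for each slow query at or above
    threshold_ms, in the order they appear in log_text."""
    hits = []
    for match in SLOW_QUERY.finditer(log_text):
        ms = float(match.group("ms"))
        items = int(match.group("items"))
        if ms >= threshold_ms:
            hits.append((ms, items))
    return hits


sample = """\
Nov 27, 2017 14:56:51.238 [0x7f5cd0bf6700] WARN - SLOW QUERY: It took 220.000000 ms to retrieve 27 items.
Nov 27, 2017 19:46:59.456 [0x7efc1e3f1700] WARN - SLOW QUERY: It took 1340.000000 ms to retrieve 1 item
Nov 27, 2017 21:48:17.675 [0x7efc15bff700] WARN - SLOW QUERY: It took 6140.000000 ms to retrieve 58 items.
"""

for ms, items in slow_queries(sample):
    print("%.0f ms for %d item(s)" % (ms, items))
```

With the sample above, the 220 ms warning is skipped and the two multi-second queries are reported.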