Resuming playback causes TLS connection: Too many open files errors and eventual server crash

server-synology

#1

Synology 918+
PMS Version 1.13.0.5003 (64 bit)
Client - doesn't matter. I've reproduced it on everything I have, Roku, XBone, Mac, Chrome, etc.
This is direct play, no transcoding

When playing from the beginning of any media everything works fine. Resuming the same media at any point causes PMS to loop requests on different ports until the server crashes.
There is no other activity on the server, I clean started everything with nothing in the logs. It's 100% reproducible every time.

Here is part of the loop. I can provide the logs if required.

May 03, 2018 21:22:07.598 [0x7f03febaa700] DEBUG - Request: [192.168.1.177:64410 (Subnet)] GET /library/parts/80502/1424622621/file.m4v (667 live) TLS Signed-in Token (myphsto)
May 03, 2018 21:22:07.599 [0x7f03febaa700] DEBUG - Request range: 30760948 to 0
May 03, 2018 21:22:07.604 [0x7f03febaa700] DEBUG - Content-Length of /volume1/share/Movies/480p.x264.AAC.m4v is 1901023677.
May 03, 2018 21:22:07.668 [0x7f0407d11700] DEBUG - Auth: authenticated user 1 as myphsto
May 03, 2018 21:22:07.668 [0x7f03fdff2700] DEBUG - Request: [192.168.1.177:64411 (Subnet)] GET /library/parts/80502/1424622621/file.m4v (668 live) TLS Signed-in Token (myphsto)
May 03, 2018 21:22:07.668 [0x7f03fdff2700] DEBUG - Request range: 29351085 to 0
May 03, 2018 21:22:07.674 [0x7f03fdff2700] DEBUG - Content-Length of /volume1/share/Movies/480p.x264.AAC.m4v is 1901023677.
May 03, 2018 21:22:07.741 [0x7f0407fff700] DEBUG - Auth: authenticated user 1 as myphsto
May 03, 2018 21:22:07.741 [0x7f03fe5ce700] DEBUG - Request: [192.168.1.177:64412 (Subnet)] GET /library/parts/80502/1424622621/file.m4v (669 live) TLS Signed-in Token (myphsto)
May 03, 2018 21:22:07.741 [0x7f03fe5ce700] DEBUG - Request range: 30789711 to 0
May 03, 2018 21:22:07.747 [0x7f03fe5ce700] DEBUG - Content-Length of /volume1/share/Movies/480p.x264.AAC.m4v is 1901023677.


#2

May I have the entire log ZIP please? (settings - server - help - download logs) There isn’t enough from your excerpt. I need to see what precedes the crash.


#3

Some info:

make sure logs go back to when the (xx live) count started climbing from levels of 50 or lower

There is a serious issue (most likely a deadlock) that eventually led to this and the (667 live) / (668 live) etc showing on the requests

If this can happen soon after launch of PMS - then no need to increase number of log files - just make sure you start the test to reproduce the problem after a relaunch of Plex Media Server


#4

The (667 live) / (668 live) goes up to 4096 (ulimit for the plex user) and then it crashes for good. Around 2000 or so I start getting the TLS errors. I have a clean log showing from startup, playing the same media from the beginning, stop, resume and the loop starting before it gets to crash levels if that helps.
I’m also attaching the log where I let it spin until the TLS errors start but I stop it before the actual crash.


#5

Any feedback / insight on this one? I see PlexMediaServer-1.13.0.5023-31d3c0c65-x86_64 is out for Syno so I’ll upgrade tonight and test this scenario again.


#6

The upgrade did not fix this issue. I did a database rebuild just to be sure there wasn’t anything causing issues but it’s still happening.
I have determined I can resume some of my media, but I have yet to determine a pattern to why some media will resume and others will not. I’ve done the plex shuffle on the media which will not resume and it still causes the loop and crash.


#7

Was the client app on 192.168.1.177 Plex Media Player ? What version?
And does the issue only arise when Direct Play used?


#8

The client was 192.168.1.177. It is a Mac Mini on OSX Sierra running Plex Media Player 2.10.0.849 64 bit. I can give the full system info if you need.

In this case direct play was being used according to the server status, I’m sure you can tell from the logs what the mode was.


#9

@myphsto said:
The client was 192.168.1.177. It is a Mac Mini on OSX Sierra running Plex Media Player 2.10.0.849 64 bit. I can give the full system info if you need.

In this case direct play was being used according to the server status, I’m sure you can tell from the logs what the mode was.

I could not reproduce the problem using same versions of Plex Media Player and Plex Media Server.

I suspect that there may be an underlying problem eg some disk drive / network issue that is causing a request to hang and once that happens then repeat requests are made and they start to stack up

Is NFS involved at all?

The log shows you are stopping the playback and then resuming later. I tried to reproduce the problem doing that but I did not get any repeated requests. So I think a hung request is probably leading to the issue.

As the problem is easily to reproduce for you, could I get verbose logging added as well so I can see the Plex Media Player attempts to download the media chunks. Settings / Server / General / show advanced.

Also Plex Media Player logs as well
For Plex Media Player logs - see https://support.plex.tv/articles/207338748-plex-media-player-logs/

Could you also look into network / disks to see if there is anything that may cause requests to hang (and result in the repeats)


#10

NFS is not involved, PMS is running direct on local storage on the Syno. The start / stop you see in the logs is to reproduce the issue. The media plays perfectly fine from the start. All operations work (fast forward, pause etc) as long as the media was played from the beginning. To reproduce I start the media until it’s a few minutes in, stop it and then resume.
There are no network issues, everything is wired no drops or resends of packets. This issue started only after the latest PMS update.
I’ll get the verbose logs in a bit. Given they are so big I’d like to get a clean run of the issue with nothing else going on. Like I said I have found media which resumes fine let me know if you also want to logs for that.


#11

I can also reproduce this the with the same media on Windows 10 Plex Media Player 2.10.0.849


#12

@myphsto said:
NFS is not involved, PMS is running direct on local storage on the Syno. The start / stop you see in the logs is to reproduce the issue. The media plays perfectly fine from the start. All operations work (fast forward, pause etc) as long as the media was played from the beginning. To reproduce I start the media until it’s a few minutes in, stop it and then resume.
There are no network issues, everything is wired no drops or resends of packets. This issue started only after the latest PMS update.
I’ll get the verbose logs in a bit. Given they are so big I’d like to get a clean run of the issue with nothing else going on. Like I said I have found media which resumes fine let me know if you also want to logs for that.

I cannot reproduce the problem. may be it is only specific media files that result in this behaviour. If this is the case, then will need a sample file that does replicate the issue as well,

And please don’t forget the Plex Media Player logs as well


#13

Sending you a PM with the logs and sample.


#14

Thank you for the info sent by PM - so it does appear to be only when resuming specific media files - will wait for you to see if you can get a reasonable size sample we can use to replicate the issue with Plex Media Player


#15

Quick update. It is isolated to the full Plex Media Player on Mac and Windows and while it is a subset of my media I have yet to find a pattern to which work and do not. I was able to upload some of the media to my Plex Cloud server and the issue also happens there with the same clients that are having issues on my Syno so this seems to be a client issue.
PM sent with some further details on the media in question, hopefully you can reproduce it.


#16

Thank you. I will try to reproduce the issue with the supplied m4v file


#17

@myphsto Whilst I have not been able to get the issue of severe Plex Media Server overload when resuming, I have, however, managed to capture the difference in how Plex Media Player does requests when resuming vs from beginning and it is very possible that running the server on a NAS as opposed to a desktop would give rise to the overload.

I do see that Plex Media Player makes many requests to the server when resuming compared to the one large data request from starting from the beginning.

I will refer the evidence I captured to the Plex Media Player team


#18

I need new set of logs from you - corresponding set from Plex Media Player on the mac and Plex Media Server on the NAS. When you added verbose logging for me for the server you had debug logging disabled. Please do the test after a fresh restart of PMS and PMP and no other activity on the server (eg scanning / other playback) and please do separate test from play back from beginning of file (for 5 minutes) and other test for resuming playback

Thanks


#19

Thanks for the extra logs

I have passed the diagnostics and my tests to the Plex Media Player development team


#20

Thanks for the support on this one. For anyone hitting this problem the easy work around is to just not resume media until it’s fixed. I have narrowed this issue to the full Plex Media Player 2.10.0.849 client. The browser player, XBOne player and Windows Metro/Windows Store version work just fine. It also does not happen on all media as I first reported, however I have not found a pattern for which media is impacted. I just happened to be on a movie marathon when I ran into it!