Repeated connection loss for local and remote users

My server connection is dropping out for both local (ethernet) and remote clients several times per day. This has been occurring for over a year. Whenever it occurs, playback buffers and the client reports that the server is unavailable. Usually after 5-10 minutes, connectivity is restored but sometimes I have to restart the service. The media server log shows lots of errors such as:
We appear to have lost Internet connectivity, resetting device URL cache
ERROR - CERT: Error in TLS handshake: short read
WARN - SLOW QUERY: It took 950.000000 ms to retrieve 20 items

There’s no reason the server should have lost internet connectivity, but I don’t know the meaning of the other errors. I have noticed other strange log entries that may be unrelated: It frequently reports warnings about UPnP/NAT PMP, even though I am using a manually specified port (my firewall does not support UPnP). Also just sometime in the last couple of months, the server has apparently started reporting my gateway address (10.55.55.1) or the local loopback address (127.0.0.1) as the IP for remote clients.

My Plex media server is version 1.7.2.3878-8088811b8 running on CentOS 7 VM with a single NIC, 6GB RAM and 4 cores of an i7-4790 CPU @ 3.60GHz

I am running a flat network with all devices on a single internal subnet of 10.55.55.0/24. The Plex server is at 10.55.55.12. The local Roku player is at 10.55.55.218. The firewall/NAT router is a Fortigate 60C at 10.55.55.1. I have cable internet service with a Motorola SB6141 (bridge) modem.

The latest event in the attached logs is at July 01, 2017 at around 16:40 and occurred to a remote user about 45 minutes into ‘Serenity’. When the user reported the outage to me, I found that I was also unable to reach the Plex management console locally, and PlexPy reported that it was unable to reach the Plex server, but I had no trouble connecting to the SSH console, or opening the log files and the plexmediaserver service reported no issues. After about 5 minutes, connectivity returned without any intervention.

Can anyone help? Logs attached

Starting with the easiest, which you can do while I’m reading your logs,

Hover over the Library panel to expose the ellipsis (...). Click it and see ‘Optimize Database’ in the dropdown. Optimize your database
Next go to Settings - Server - Schedule Tasks and make sure you have Database Optimization enabled there to prevent it from happening again.

Thanks, just performed optimization as instructed. I confirmed that scheduled task was already enabled to optimize the database every week

Did you recently add, move, or delete media, OR do you have a lot of active transcodes eating up your CPU resources?

I’ve added about 4 new titles today and deleted one. There have only been 1-2 transcodes today and CPU has been pretty steady at between 3-20%, with a couple of very brief spikes to 50-60%

First, The TLS handshake is the Roku disconnecting / losing connection (reset by peer). The Roku is disconnecting for some reason until I look further down and see where your connection is dropping BUT also your database is showing signs of being very fragmented as I outlined above.

Jul 01, 2017 16:35:07.640 [0x7fcaae3ff700] DEBUG - Request: [127.0.0.1:54224 (Loopback)] PUT /video/:/transcode/session/2d2c4f20e6311d80321e06914291cb28/386fde02-9d0e-4f20-9a74-cd935a5210eb/progress?progress=74.8&size=-22&remaining=-1&vdec_packets=93302&vdec_sw_ok=93296&speed=0.0&vdec_hw_status=0 (18 live) Signed-in Token (korzynski)
Jul 01, 2017 16:35:07.640 [0x7fcad2fff700] ERROR - Exception getting remote address: remote_endpoint: Transport endpoint is not connected
Jul 01, 2017 16:35:07.642 [0x7fcad27fe700] ERROR - CERT: Error in TLS handshake: Connection reset by peer
Jul 01, 2017 16:35:07.642 [0x7fcad2fff700] DEBUG - Auth: We found auth token (xxxxxxxxxxxxxxxxxxxx), enabling token-based authentication.
Jul 01, 2017 16:35:07.642 [0x7fcad2fff700] DEBUG - Auth: Came in with the master token, authorization succeeded.
Jul 01, 2017 16:39:02.894 [0x7fcad2fff700] DEBUG - EventSource: Failure in IdleTimeout (0 - Success).
Jul 01, 2017 16:39:02.894 [0x7fcad2fff700] DEBUG - MyPlex: We appear to have lost Internet connectivity, resetting device URL cache.
Jul 01, 2017 16:39:02.894 [0x7fcad2fff700] ERROR - EventSource: Retrying in 15 seconds.
Jul 01, 2017 16:39:02.895 [0x7fcaa53f8700] DEBUG - It took 90.000000 ms to retrieve 1 items.

Lastly, here is your system, sending Sernity to the remote Roku. The activity
progress=74.9&size=-22&remaining=-1&vdec_packets=93387&vdec_sw_ok=93381&speed=0.0&vdec_hw_status=0 (36 live) 0ms 342 bytes
Jul 01, 2017 16:39:02.907 [0x7fcaa8bff700] DEBUG - [Transcoder] Throttle - Going into sloth mode.
Jul 01, 2017 16:39:02.942 [0x7fcaa63fa700] DEBUG - Play progress on 17228 ‘Serenity’ - got played 3650000 ms by account 728504!
Jul 01, 2017 16:39:02.943 [0x7fcac6bfe700] DEBUG - Statistics: (2d2c4f20e6311d80321e06914291cb28) Reporting active playback in state 0 of type 4 (scrobble: 0) for account 728504
Jul 01, 2017 16:39:02.943 [0x7fcaa5bf9700] DEBUG - Client [2d2c4f20e6311d80321e06914291cb28] reporting timeline state playing, progress of 3682000/5202198ms for guid=com.plexapp.agents.thetvdbdvdorder://78874/1/1?lang=en, ratingKey=17228 url=, key=/library/metadata/17228, containerKey=, metadataId=17228
Jul 01, 2017 16:39:02.944 [0x7fcac93ff700] DEBUG - [Now] User is terirhan (ID: 728504)
Jul 01, 2017 16:39:02.944 [0x7fcac93ff700] DEBUG - [Now] Device is Roku (Roku 1 - 1XJ56A020198).
Jul 01, 2017 16:39:02.944 [0x7fcac93ff700] DEBUG - [Now] Profile is Roku-7.x
Jul 01, 2017 16:39:02.944 [0x7fcad2fff700] DEBUG - Completed: [10.55.55.1:52701] 200 GET /:/timeline?time=3492000&duration=5202198&key=%2Flibrary%2Fmetadata%2F17228&ratingKey=17228&state=playing&guid=com.plexapp.agents.thetvdbdvdorder%3A%2F%2F78874%2F1%2F1%3Flang%3Den&playQueueItemID=62892 (36 live) TLS GZIP 175411ms 710 bytes (pipelined: 1)
Jul 01, 2017 16:39:02.944 [0x7fcac93ff700] DEBUG - [Now] Updated play state for /library/metadata/17228.
Jul 01, 2017 16:39:02.945 [0x7fcaad3fd700] DEBUG - Play progress on 17228 ‘Serenity’ - got played 3671000 ms by account 728504!
Jul 01, 2017 16:39:02.971 [0x7fcac13f3700] DEBUG - Request: [10.55.55.1:51314 (Subnet)] GET /:/timeline?time=3692000&duration=5202198&key=%2Flibrary%2Fmetadata%2F17228&ratingKey=17228&state=playing&guid=com.plexapp.agents.thetvdbdvdorder%3A%2F%2F78874%2F1%2F1%3Flang%3Den&playQueueItemID=62892 (35 live) TLS GZIP Signed-in Token (terirhan)
Jul 01, 2017 16:39:02.971 [0x7fcad2fff700] DEBUG - Auth: We found auth token (xxxxxxxxxxxxxxxxxxxx), enabling token-based authentication.
Jul 01, 2017 16:39:02.971 [0x7fcad2fff700] DEBUG - Auth: Came in with the master token, authorization succeeded.
Jul 01, 2017 16:39:02.972 [0x7fcaa6bfb700] DEBUG - Request: [127.0.0.1:54706 (Loopback)] PUT /video/:/transcode/session/2d2c4f20e6311d80321e06914291cb28/386fde02-9d0e-4f20-9a74-cd935a5210eb/progress?progress=74.9&size=-22&remaining=-1&vdec_packets=93382&vdec_sw_ok=93376&speed=0.0&vdec_hw_status=0 (34 live) Signed-in Token (korzynski)
Jul 01, 2017 16:39:02.972 [0x7fcad27fe700] DEBUG - Completed: [127.0.0.1:54706] 206 PUT /video/:/transcode/session/2d2c4f20e6311d80321e06914291cb28/386fde02-9d0e-4f20-9a74-cd935a5210eb/progress?progress=74.9&size=-22&remaining=-1&vdec_packets=93382&vdec_sw_ok=93376&speed=0.0&vdec_hw_status=0 (34 live) 0ms 342 bytes
Jul 01, 2017 16:39:03.000 [0x7fcad2fff700] DEBUG - Auth: We found auth token (xxxxxxxxxxxxxxxxxxxx), enabling token-based authentication.
Jul 01, 2017 16:39:03.000 [0x7fcad2fff700] DEBUG - Auth: authenticated user 728504 as tn
Jul 01, 2017 16:39:03.017 [0x7fcac13f3700] DEBUG - Client [2d2c4f20e6311d80321e06914291cb28] reporting timeline state playing, progress of 3692000/5202198ms for guid=com.plexapp.agents.thetvdbdvdorder://78874/1/1?lang=en, ratingKey=17228 url=, key=/library/metadata/17228, containerKey=, metadataId=17228
Jul 01, 2017 16:39:03.018 [0x7fcad27fe700] DEBUG - Completed after connection close: [10.55.55.1:52704] 200 GET /:/timeline?time=3513000&duration=5202198&key=%2Flibrary%2Fmetadata%2F17228&ratingKey=17228&state=playing&guid=com.plexapp.agents.thetvdbdvdorder%3A%2F%2F78874%2F1%2F1%3Flang%3Den&playQueueItemID=62892 (34 live) TLS GZIP 175483ms 0 bytes (pipelined: 2)
Jul 01, 2017 16:39:03.020 [0x7fcaa2bf3700] DEBUG - Request: [10.55.55.1:51318 (Subnet)] GET /:/timeline?time=3713000&duration=5202198&key=%2Flibrary%2Fmetadata%2F17228&ratingKey=17228&state=playing&guid=com.plexapp.agents.thetvdbdvdorder%3A%2F%2F78874%2F1%2F1%3Flang%3Den&playQueueItemID=62892 (34 live) TLS GZIP Signed-in Token (terirhan)
Jul 01, 2017 16:39:03.021 [0x7fcaa2bf3700] DEBUG - Client [2d2c4f20e6311d80321e06914291cb28] reporting timeline state playing, progress of 3713000/5202198ms for guid=com.plexapp.agents.thetvdbdvdorder://78874/1/1?lang=en, ratingKey=17228 url=, key=/library/metadata/17228, containerKey=, metadataId=17228
Jul 01, 2017 16:39:03.031 [0x7fcad27fe700] DEBUG - Auth: We found auth token (xxxxxxxxxxxxxxxxxxxx), enabling token-based authentication.
Jul 01, 2017 16:39:03.031 [0x7fcad27fe700] DEBUG - Auth: authenticated user 728504 as t
n
Jul 01, 2017 16:39:03.058 [0x7fcad2fff700] DEBUG - Auth: We found auth token (xxxxxxxxxxxxxxxxxxxx), enabling token-based authentication.
Jul 01, 2017 16:39:03.058 [0x7fcad2fff700] DEBUG - Auth: authenticated user 728504 as t*****n
Jul 01, 2017 16:39:03.489 [0x7fcaae3ff700] DEBUG - Play progress on 17228 ‘Serenity’ - got played 3555000 ms by account 728504!
Jul 01, 2017 16:39:03.628 [0x7fcaa2bf3700] DEBUG - Play progress on 17228 ‘Serenity’ - got played 3713000 ms by account 728504!
Jul 01, 2017 16:39:03.649 [0x7fcaaaffd700] DEBUG - Play progress on 17228 ‘Serenity’ - got played 3618000 ms by account 728504!


All looking good until the slow query and then an apparent loss of your local internet connection here. 

Jul 01, 2017 16:39:31.007 [0x7fcaa9ffb700] DEBUG - NAT: UPnP, getPublicIP didn’t find usable IGD.
Jul 01, 2017 16:39:38.327 [0x7fcaa43f6700] WARN - SLOW QUERY: It took 780.000000 ms to retrieve 1 items.
Jul 01, 2017 16:39:41.010 [0x7fcaa9ffb700] WARN - NAT: PMP, timed out waiting for response.
Jul 01, 2017 16:39:41.010 [0x7fcaa9ffb700] DEBUG - MyPlex: Updating device connections (from timer: 0)
Jul 01, 2017 16:39:41.010 [0x7fcaa9ffb700] DEBUG - HTTP requesting PUT https://plex.tv/devices/3f78b6980c5e68f2ab2fbba083b5c4d1a03e6663?Connection[][uri]=http://10.55.55.12:32400&httpsEnabled=1&httpsRequired=0&X-Plex-Token=xxxxxxxxxxxxxxxxxxxx
Jul 01, 2017 16:39:41.726 [0x7fcaa9ffb700] DEBUG - HTTP 200 response from PUT https://plex.tv/devices/3f78b6980c5e68f2ab2fbba083b5c4d1a03e6663?Connection[][uri]=http://10.55.55.12:32400&httpsEnabled=1&httpsRequired=0&X-Plex-Token=xxxxxxxxxxxxxxxxxxxx
Jul 01, 2017 16:40:36.212 [0x7fcaa9ffb700] DEBUG - PublicAddressManager: Obtaining public address and mapping port.
Jul 01, 2017 16:40:36.212 [0x7fcaa3bf5700] DEBUG - PublicAddressManager: Obtaining public IP.
Jul 01, 2017 16:40:36.212 [0x7fcaa3bf5700] DEBUG - HTTP requesting GET http://plex.tv/pms/:/ip
Jul 01, 2017 16:40:36.232 [0x7fcaba3ff700] DEBUG - Sync: uploadStatus


One of you has a connectivity issue and from what I see here,  your computer / modem is the root cause.    I suggest start running some basic network stability and throughput tests ( Speedtest.net is a good basic starting point)

@korzynski said:
I’ve added about 4 new titles today and deleted one. There have only been 1-2 transcodes today and CPU has been pretty steady at between 3-20%, with a couple of very brief spikes to 50-60%

Sorry for the overlap.

that’s an insignificant load. I suggest pursuing the connectivity issue locally and at the remote but mostly with your equipment first. Goofy router or crappy cable (i just had to redo mine for ZERO logical reason haha)

Thanks, I’ll check my cabling and equipment for anything that could cause connectivity issues.

Regarding DB fragmentation, since weekly optimization is already enabled, is there anything else I can do to improve it?

Also, are the UPnP errors anything to be concerned about? Do you know why it’s complaining about UPnP when I’m using a manual port?

As I stated before… hover over the ‘Libraries’ panel on the left. The ellipsis will become visible. Click it and the menu opens. There you’ll see Optimize Database.

As I dig deeper into your logs, I’m seeing where NFS sessions are still dangling. The file is still open after PMS and the transcoder are done with them. It’s reporting an error because it can’t delete the now-empty session directory.

Jul 01, 2017 15:07:01.966 [0x7fcac6bfe700] DEBUG - Whacked session 2d2c4f20e6311d80321e06914291cb28, 0 remaining.
Jul 01, 2017 15:07:01.966 [0x7fcaba3ff700] DEBUG - Transcoder: Cleaning old transcode directories.
Jul 01, 2017 15:07:01.969 [0x7fcaba3ff700] DEBUG - Transcoder: Cleaning old transcode directory: /var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Cache/Transcode/Sessions/plex-transcode-084a98cad0d783a009bddc5942ec5d43-239a2e13-ee22-49bd-b457-c5cb8e6c03ac
Jul 01, 2017 15:07:01.970 [0x7fcaba3ff700] ERROR - Transcoder: Failed to delete session directory (boost::filesystem::remove: Device or resource busy: "/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Cache/Transcode/Sessions/plex-transcode-084a98cad0d783a009bddc5942ec5d43-239a2e13-ee22-49bd-b457-c5cb8e6c03ac/.nfs00000000092a016300000006")
Jul 01, 2017 15:07:01.970 [0x7fcaba3ff700] DEBUG - Whacked session 2d2c4f20e6311d80321e06914291cb28, 0 remaining.

For this, either your sever is slow in deleting or the NFS close() command didn’t make it. Also indicative of a network or server issue.

I would suggest just giving everything a full restart, NFS server included.

Before doing this, Stop Plex and manually go in and delete those Transcode Sessions directories and files. Make sure they’re gone.
If you can’t delete them as root, your network problem is starting to show itself.

Thank you for the additional info about the NFS connections - that’s a huge help.

I did perform the optimization as you instructed, but my question is whether there is anything additional I should be doing, since it was already set to automatically optimize it on a weekly basis. If the database is heavily fragmented even with weekly optimization and no significant media additions/deletions, then is there something else I can do to improve its performance?

once it’s been manually optimized and then set for weekly, unless you add, remove, or move a lot of media there is no need to further worry about it.

I would keep an eye on the logs from time to time. If you’re still getting ‘SLOW QUERY’, we can do a fully manual step (export and re-import the database with PMS offline). It’s invasive but does help. Should that be needed, I’d be more concerned as to the root cause.

At this point… let’s cross our fingers and just monitor it… You’ll know in 24/48 hours if it’s still happening. Something added, some screwy plug-in, something going awry. Either way, we’ll find it