Improve diagnostic capabilities and error reporting for network speed / connection issues

The error “Your connection to the server is not fast enough to stream this video. Check your Network.” seems to be a common message on clients, and it often appears to be a catch-all error that leaves the end user unable to identify the underlying issue. I believe this error message pops up in the following cases:

Real infrastructure errors:

  • The network is actually too slow for the current bit rate
  • Transient network load changing network behavior (say, increasing latency)

Client side errors:

  • Client side buffer is not sufficiently large to handle a high latency but high bandwidth connection
  • Generic client errors that wedge the client’s network stack
  • Client side difficulty decoding a particular stream (re-encoding at the same bit rate, or turning off direct play/direct stream, has fixed this)

Server side errors:

  • Server’s transcoding is not able to keep up with the stream, could be storage I/O, CPU utilization, other processes (cron) consuming resources, etc.
  • Various other server side interferences creating latency issues or failing to provide the stream to the client fast enough

Primary objectives are to

  • Allow the end user and Plex support to quickly identify if the network is the issue
  • Monitor for server side issues and notify the server admin with the details (storage I/O latency issues, too many clients, underpowered CPU, CPU being hogged by other processes, etc.), and improve error messaging on the client so it indicates a possible server side issue rather than blaming the network
  • If no network or server side issues are detected, say so, prompt the end user for a support bundle from the client, and notify the server admin to send in a support bundle as well

– additional thoughts about experience / implementation

For users experiencing this behavior consistently, an easy first pass would be to embed a network bandwidth test in both the client and the server, something like iperf3. This test server should be as independent from the Plex Server as possible, so that any issues affecting the main Plex processes do not interfere with the network testing. Launching an independent iperf3 (or similar) server as needed would likely be an excellent option. This would allow the end user to establish what typical network behavior between the client and server actually is, and determine whether this is an infrastructure issue (something the end user needs to work through) or an issue with Plex.
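A rough sketch of what the client side of such a test could look like, assuming iperf3 is installed on the client and an independent `iperf3 -s` is already listening on the server. The function names and the default port and duration here are only illustrative, not anything Plex ships.

```python
import json
import subprocess


def build_iperf3_command(host, port=5201, seconds=10):
    """Build a client-side iperf3 invocation that prints its report as JSON (-J)."""
    return ["iperf3", "-c", host, "-p", str(port), "-t", str(seconds), "-J"]


def parse_throughput_mbps(iperf3_json):
    """Extract the received throughput in Mbit/s from an iperf3 TCP JSON report."""
    report = json.loads(iperf3_json)
    bits_per_second = report["end"]["sum_received"]["bits_per_second"]
    return bits_per_second / 1_000_000


def run_bandwidth_test(host, port=5201, seconds=10):
    """Run the test against a server that is already running `iperf3 -s`."""
    result = subprocess.run(
        build_iperf3_command(host, port, seconds),
        capture_output=True, text=True, check=True,
    )
    return parse_throughput_mbps(result.stdout)
```

Keeping the parsing separate from the subprocess call means the same code could consume results from a different test tool later on.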

There are three ways to offer this network test.

First, offer a ‘tuning’ run when a client is first introduced to a new server. This could even offer to set the global maximum bit rate options on the client.

Second, offer a manual test, which should include an option for an extended test. In the extended test the pipes are kept under pressure for at least 20 minutes and the client caps the bandwidth in stages, starting at a low bit rate and monitoring latency, then escalating to a reasonable maximum bit rate (tunable by the client, but not more than the server will allow that particular client to stream videos at). The client should monitor latency throughout the test, and if it notices significant latency bumps (enough to let that client’s cache drain more than 75% at the current bit rate; at faster bit rates a same-sized cache is effectively smaller), the test should drop to a slower bit rate. The reason for running a longer test of 20 minutes is to root out any recurring but still transient network issues, and to trigger any upstream service provider throttling behaviors, which may not kick in until a stream has been active for some time. The test log / results should be sent to the server admin and to Plex so you can aggregate them and improve the testing over time. This test should also offer to change the global maximum bit rate settings on this client. In addition, it should give the end user and server admin some form of report (graph + csv?) showing target bandwidth, average bandwidth achieved and latency over the test, to help them diagnose any infrastructure issues on their end.

Third, offer some form of prompt to launch a test when the client is actively seeing what appears to be network slowness.

All of these cases will help the end user determine if there really is a network performance issue they need to resolve, or even sidestep it by properly tuning the client up front. It should also help Plex quickly weed out non-Plex related customer issues.
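To make the 75% cache drain rule concrete, here is a minimal sketch of the step up / step down decision in the extended test. It assumes the client knows its cache size in bytes and can measure how long delivery stalled during a latency bump. The names `drain_fraction` and `next_stage` are made up for illustration.

```python
def drain_fraction(stall_seconds, bitrate_bps, cache_bytes):
    """Fraction of the cache consumed while no new data arrives for stall_seconds."""
    drained_bytes = stall_seconds * bitrate_bps / 8
    return min(1.0, drained_bytes / cache_bytes)


def next_stage(stages_bps, index, stall_seconds, cache_bytes, drain_limit=0.75):
    """Return the index of the bit rate stage to test next.

    Steps down when the observed stall would drain more than drain_limit of
    the cache at the current bit rate, otherwise steps up (capped at the top).
    """
    if drain_fraction(stall_seconds, stages_bps[index], cache_bytes) > drain_limit:
        return max(0, index - 1)
    return min(len(stages_bps) - 1, index + 1)
```

With a 10 MB cache, a 5 second stall drains half the cache at 8 Mbit/s but all of it at 20 Mbit/s, which is the "same sized cache is effectively smaller" effect described above.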

For the client/server related issues, you might have an independent connection where the client and the Plex server can communicate out of band, using processes independent of the video server / consumer client if possible, so that a blocked main process doesn’t artificially increase observed network latency. This link could be used to transmit notices to the client about server side errors and to do a continuous periodic latency check on the network while video is actively playing. Details about the current status of the server and client (transcode buffer level, client buffer utilization, last frame sent/received, current client status) can also be shared to identify whether a stall is being caused by server or client side code, and to log the issue and the needed debug details. If latency on this link remains low then any errors are unlikely to be network induced and the end user should be prompted to send in a support bundle. The client could also notify the server of any issues it encounters, and the server could then notify the server admin so they can look into it.
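As a sketch of what might travel over that out of band link, the status fields listed above could be packed into a small JSON message. Everything here (the `Heartbeat` name, the field names, the 200 ms limit and the attribution rule) is a hypothetical illustration, not an existing Plex protocol.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Heartbeat:
    transcode_buffer_seconds: float  # server side transcode buffer level
    client_buffer_seconds: float     # client side buffer utilization
    last_frame_sent: int             # last frame the server sent
    last_frame_received: int         # last frame the client received
    client_status: str               # e.g. "playing", "buffering"


def encode(heartbeat):
    """Serialize a heartbeat for the out of band link."""
    return json.dumps(asdict(heartbeat))


def decode(text):
    """Rebuild a heartbeat from its JSON form."""
    return Heartbeat(**json.loads(text))


def likely_stall_source(heartbeat, link_latency_ms, latency_limit_ms=200.0):
    """Very rough attribution of a stall, using assumed thresholds."""
    if link_latency_ms > latency_limit_ms:
        return "network"
    if heartbeat.transcode_buffer_seconds <= 0:
        return "server"  # transcoder has nothing ready to send
    return "support_bundle"  # link looks healthy, ask for a support bundle
```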

Additional health checking could be added to the server side to detect things like other processes consuming too many resources (CPU, storage bandwidth, network, etc.), overload caused by too many clients playing at once, and so on. These more useful messages could be provided to the client if it encounters any stalls, and/or queued for the server admin.
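A minimal sketch of such a server side check, assuming the server already samples these numbers somewhere. The thresholds and message wording are placeholders, and `current_load_per_core` only works on Unix-like systems.

```python
import os


def current_load_per_core():
    """1 minute load average divided by core count (Unix-like systems only)."""
    return os.getloadavg()[0] / (os.cpu_count() or 1)


def health_warnings(load_per_core, active_streams, max_streams, storage_latency_ms,
                    load_limit=0.9, storage_limit_ms=50.0):
    """Return admin-facing warnings for the given resource snapshot."""
    warnings = []
    if load_per_core > load_limit:
        warnings.append("CPU is saturated, check for other processes or too many transcodes")
    if active_streams > max_streams:
        warnings.append("Too many clients are playing at once")
    if storage_latency_ms > storage_limit_ms:
        warnings.append("Storage I/O latency is high")
    return warnings
```

The returned list could be queued for the server admin and summarized in the message shown to a stalled client.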

Hello, I have been getting these messages since I changed internet provider. The rest of my network is the same: I use a MacBook Pro Retina with an Apple AirPort Extreme and a LaCie HSS attached as the server. I watch movies on the TV or even on my MBP and get the “Your connection to the server is not fast enough to stream this video. Check your Network.” message after a few seconds or minutes. With the old internet provider this never happened. I tried connecting the whole system with cables and get the same issue. Any help? Thanks

I can’t read all that, but if you install netdata on your Plex server then most of what you need is easily available there, assuming you are using Linux.

If you want to go deeper into what is happening on the network then you can install ntop as well.

@MattTwinkleToes That helps for understanding the Plex server’s view of the network, but this is only a piece of it. The latency and jitter monitoring really needs to be something the Plex app and server track internally at different levels, so they can provide better error messages to end users. If you break the system down into the following timing components:

  • Client -> Server -> Client full path request (ClientFullPath)
  • Server -> Client -> Server full path request (ServerFullPath). This might be better modeled by watching the average rate at which the client requests updates, depending on how the protocol works
  • Client / Server network-only latency (NetworkPath), like a lightweight ping but with enough network bandwidth utilization to be meaningful. Processing is independent of all video processing modules on the client or server
  • Server request received -> server response sent (ServerInternal)
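One way each of these components could be labeled Good or Bad is to compare recent latency samples against that component's own earlier baseline. This is only a sketch: the median comparison, the 5 sample window and the 2x factor are assumptions picked for illustration.

```python
from statistics import median


def classify(samples_ms, recent_count=5, factor=2.0):
    """Label a timing component "Good" or "Bad" from its latency samples.

    The most recent recent_count samples are compared with the median of all
    earlier samples. With too little history the component is assumed Good.
    """
    if len(samples_ms) <= recent_count:
        return "Good"
    baseline = median(samples_ms[:-recent_count])
    recent = median(samples_ms[-recent_count:])
    return "Bad" if recent > factor * baseline else "Good"
```

Using a per-component baseline means a client on a naturally high latency link is not flagged just for being far away, only for getting worse.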

An error message selection tree might look like this:

if problem:
    if NetworkPath == Bad:
        return "Network latency increased significantly, check network"
    elif ServerInternal == Bad:
        return "Plex Server response time increased significantly, check plex server resource utilization"
    elif ClientFullPath == Bad and ServerFullPath == Good:
        return "Client not receiving responses from server fast enough, network latency looks OK, and server reports healthy, check network bandwidth utilization"
    elif ClientFullPath == Good and ServerFullPath == Bad:
        return "Client likely overloaded, possibly underpowered, try force transcoding on this stream? server and network appear to be clean"
    elif ClientFullPath == Bad and ServerFullPath == Bad:
        return "Client and Server both having difficulties but network seems ok, check server resource utilization, try downgrading the quality and pre-encode this stream at the clients target bit rate"
    else:
        return "Everything seems healthy, please enable logging on the appleTV and file a support ticket"

You could go further with the breakouts inside the Plex server for even more helpful messages that could be dumped to debug logs (and reference the logging in the client message, saying to contact the server owner), for example by monitoring response times from individual components, response times from the underlying storage, CPU load, etc.