SOLVED: Extremely high CPU usage and IO connecting to library from various clients

Server Version#: 1.15.2.793-782228f99
Player Version#: iOS Plex App 5.12

In the last few weeks, I began noticing difficulty connecting to my Plex Media Server from my iOS client (iPhone XS Max) and from the Official Plex app for Kodi. To troubleshoot, I installed the latest PMS version (1.15.2.793) but this actually made the problem much worse. With this version, I cannot load my Television library (9993 items) as the client attempts to connect for about a minute and finally says the server is unavailable. It also makes accessing the library much slower through plex.tv.

By way of background, the server has 20 cores/40 threads (2 x E52670v2 Xeons) with 128 GB of RAM. The database is stored on an ZFS SSD mirror and appears to be entirely cached in the ARC (ZFS RAM cache). With 1.15.2.793, attempting to connect to the library from the iOS client results in extremely high IO utilization. ARC shows approximately 500,000 IOPS/second (reading directly from the RAM cache), iotop shows as much a 500 MB/sec utilization from Plex Media Server, and htop shows many Plex Media Server threads, pegging as many as 8 cores at 100%. Although I also saw higher CPU and IO usage from Plex Media Server on 1.14.1.5488 (the latest version from the Ubuntu/Debian repository), it is to a much lesser degree than 1.15.2.793, and the iOS client can connect after a delay. Plex.tv connects within seconds on 1.14.1.5488.

I’ve attached photos of htop, iotop -o, arcstat (to show ARC hits) at baseline and during an attempt to access my library via the iOS app:





image upload

Prior to this testing, I verified that the database was not corrupt:

root@storage-server:/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Plug-in Support/Databases# zfs snapshot rpool-server/ROOT/debian@p lex_database_repair
root@storage-server:/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Plug-in Support/Databases# cp com.plexapp.plugins.library.db com.plexapp.plugins.library.db.original
root@storage-server:/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Plug-in Support/Databases# sqlite3 com.plexapp.plugins.library.db ā€œDROP index ā€˜index_title_sort_naturalsortā€™ā€
root@storage-server:/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Plug-in Support/Databases# sqlite3 com.plexapp.plugins.library.db ā€œDELETE from schema_migrations where version=ā€˜20180501000000ā€™ā€
root@storage-server:/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Plug-in Support/Databases# sqlite3 com.plexapp.plugins.library.db ā€œPRAGMA integrity_checkā€
ok
root@storage-server:/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Plug-in Support/Databases# systemctl start plexmediaserver.service

I also tried a repair but doing so results in the following errors:

  1. Mar 25, 2019 11:03:23.039 [0x7fa0f9b23800] DEBUG - Opening 20 database sessions to library (com.plexapp.plugins.library), SQLite 3.13.0, threadsafe=1
  2. Mar 25, 2019 11:03:23.045 [0x7fa0f9b23800] ERROR - SQLITE3:0x10, 8, statement aborts at 1: [PRAGMA journal_mode=WAL] attempt to write a readonly database
  3. Mar 25, 2019 11:03:23.045 [0x7fa0f9b23800] ERROR - Database corruption: sqlite3_statement_backend::loadOne: attempt to write a readonly database

Plex Media Server Logs_2019-03-25_22-34-06.zip (3.0 MB)

I recommend going back to your previous version and resolving the connection issues first
1.14.1 was built with the old CI. 1.15.x was built with the new and there are some issues with 1.15.2

I should also add, the iphone app (particularly the XS Max) has issues. Have you updated it recently?

Yeah, I immediately reverted to 1.14.1.5488 after this testing as 1.15.2.793 is all but unusable. Despite the fact that I’ve been using 1.14.1.5488 for months (apt shows it was installed on 12/20), this issue appears to have manifested only recently. Grafana shows a significant increase in ARC hit rate around 3/20. I’m not sure what, if anything, changed at that time. It’s also unclear why the repaired database is reported as read-only.

I updated the iOS app when we were troubleshooting this issue on IRC. It’s now on the latest version available from the play store. However, I had an older version on my iPad that loaded my TV library in a few seconds on 1.14.1.5488. I just updated to the new version and I’m getting the same Library unavailable message with a huge amount of IO. Maybe there’s something wrong with these new clients?

Note: I uploaded new logs from my testing of the iOS iPad client that is causing issues on 1.14.
Plex Media Server Logs_2019-03-26_00-03-47.zip (3.1 MB)

I just did the uninstall/reinstall of the app.
When it reloaded and did first-run, it asked me to select my default server.
I noticed I had a ā€˜dead’ server there with the same name as the live one. I recognize this to be the same machine but using a different PMS version.

Might this be part of what you’re seeing on the App?

As for server / app interaction, I am as much a user as you are. I actually rarely get to use the app on my phone.

Let me look at your logs, brb

No, I’m just seeing my server and that of my friends. I was able to connect to the Movie library and after waiting for Plex Media Server to finish churning at 100-200k IOPS/sec, I was finally able to access the TV database.

Your logs tell me the server is either running maintenance or just now Indexing and matching a ton of media.

Which is the case? Example: Neil deGrasse Tyron Ring a bell ?

Yeah, I have that in my library but it wasn’t added recently. In fact, nothing was added during or shortly before that testing. It seems to be repeatedly indexing the library. I have it set to do a full library scan hourly. Maintenance tasks are scheduled between 2 and 5 AM.

you have it set to do a full scan hourly? Are you adding media every hour?

No, I could disable the hourly scan or make it less frequent. However, I’m seeing this activity much more frequently than hourly. In fact, prior to this testing I had it scan the library every 15 minutes. Since the database and frequently accessed metadata is cached in RAM, the refresh only takes a few seconds.

I just tested a library update. The Movie library completed in around 3 seconds and the TV libraryin around 10 seconds. With the iOS app attempted login, this goes on for minutes on end with even higher ARC hit rates.

Do you see any indexing of media occuring in the first log?

Plex Media Server.log.

Line 8922:

 Mar 25, 2019 23:59:41.958 [0x7f04467fe700] DEBUG - Brought media local (metadata://themes/com.plexapp.agents.plexthememusic_42e20671747a46b385a364839d2385f46c8ed65e) in 0.0s.

through End of File (line 25848)

Mar 26, 2019 00:03:45.751 [0x7f04467fe700] DEBUG -    * Found episode with key S23E138
Mar 26, 2019 00:03:45.751 [0x7f04467fe700] DEBUG -  * Found episode s23e138
Mar 26, 2019 00:03:45.753 [0x7f04467fe700] DEBUG - Item 16505 (Rob Corddry) Scanning metadata graphic elements in XML file "seasons/23/episodes"/138.xml
Mar 26, 2019 00:03:45.754 [0x7f04467fe700] DEBUG - Item 16505 (Rob Corddry) Scanning metadata graphic elements in XML file "seasons/23/episodes"/138.xml
Mar 26, 2019 00:03:45.755 [0x7f0411bff700] DEBUG - HTTP requesting GET http://127.0.0.1:36491/system/agents/media/get?guid=com%2Eplexapp%2Eagents%2Ethetvdb%3A%2F%2F71256%2F23%2F138%3Flang%3Den&mediaType=4&url=metadata%3A%2F%2Fseasons%2F23%2Fepisodes%2F138%2Fthumbs%2Fcom%2Eplexapp%2Eagents%2Ethetvdb_b0f08097f4d225b26789fbdbc988ea5a1047edea
Mar 26, 2019 00:03:45.760 [0x7f0411bff700] DEBUG - HTTP 200 response from GET http://127.0.0.1:36491/system/agents/media/get?guid=com%2Eplexapp%2Eagents%2Ethetvdb%3A%2F%2F71256%2F23%2F138%3Flang%3Den&mediaType=4&url=metadata%3A%2F%2Fseasons%2F23%2Fepisodes%2F138%2Fthumbs%2Fcom%2Eplexapp%2Eagents%2Ethetvdb_b0f08097f4d225b26789fbdbc988ea5a1047edea
Mar 26, 2019 00:03:45.761 [0x7f04467fe700] DEBUG - Brought media local (metadata://seasons/23/episodes/138/thumbs/com.plexapp.agents.thetvdb_b0f08097f4d225b26789fbdbc988ea5a1047edea) in 0.0s.
Mar 26, 2019 00:03:45.761 [0x7f04467fe700] DEBUG - Loading information from media bundle for part ID 27399 from "/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Media/localhost/9/f09a9681e8509f00f7d53315fcb6aa7d61ad30c.bundle/Contents"
Mar 26, 2019 00:03:45.761 [0x7f04467fe700] DEBUG - There was a change for metadata item 16505 (Rob Corddry), saving.
Mar 26, 2019 00:03:45.761 [0x7f04467fe700] DEBUG - Updating metadata item (save) (Rob Corddry) with ID 16505
Mar 26, 2019 00:03:46.343 [0x7f048cbff700] DEBUG - Auth: authenticated user 1 as jasonwc
Mar 26, 2019 00:03:46.344 [0x7f04047f7700] DEBUG - Request: [::ffff:192.168.1.1:24615 (Subnet)] GET /diagnostics/logs (26 live) TLS GZIP Signed-in Token (jasonwc)
Mar 26, 2019 00:03:46.345 [0x7f04047f7700] DEBUG - Diagnostics: Building logfile zip

Is all metadata indexing / updating.

Frankly, you’re hammering on the drive pool.

Yeah, it appears it was indexing when I attempted the second test with the iOS iPad app. That could explain the slow performance. As for hammering the drive pool, there’s actually very little disk activity as 99.9% of these requests are being served from RAM. ARC shows ā€œData demand efficiency: 99.9 % 904.9M.ā€ However, my first test was properly run immediately after starting the server and indexing doesn’t appear to have been an issue there.

With that apparently resolved, I have some news to pass on to you in case you don’t know.

You’re using Debiam 4.9 (you know this part).

As of PMS 1.15.x, two things changed.

  1. glibc was bumped to 2.14.1 . Debian 6 doesn’t even have 2.14.1
  2. Official support is dropping init systems on Debian. Debian 8 is the minimum.

I am letting you know, per user request, I am writing new init support for PMS.
It will take a bit to get it working for all platforms but, if all testing passes, will be mainstream before long.

The only remaining requirements will be:

  1. glibc 2.14.1 - just to run PMS
  2. appropriate i915 kernel driver if QSV is to be used

Hi,

I’m actually using 9.8, the latest stable release of Debian. However, I am using kernel 4.9, which is likely what you’re referring to.

root@storage-server:~# cat /etc/issue
Debian GNU/Linux 9 \n \l

root@storage-server:~# cat /etc/debian_version
9.8

root@storage-server:~# uname -a
Linux storage-server 4.9.0-8-amd64 #1 SMP Debian 4.9.144-3.1 (2019-02-19) x86_64 GNU/Linux

root@storage-server:~# apt show libc6
Package: libc6
Version: 2.24-11+deb9u4

root@storage-server:~# apt show systemd
Package: systemd
Version: 232-25+deb9u9

I just did a new test on 1.14.1.5488. Based on our prior conversation, I disabled all periodic library scans. Maintenance should only occur between 2-5 AM and I just ran the test at 9:30 AM. I closed my iOS client, restarted PMS, and then loaded my Movie and Television libraries. I then downloaded the logs. Do you see anything abnormal in the logs?Plex Media Server Logs_2019-03-26_09-29-39.zip (3.0 MB)

While watching arcstat, the high IO activity (200K/sec) happened simultaneously with this PMS log output. What exactly is PMS doing?

Mar 26, 2019 09:40:18.244 [0x7f846ebfe700] DEBUG - Auth: authenticated user 1 as jasonwc
Mar 26, 2019 09:40:18.245 [0x7f84533ff700] DEBUG - Request: [::ffff:192.168.2.104:60768 (WAN)] GET /library/sections/1/recentlyAdded?unwatched=1 (19 live) TLS Signed-in Token (jasonwc)
Mar 26, 2019 09:40:18.257 [0x7f846f3ff700] DEBUG - Auth: authenticated user 1 as jasonwc
Mar 26, 2019 09:40:18.257 [0x7f84543ff700] DEBUG - Request: [::ffff:192.168.2.104:60770 (WAN)] GET /library/sections/1/onDeck (19 live) TLS Signed-in Token (jasonwc)
Mar 26, 2019 09:40:18.268 [0x7f846ebfe700] DEBUG - Auth: authenticated user 1 as jasonwc
Mar 26, 2019 09:40:18.268 [0x7f844c7ff700] DEBUG - Request: [::ffff:192.168.2.104:60772 (WAN)] GET /library/sections/1/arts?sort=random (19 live) TLS Page 0-49 Signed-in Token (jasonwc)
Mar 26, 2019 09:40:18.285 [0x7f846f3ff700] DEBUG - Auth: authenticated user 1 as jasonwc
Mar 26, 2019 09:40:18.286 [0x7f84597fe700] DEBUG - Request: [::ffff:192.168.2.104:60776 (WAN)] GET /library/sections/1/onDeck (19 live) TLS Signed-in Token (jasonwc)
Mar 26, 2019 09:40:18.292 [0x7f846f3ff700] DEBUG - Auth: authenticated user 1 as jasonwc
Mar 26, 2019 09:40:18.292 [0x7f84533ff700] DEBUG - Request: [::ffff:192.168.2.104:60774 (WAN)] GET /library/sections/1/recentlyAdded?unwatched=1 (19 live) TLS Signed-in Token (jasonwc)
Mar 26, 2019 09:40:18.295 [0x7f846ebfe700] DEBUG - Completed: [::ffff:192.168.2.104:60768] 206 GET /library/sections/1/recentlyAdded?unwatched=1 (19 live) TLS 50ms 133959 bytes (pipelined: 1) (range: bytes=0 -)
Mar 26, 2019 09:40:18.322 [0x7f84533ff700] DEBUG - It took 120.000000 ms to retrieve 97 items.
Mar 26, 2019 09:40:18.324 [0x7f846f3ff700] DEBUG - Auth: authenticated user 1 as jasonwc
Mar 26, 2019 09:40:18.325 [0x7f84523ff700] DEBUG - Request: [::ffff:192.168.2.104:60768 (WAN)] GET /library/sections/1/arts?sort=random (19 live) TLS Page 0-49 Signed-in Token (jasonwc)
Mar 26, 2019 09:40:18.340 [0x7f846ebfe700] DEBUG - Completed: [::ffff:192.168.2.104:60774] 206 GET /library/sections/1/recentlyAdded?unwatched=1 (19 live) TLS 48ms 133959 bytes (pipelined: 1) (range: bytes=0 -)
Mar 26, 2019 09:40:18.361 [0x7f844c7ff700] DEBUG - Setting container serialization range to [0, 49] (total=-1)
Mar 26, 2019 09:40:18.363 [0x7f846ebfe700] DEBUG - Completed: [::ffff:192.168.2.104:60772] 206 GET /library/sections/1/arts?sort=random (19 live) TLS Page 0-49 94ms 10178 bytes (pipelined: 1) (range: bytes=0 -)
Mar 26, 2019 09:40:18.408 [0x7f84523ff700] DEBUG - Setting container serialization range to [0, 49] (total=-1)
Mar 26, 2019 09:40:18.410 [0x7f846ebfe700] DEBUG - Completed: [::ffff:192.168.2.104:60768] 206 GET /library/sections/1/arts?sort=random (19 live) TLS Page 0-49 85ms 10247 bytes (pipelined: 2) (range: bytes=0 -)
Mar 26, 2019 09:40:21.044 [0x7f84543ff700] DEBUG - We’re going to try to auto-select an audio stream for account 1.
Mar 26, 2019 09:40:21.044 [0x7f84597fe700] DEBUG - We’re going to try to auto-select an audio stream for account 1.
Mar 26, 2019 09:40:21.044 [0x7f84543ff700] DEBUG - Selecting best audio stream for part ID 47091 (autoselect: 1 language: en)
Mar 26, 2019 09:40:21.044 [0x7f84597fe700] DEBUG - Selecting best audio stream for part ID 47091 (autoselect: 1 language: en)
Mar 26, 2019 09:40:21.044 [0x7f84543ff700] DEBUG - We’re going to try to auto-select a subtitle.
Mar 26, 2019 09:40:21.044 [0x7f84597fe700] DEBUG - We’re going to try to auto-select a subtitle.
Mar 26, 2019 09:40:21.044 [0x7f84543ff700] DEBUG - Audio Stream: 190136, Subtitle Stream: -1
Mar 26, 2019 09:40:21.044 [0x7f84597fe700] DEBUG - Audio Stream: 190136, Subtitle Stream: -1
Mar 26, 2019 09:40:21.045 [0x7f84597fe700] DEBUG - We’re going to try to auto-select an audio stream for account 1.
Mar 26, 2019 09:40:21.045 [0x7f84597fe700] DEBUG - Selecting best audio stream for part ID 46571 (autoselect: 1 language: en)
Mar 26, 2019 09:40:21.045 [0x7f84597fe700] DEBUG - We’re going to try to auto-select a subtitle.
Mar 26, 2019 09:40:21.045 [0x7f84597fe700] DEBUG - Found a suitable native subtitle language [eng] for a foreign film
Mar 26, 2019 09:40:21.045 [0x7f84543ff700] DEBUG - We’re going to try to auto-select an audio stream for account 1.
Mar 26, 2019 09:40:21.046 [0x7f84597fe700] DEBUG - Audio Stream: 188507, Subtitle Stream: 188508
Mar 26, 2019 09:40:21.046 [0x7f84543ff700] DEBUG - Selecting best audio stream for part ID 46571 (autoselect: 1 language: en)
Mar 26, 2019 09:40:21.046 [0x7f84543ff700] DEBUG - We’re going to try to auto-select a subtitle.
Mar 26, 2019 09:40:21.046 [0x7f84543ff700] DEBUG - Found a suitable native subtitle language [eng] for a foreign film
Mar 26, 2019 09:40:21.046 [0x7f84543ff700] DEBUG - Audio Stream: 188507, Subtitle Stream: 188508
Mar 26, 2019 09:40:21.046 [0x7f84597fe700] DEBUG - We’re going to try to auto-select an audio stream for account 1.
Mar 26, 2019 09:40:21.046 [0x7f84597fe700] DEBUG - Selecting best audio stream for part ID 46601 (autoselect: 1 language: en)
Mar 26, 2019 09:40:21.046 [0x7f84597fe700] DEBUG - We’re going to try to auto-select a subtitle.
Mar 26, 2019 09:40:21.046 [0x7f84597fe700] DEBUG - Audio Stream: 189056, Subtitle Stream: -1
Mar 26, 2019 09:40:21.046 [0x7f84543ff700] DEBUG - We’re going to try to auto-select an audio stream for account 1.
Mar 26, 2019 09:40:21.046 [0x7f84543ff700] DEBUG - Selecting best audio stream for part ID 46601 (autoselect: 1 language: en)
Mar 26, 2019 09:40:21.046 [0x7f84543ff700] DEBUG - We’re going to try to auto-select a subtitle.
Mar 26, 2019 09:40:21.046 [0x7f84543ff700] DEBUG - Audio Stream: 189056, Subtitle Stream: -1
Mar 26, 2019 09:40:21.047 [0x7f84597fe700] DEBUG - We’re going to try to auto-select an audio stream for account 1.
Mar 26, 2019 09:40:21.047 [0x7f84597fe700] DEBUG - Selecting best audio stream for part ID 46226 (autoselect: 1 language: en)
Mar 26, 2019 09:40:21.047 [0x7f84597fe700] DEBUG - We’re going to try to auto-select a subtitle.
Mar 26, 2019 09:40:21.047 [0x7f84597fe700] DEBUG - Audio Stream: 187553, Subtitle Stream: -1
Mar 26, 2019 09:40:21.047 [0x7f84543ff700] DEBUG - We’re going to try to auto-select an audio stream for account 1.
Mar 26, 2019 09:40:21.047 [0x7f84543ff700] DEBUG - Selecting best audio stream for part ID 46226 (autoselect: 1 language: en)
Mar 26, 2019 09:40:21.047 [0x7f84543ff700] DEBUG - We’re going to try to auto-select a subtitle.
Mar 26, 2019 09:40:21.047 [0x7f84543ff700] DEBUG - Audio Stream: 187553, Subtitle Stream: -1
Mar 26, 2019 09:40:21.048 [0x7f84597fe700] DEBUG - We’re going to try to auto-select an audio stream for account 1.
Mar 26, 2019 09:40:21.048 [0x7f84597fe700] DEBUG - Selecting best audio stream for part ID 45828 (autoselect: 1 language: en)
Mar 26, 2019 09:40:21.048 [0x7f84597fe700] DEBUG - We’re going to try to auto-select a subtitle.
Mar 26, 2019 09:40:21.048 [0x7f84597fe700] DEBUG - Audio Stream: 186299, Subtitle Stream: -1
Mar 26, 2019 09:40:21.048 [0x7f84543ff700] DEBUG - We’re going to try to auto-select an audio stream for account 1.
Mar 26, 2019 09:40:21.048 [0x7f84543ff700] DEBUG - Selecting best audio stream for part ID 45828 (autoselect: 1 language: en)
Mar 26, 2019 09:40:21.049 [0x7f84543ff700] DEBUG - We’re going to try to auto-select a subtitle.
Mar 26, 2019 09:40:21.049 [0x7f84543ff700] DEBUG - Audio Stream: 186299, Subtitle Stream: -1
Mar 26, 2019 09:40:21.049 [0x7f846f3ff700] DEBUG - Completed: [::ffff:192.168.2.104:60776] 206 GET /library/sections/1/onDeck (19 live) TLS 2763ms 27206 bytes (pipelined: 1) (range: bytes=0-)
Mar 26, 2019 09:40:21.050 [0x7f846ebfe700] DEBUG - Completed: [::ffff:192.168.2.104:60770] 206 GET /library/sections/1/onDeck (19 live) TLS 2792ms 27206 bytes (pipelined: 1) (range: bytes=0-)

192.168.2.104 is one of my clients, but it’s currently idle.

Thanks to some helpful advice from a user on #plex, I discovered that the reason the repaired database was read-only was that I ran the repair commands as root. I needed to chown plex:plex the resulting database. I have now repaired the database and restarted Plex with the new database. Again, no errors were detected. (I wouldn’t expect any errors given that ZFS is always consistent on disk and the system is backed by a UPS that can provide 30+ minutes of power).

While likely unrelated to my original issue I discovered that by making two changes two changes, I could dramatically increase the IO activity associated with Plex (reduced data reads by nearly an order of magnitude from 6200/sec to 700/sec :

  1. Reduce periodic library scan frequency to once per day (Note: Plex still scans automatically when changes are detected, doing a partial library scan)

  2. I turned off the 192.168.2.104 (OpenPHT) client that appeared to be generating hundreds of thousands of ARC hits/sec when updating, despite being totally idle.

Before (Data Hits):

After (Data Hits):

Before (Metadata Hits):

After (Metadata Hits:):

The metadata graphs continue to show peaks every 15 minutes, which is caused by zfs-auto-snapshot creating and destroying snapshots.