Plex Server Cache over 310GB

Is it normal for the Cache folder within Plex to be over 310GB?

I have a fair chunk of media (roughly 52TB) so could understand if it was just never realized it could get so big, Also if that is normal I have it running on an SSD is there any advantage to me doing this or should I move it to a bigger standard drive?

That’s pretty normal for a big collection. Get a bigger SSD. There is a Butler job designed to cut down the size of the cache. Make sure it is run regularly

https://support.plex.tv/articles/201553286-scheduled-tasks/

2 Likes

@TheSt4gz I’ve been wondering the same thing as my cache grows at an insanely rapid rate.
Today I think I’ve found a possible reason. I am currently running md5deep on the cache directory and so far I am finding a TON of duplicate files with matching hashes…

Out of ~1 million cache files I’ve hashed ~500k so far. From that I’ve found 815 sets of duplicates. Most duplicates are less than 50 copies but I have a few with 55K+ copies. Looking at one of the larger images with duplicates I found that the video file was corrupt and causing the scanner job to throw an error. I’m guessing when the scanner throws the error it’s not cleaning up what was done and resulting in tons of useless copies.

Version 1.13.2.5154 (Linux)

Final Results:

Total Files In Cache: 959181
Unique Files: 25603
Duplicate Files: 933578

List of duplicate hashes:
http://paste.ubuntu.com/p/Wkkx7nfMxG/

Can you tell which subfolder below cache holds those duplicates?

Is it /Cache/Transcode/Sessions
or /Cache/PhotoTranscoder/
or something else?

/Cache/PhotoTranscoder/

What happens if you close/quit/stop/end plex server
wait a few seconds
restart Plex server
let it sit for a few minutes
?

Is the folder cleared out?

Do you have this butler job enabled:
Settings - Server - Scheduled Tasks - ā€˜Remove old cache files every week’
?

@OttoKerner

I restarted Plex earlier today when I had to grow the filesystem. Not clearing dupes.

ā€œRemove old cache files every weekā€ is on.

I did some further analysis and it looks like Plex is creating roughly 4000 duplicate files per day for each one of those 8 that are over 100K.

As mentioned before, my bet is on Plex starting the metadata/cache collection process for a particular movie/episode then erroring out. After erroring out the data that was gathered isn’t cleared out or logged anywhere in the database for Plex to re-use or clean later.

Can you compare the timestamps on those dupes and crossreference that with server log files? It’d be interesting to see what was done at the time and which media files are connected to the issue.

Hi @OttoKerner

I have reproduced the issue manually. Below are the steps on a Linux box with a description. I am using a movie that has never(and will never) been in my library - The Emoji Movie (2017).

Here are the steps to reproduce:

  1. mkdir "/PATH/TO/Movies/The Emoji Movie (2017)" #create directory for movie

  2. dd if=/dev/urandom of="/PATH/TO/Movies/The Emoji Movie (2017)/The Emoji Movie (2017).mkv" bs=1M count=1000 #create dummy movie file with no valid data(random)

  3. Scan for media in Plex or wait for scans to occur

  4. grep -i "Emoji" "/path/to/Plex Media Scanner*" #errors will appear for the movie about invalid data

  5. find /path/to/PhotoTranscoder -type f -mmin -10 -printf "%C+ - " -exec md5sum "{}" \; #list new files within the past 10 minutes.

From the steps above you should see something like what I received:

Result from step 4

Plex Media Scanner.log:Jul 03, 2018 20:10:28.695 [0x7f50917ff700] ERROR - Exception analyzing media file ā€˜/mnt/storage/Media/Movies/The Emoji Movie (2017)/The Emoji Movie (2017).mkv’ (Could not parse /mnt/storage/Media/Movies/The Emoji Movie (2017)/The Emoji Movie (2017).mkv (error=-1094995529): Invalid data found when processing input)
Plex Media Scanner.log:Jul 03, 2018 20:10:28.819 [0x7f50917ff700] ERROR - Exception extracting thumbnail from media file ā€˜/mnt/storage/Media/Movies/The Emoji Movie (2017)/The Emoji Movie (2017).mkv’ (Could not parse /mnt/storage/Media/Movies/The Emoji Movie (2017)/The Emoji Movie (2017).mkv (error=-1094995529): Invalid data found when processing input)
Plex Media Scanner.log:Jul 03, 2018 20:10:28.938 [0x7f50917ff700] ERROR - Exception extracting thumbnail from media file ā€˜/mnt/storage/Media/Movies/The Emoji Movie (2017)/The Emoji Movie (2017).mkv’ (Could not parse /mnt/storage/Media/Movies/The Emoji Movie (2017)/The Emoji Movie (2017).mkv (error=-1094995529): Invalid data found when processing input)

Result from step 5

2018-07-03+20:10:30.0106523180 - fca35fa4158c997f8ef50090c4cab220 PhotoTranscoder/7b/7b932733dcc143c88583d17c563d1adf4d08ec62.jpg
2018-07-03+20:10:29.8386523180 - 3c82442154c89ff490545f3adf3517ac PhotoTranscoder/60/60d83443f69dd327ba132e8f310ead01fd4ca9db.jpg
2018-07-03+20:10:37.5426523180 - efad4dd6c6f8c14e580afbe3a1a08c24 PhotoTranscoder/f1/f105120df951865eaee351cc66f52b8f1236d01a.jpg
2018-07-03+20:10:38.1306523180 - 65062e9a3283d5b476b1d70554e397f6 PhotoTranscoder/55/555319943d4f95d1a0cfa82ead39792887540d55.jpg
2018-07-03+20:10:40.6386523180 - ca020f7484b564229f1aae9543eadd7a PhotoTranscoder/32/32532a1ae046c492ddeedfd78fae79f5dfa2ec3e.jpg
2018-07-03+20:10:39.1546523180 - 3c82442154c89ff490545f3adf3517ac PhotoTranscoder/02/02a88460ae2a2efb721869f70a895956ddaaad30.jpg

You can see two files with the hash 3c82442154c89ff490545f3adf3517ac which are from that movie.

Hope this helps.

3 Likes

Thanks for performing this test!
May I have the full set of logs for analysis by the developers?
You can PM me.

2 Likes

@OttoKerner sent

Thank you very much for all the detailed diagnostics.

I too have reproduced the continuous growth of the cache
The issue is two-fold

  1. You have PlexKodiConnect repeatedly making requests for the metadata info over and over again - every few seconds.
  2. For a media file that has not been analyzed, everytime a Plex client requests it or requests metadata info, the server attempts to do an Media Analysis. This results in new thumbs being generated. If there is an issue with the file, then the analysis will not complete and will get re-analyzed later. In your case few seconds later when PlexKodiConnect is repeating the request

This is an extract showing the repeated requests from PlexKodiConnect for the test movie that you added and we know it fails to analyze

Jul 05, 2018 11:23:31.003 [0x7f85b17ff700] DEBUG - Request: [192.168.99.110:51046 (Allowed Network)] GET /library/metadata/20983?includeRelated=0&includeExtras=1&includeReviews=1&checkFiles=0 (7 live) TLS Signed-in Token (kantlivelong)
Jul 05, 2018 11:23:31.011 [0x7f85c9fff700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:23:36.998 [0x7f85b17ff700] DEBUG - Request: [192.168.99.110:51046 (Allowed Network)] GET /library/metadata/20983?includeRelated=0&includeExtras=1&includeReviews=1&checkFiles=0 (9 live) TLS Signed-in Token (kantlivelong)
Jul 05, 2018 11:23:37.006 [0x7f85b0ffe700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:23:42.975 [0x7f85bafff700] DEBUG - Request: [192.168.99.110:51046 (Allowed Network)] GET /library/metadata/20983?includeRelated=0&includeExtras=1&includeReviews=1&checkFiles=0 (10 live) TLS Signed-in Token (kantlivelong)
Jul 05, 2018 11:23:42.987 [0x7f85c9fff700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:23:49.042 [0x7f85c57fa700] DEBUG - Request: [192.168.99.110:51046 (Allowed Network)] GET /library/metadata/20983?includeRelated=0&includeExtras=1&includeReviews=1&checkFiles=0 (10 live) TLS Signed-in Token (kantlivelong)
Jul 05, 2018 11:23:49.060 [0x7f85b0ffe700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:24:00.972 [0x7f85bbffe700] DEBUG - Request: [192.168.99.110:51046 (Allowed Network)] GET /library/metadata/20983?includeRelated=0&includeExtras=1&includeReviews=1&checkFiles=0 (10 live) TLS Signed-in Token (kantlivelong)
Jul 05, 2018 11:24:00.981 [0x7f85b0ffe700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:24:07.035 [0x7f85bbffe700] DEBUG - Request: [192.168.99.110:51046 (Allowed Network)] GET /library/metadata/20983?includeRelated=0&includeExtras=1&includeReviews=1&checkFiles=0 (10 live) TLS Signed-in Token (kantlivelong)
Jul 05, 2018 11:24:07.043 [0x7f85b07fd700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:24:13.001 [0x7f85b6fff700] DEBUG - Request: [192.168.99.110:51046 (Allowed Network)] GET /library/metadata/20983?includeRelated=0&includeExtras=1&includeReviews=1&checkFiles=0 (10 live) TLS Signed-in Token (kantlivelong)
Jul 05, 2018 11:24:13.009 [0x7f85b0ffe700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:24:18.979 [0x7f85bafff700] DEBUG - Request: [192.168.99.110:51046 (Allowed Network)] GET /library/metadata/20983?includeRelated=0&includeExtras=1&includeReviews=1&checkFiles=0 (10 live) TLS Signed-in Token (kantlivelong)
Jul 05, 2018 11:24:18.988 [0x7f85b07fd700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:24:24.932 [0x7f85b17ff700] DEBUG - Request: [192.168.99.110:51046 (Allowed Network)] GET /library/metadata/20983?includeRelated=0&includeExtras=1&includeReviews=1&checkFiles=0 (10 live) TLS Signed-in Token (kantlivelong)
Jul 05, 2018 11:24:24.943 [0x7f85afffc700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:24:31.025 [0x7f85bbffe700] DEBUG - Request: [192.168.99.110:51046 (Allowed Network)] GET /library/metadata/20983?includeRelated=0&includeExtras=1&includeReviews=1&checkFiles=0 (10 live) TLS Signed-in Token (kantlivelong)
Jul 05, 2018 11:24:31.035 [0x7f85b07fd700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:24:37.015 [0x7f85c57fa700] DEBUG - Request: [192.168.99.110:51046 (Allowed Network)] GET /library/metadata/20983?includeRelated=0&includeExtras=1&includeReviews=1&checkFiles=0 (9 live) TLS Signed-in Token (kantlivelong)
Jul 05, 2018 11:24:37.023 [0x7f85afffc700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:24:43.003 [0x7f85b9fff700] DEBUG - Request: [192.168.99.110:51046 (Allowed Network)] GET /library/metadata/20983?includeRelated=0&includeExtras=1&includeReviews=1&checkFiles=0 (9 live) TLS Signed-in Token (kantlivelong)
Jul 05, 2018 11:24:43.011 [0x7f85b07fd700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:24:48.955 [0x7f85b6fff700] DEBUG - Request: [192.168.99.110:51046 (Allowed Network)] GET /library/metadata/20983?includeRelated=0&includeExtras=1&includeReviews=1&checkFiles=0 (9 live) TLS Signed-in Token (kantlivelong)
Jul 05, 2018 11:24:48.976 [0x7f85afffc700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:25:01.010 [0x7f85b9fff700] DEBUG - Request: [192.168.99.110:51046 (Allowed Network)] GET /library/metadata/20983?includeRelated=0&includeExtras=1&includeReviews=1&checkFiles=0 (9 live) TLS Signed-in Token (kantlivelong)
Jul 05, 2018 11:25:01.028 [0x7f85afffc700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:25:12.911 [0x7f85b17ff700] DEBUG - Request: [192.168.99.110:51046 (Allowed Network)] GET /library/metadata/20983?includeRelated=0&includeExtras=1&includeReviews=1&checkFiles=0 (9 live) TLS Signed-in Token (kantlivelong)
Jul 05, 2018 11:25:12.920 [0x7f85c9fff700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:25:18.996 [0x7f85c57fa700] DEBUG - Request: [192.168.99.110:51046 (Allowed Network)] GET /library/metadata/20983?includeRelated=0&includeExtras=1&includeReviews=1&checkFiles=0 (10 live) TLS Signed-in Token (kantlivelong)
Jul 05, 2018 11:25:19.004 [0x7f85bc7ff700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.

I think asking for same media item metadata every 5-7 seconds is not normal Plex client behavior - so i would suggest raising that as an issue with the PlexKodiConnect

I did some testing with a file that does analyze ok. New cache files get created when there is a request for a different size image or if the media file got re-analyzed. You can end up with some cache files being identical but there should not be many

To add to my earlier - you actually had two machines running PlexKodiConnect hammering the server - the earlier one I pasted was from IP 192.168.99.110. There were also repeated requests for the same file from 192.168.99.253

Jul 05, 2018 11:08:20.768 [0x7f85bbffe700] DEBUG - Request: [192.168.199.253:35926 (WAN)] GET /library/metadata/20983?skipRefresh=1 (14 live) TLS GZIP Signed-in Token (kantlivelong)
Jul 05, 2018 11:21:41.140 [0x7f85c3ff7700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:21:42.037 [0x7f85bc7ff700] DEBUG - Request: [192.168.199.253:36302 (WAN)] GET /library/metadata/20983?skipRefresh=1 (7 live) TLS GZIP Signed-in Token (kantlivelong)
Jul 05, 2018 11:23:31.011 [0x7f85c9fff700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:23:31.948 [0x7f85bbffe700] DEBUG - Request: [192.168.199.253:36366 (WAN)] GET /library/metadata/20983?skipRefresh=1 (12 live) TLS GZIP Signed-in Token (kantlivelong)
Jul 05, 2018 11:23:37.006 [0x7f85b0ffe700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:23:37.851 [0x7f85b9fff700] DEBUG - Request: [192.168.199.253:36366 (WAN)] GET /library/metadata/20983?skipRefresh=1 (13 live) TLS GZIP Signed-in Token (kantlivelong)
Jul 05, 2018 11:23:42.987 [0x7f85c9fff700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:23:43.873 [0x7f85bbffe700] DEBUG - Request: [192.168.199.253:36366 (WAN)] GET /library/metadata/20983?skipRefresh=1 (14 live) TLS GZIP Signed-in Token (kantlivelong)
Jul 05, 2018 11:23:49.060 [0x7f85b0ffe700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:23:50.921 [0x7f85b6fff700] DEBUG - Request: [192.168.199.253:36366 (WAN)] GET /library/metadata/20983?skipRefresh=1 (14 live) TLS GZIP Signed-in Token (kantlivelong)
Jul 05, 2018 11:24:00.981 [0x7f85b0ffe700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:24:01.843 [0x7f85b6fff700] DEBUG - Request: [192.168.199.253:36366 (WAN)] GET /library/metadata/20983?skipRefresh=1 (14 live) TLS GZIP Signed-in Token (kantlivelong)
Jul 05, 2018 11:24:07.043 [0x7f85b07fd700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:24:07.900 [0x7f85bc7ff700] DEBUG - Request: [192.168.199.253:36366 (WAN)] GET /library/metadata/20983?skipRefresh=1 (14 live) TLS GZIP Signed-in Token (kantlivelong)
Jul 05, 2018 11:24:13.009 [0x7f85b0ffe700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:24:13.877 [0x7f85b17ff700] DEBUG - Request: [192.168.199.253:36366 (WAN)] GET /library/metadata/20983?skipRefresh=1 (14 live) TLS GZIP Signed-in Token (kantlivelong)
Jul 05, 2018 11:24:18.988 [0x7f85b07fd700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:24:19.837 [0x7f85bc7ff700] DEBUG - Request: [192.168.199.253:36366 (WAN)] GET /library/metadata/20983?skipRefresh=1 (14 live) TLS GZIP Signed-in Token (kantlivelong)
Jul 05, 2018 11:24:24.943 [0x7f85afffc700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:24:25.812 [0x7f85b6fff700] DEBUG - Request: [192.168.199.253:36366 (WAN)] GET /library/metadata/20983?skipRefresh=1 (14 live) TLS GZIP Signed-in Token (kantlivelong)
Jul 05, 2018 11:24:31.035 [0x7f85b07fd700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:24:31.918 [0x7f85bc7ff700] DEBUG - Request: [192.168.199.253:36366 (WAN)] GET /library/metadata/20983?skipRefresh=1 (14 live) TLS GZIP Signed-in Token (kantlivelong)
Jul 05, 2018 11:24:37.023 [0x7f85afffc700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:24:37.871 [0x7f85b17ff700] DEBUG - Request: [192.168.199.253:36366 (WAN)] GET /library/metadata/20983?skipRefresh=1 (13 live) TLS GZIP Signed-in Token (kantlivelong)
Jul 05, 2018 11:24:43.011 [0x7f85b07fd700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:24:43.851 [0x7f85bbffe700] DEBUG - Request: [192.168.199.253:36366 (WAN)] GET /library/metadata/20983?skipRefresh=1 (13 live) TLS GZIP Signed-in Token (kantlivelong)
Jul 05, 2018 11:24:48.976 [0x7f85afffc700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:24:50.885 [0x7f85bbffe700] DEBUG - Request: [192.168.199.253:36366 (WAN)] GET /library/metadata/20983?skipRefresh=1 (13 live) TLS GZIP Signed-in Token (kantlivelong)
Jul 05, 2018 11:25:01.028 [0x7f85afffc700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:25:04.637 [0x7f85bafff700] DEBUG - Request: [192.168.199.253:36366 (WAN)] GET /library/metadata/20983?skipRefresh=1 (11 live) TLS GZIP Signed-in Token (kantlivelong)
Jul 05, 2018 11:25:12.920 [0x7f85c9fff700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:25:14.703 [0x7f85bafff700] DEBUG - Request: [192.168.199.253:36366 (WAN)] GET /library/metadata/20983?skipRefresh=1 (11 live) TLS GZIP Signed-in Token (kantlivelong)
Jul 05, 2018 11:25:19.004 [0x7f85bc7ff700] DEBUG - Media Analyzer: Performing on-the-fly analysis on 1 item.
Jul 05, 2018 11:25:20.752 [0x7f85b9fff700] DEBUG - Request: [192.168.199.253:36366 (WAN)] GET /library/metadata/20983?skipRefresh=1 (11 live) TLS GZIP Signed-in Token (kantlivelong)

That’s is correct, usually only one request is needed. I’d like to see the Kodi debug log files for that…

I am guessing that may be it is because the returned xml does not have media stream information and the request is retried.

@sa2000., interesting on the .253 as that was my laptop running the Plex WebUI.

@croneter, did you want me to open a ticket for PKC?

PS:
I wrote this script to symlink the dupes and it has reduced my cache significantly.

If you can repeat the steps - this time with verbose logging as well as debug, we can then see more info for these requests

Showing as wan because different subnet

Yes please, and share your Kodi debug log as explained in the wiki on Github

PKC won’t ask twice if data is missing.

My guess:

  1. PMS scans metadata of this ā€œcorruptā€ video
  2. PMS tells anyone listening on websocket that scan is finished (and element thus potentially changed)
  3. PKC wants to grab new metadata for the corrupt video and requests metadata from the PMS (what you see in the PMS log)
  4. PMS sees that the item is corrupt and concludes that the video needs to be scanned again -> start again at 1.
  5. Continue for eternity

Would have to be fixed on the PMS side, then.