DBRepair development

Yeap, thanks.

ALL:

Forum Preview: DBRepair.sh 01.03.00

  • Adds new menu option “Purge” (or “remove”)
    – Removes image files from the Cache/PhotoTranscoder directory which PMS has missed
    – Cache age default = 30 days. Override available via the DBREPAIR_CACHEAGE variable (below)

Introduces environment variable support for customizing DBRepair operation.

  • DBREPAIR_PAGESIZE=N (where N is a multiple of 1024, less than or equal to 65536)
    Most beneficial on ZFS filesystems, allowing tuning to the Dataset size.

  • DBREPAIR_CACHEAGE=N (where N is the maximum age, in days, of image files to retain when purging the Cache/PhotoTranscoder directory)
    (PMS does well at maintaining this itself, but some older systems might need an assist)
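For example, a minimal invocation sketch with illustrative values (the script name and variable names are from the post above; the numbers are examples, not recommendations):

```shell
# Illustrative values only; tune for your own system.
export DBREPAIR_PAGESIZE=4096   # SQLite page size in bytes
export DBREPAIR_CACHEAGE=14     # purge cached images older than 14 days
# ./DBRepair.sh                 # then launch DBRepair as usual
```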

Please let me know how well this behaves before I release it into the wild.

This capability will let us add other customizations as needed.

Thanks,
Chuck

PlexDBRepair-1.03.00-Forum-Preview.tar (140 KB)


PlexDBRepair v1.03.00 has been released.

Please see the updated documentation in the README.md


Was this supposed to run during the automatic script and default to “yes” for the prune? I ran the automatic script and it didn’t seem to do any pruning until I ran the prune option.

Also, I had 687 files that were pruned.

Also, just to make sure, as I am not aware of what happens here: if you run the prune command and back out of the session, will it continue to run until finished? I have seen, on Reddit and here on these forums, cases where it takes days or even a week or more for the files to actually delete, on Synology boxes at least.

The prune command, under the hood, directly runs a Linux find command to find the older files and immediately remove them from the file system. The purged cache files will be gone before the script finishes executing.

The actual command that is executed is:

find "$TransCacheDir" \( -name \*.jpg -o -name \*.jpeg -o -name \*.png \) -mtime +${CacheAge} -exec rm -f {} \;

What this is doing is:

  1. Run the find command
  2. With the target directory being whatever path was specified or found earlier in the script
  3. Looking for files with the extension .jpg, .jpeg, or .png
  4. That have been modified greater than $CacheAge days ago (default value is 30 days, but you can specify the age)
  5. And with any that are found immediately execute the rm -f command against them (rm being the *NIX command for remove and the -f switch being an instruction that says “force it, don’t ask”)
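The steps above can be rehearsed harmlessly by swapping -exec rm -f {} \; for -print, which lists the candidates without deleting anything. A dry-run sketch against a scratch directory (GNU touch assumed for the -d option; the variable names mirror the script’s):

```shell
# Dry run of the prune: list candidate files instead of deleting them.
TransCacheDir=$(mktemp -d)                       # stand-in for the real cache
CacheAge=30
touch "$TransCacheDir/new.jpg"                   # recent: not matched
touch -d "40 days ago" "$TransCacheDir/old.jpg"  # stale: would be purged
find "$TransCacheDir" \( -name \*.jpg -o -name \*.jpeg -o -name \*.png \) \
    -mtime +"$CacheAge" -print                   # prints only the old.jpg path
```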

If you want to see the full logic of what’s happening, the check-in diffs are visible in Git at:

More specifically:
-f, --force: ignore nonexistent files and arguments, never prompt
-i would prompt before removal; the default is not to prompt.

What I was trying to address is that even running that rm -f command, it may take days or weeks for people with Syno boxes (at least) to actually delete that data.

If this is run as part of the “auto” option, then I’d assume Plex would remain off until all files are deleted, unless it is restarted before running the pruning.

Also, if you are running manually, once you run the prune command it may sit there for days, seemingly unresponsive, even though it is actually working. Just pointing it out given what I have seen, in case any changes need to be made given those circumstances.

@djfriday13

I’m deliberately doing this One-at-a-time with the remove so you can more easily interrupt it.

If you have 500,000 files to prune and you say ‘yes’, it will sit there, doing the work immediately, UNMOVING and UNRESPONSIVE (unless you hit Control-C), until it’s complete.

It’s a little harder on the CPU but it’s all gated by the find command and won’t build up a backlog that has to flush out to disk after completion.

Here you see my test case of deleting 98,349 files

After 45 seconds, I interrupt it (Control-C)

It returns to the command line prompt immediately

I then start it again to continue until done, timing how long it takes

Yes, that first ‘prune’ might take you time but once done, it’ll be fine.
This is no different than that first optimization of the DB, true?

bash-4.4# tar xf qa-tv.tar.gz 
bash-4.4# ls -la
total 111148
drwx------   3 root  root      4096 Jan 17 12:51 .
drwxr-xr-x   6 root  root      4096 Jan 17 12:50 ..
-rw-------   1 root  root 113766679 Jan 17 12:51 qa-tv.tar.gz
drwxr-xr-x 940 chuck 1000     36864 Oct 14  2022 tv
bash-4.4# find tv -print | wc -l
104215
bash-4.4# find tv -type f -print | wc -l
98349
bash-4.4# time find tv -type f -exec rm -f {} \;
^C

real	0m44.192s
user	0m0.730s
sys	0m4.300s
bash-4.4# find tv -type f -print | wc -l
73525
bash-4.4# time find tv -type f -exec rm -f {} \;

real	2m7.801s
user	0m1.860s
sys	0m12.140s
bash-4.4# 

Ya, it has to do what it has to do, I guess. I was just thinking that maybe you need some info in the README or somewhere, letting users know that on first run, if you have that many files, it may be a while. I don’t think anyone expects to hit a delete command and have it finish 7 days later, ha! But people with these problems don’t figure it out until their NAS starts running out of storage.

Anyway, thanks for the tool update. It’ll be easier to send people to the tool to prune their files than to explain to them about their transcode tmp folder issue.

I’ll add a few words of caution

WARNING: This might take extended time to complete the first time you use it. Do not panic.

And suggest a removal rate of 100,000 files per 2 minutes

#### Warning: Initial pruning might take longer than expected.
  Execution time, using a Synology DS418 as a benchmark, is approximately 100,000 image files per 2 minutes.

85,000 files were deleted pretty fast on my rPi5

However, you may want to test utilizing xargs as it should speed things up.
This site gives some performance stats of find -exec vs find | xargs
(see 1.5 Performance here)
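For reference, a sketch of the find | xargs form (GNU or BSD find/xargs assumed for the NUL-separated -print0/-0 pairing, which keeps filenames with spaces intact; the scratch directory is purely for demonstration):

```shell
# xargs-based prune: find emits NUL-separated names and xargs batches them
# into a few rm invocations instead of forking rm once per file.
CacheDir=$(mktemp -d)
touch "$CacheDir/a.jpg" "$CacheDir/b with space.jpg" "$CacheDir/c.png"
find "$CacheDir" -type f \( -name '*.jpg' -o -name '*.png' \) -print0 \
    | xargs -0 rm -f
```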

Thanks @Tony_T

That would be a nice improvement.

I’ll double check all the supported environments to confirm xargs is available in them.

EDIT:

Almost forgot about the best way.

[chuck@lizum pqa.2036]$ find . -print | wc -l
104219
[chuck@lizum pqa.2037]$ sync
[chuck@lizum pqa.2038]$ time find . -type f -delete

real	0m2.349s
user	0m0.065s
sys	0m1.216s
[chuck@lizum pqa.2039]$

DS418 test

bash-4.4# time tar cf plex-qa-tv.zip ./tv

real	1m19.395s
user	0m0.930s
sys	0m8.270s
bash-4.4# sync
bash-4.4# sync; echo 3 > /proc/sys/vm/drop_caches
bash-4.4# ls
plex-qa-tv.zip	tv
bash-4.4# time find ./tv -delete

real	0m10.528s
user	0m0.150s
sys	0m7.510s
bash-4.4# 

SQLite page size must be a power of two between 512 and 65536 inclusive. There’s a really clever way to check for powers of two.

bash - check if numbers from command line are powers of 2 - Unix & Linux Stack Exchange


(I had comments about find blah -delete but y’all got there before I hit “post”. -delete is safer than xargs too.)

Yes, I could have been clever about the power of two check and rounding but, being ‘/bin/sh’, I had to resort to ‘expr’ so I was conservative.

It also states between 512 and 65536.

The OS default is 4096. Why someone would want anything below that makes no sense, so I went with multiples of 1024.
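For what it’s worth, the bit trick from that thread, n & (n - 1) == 0 for n > 0, also works in plain POSIX sh arithmetic with no expr required. A sketch:

```shell
# Power-of-two test using the classic bit trick: for n > 0,
# n is a power of two exactly when n & (n - 1) == 0.
# POSIX $(( )) arithmetic supports bitwise AND, so no expr is needed.
is_pow2() {
    [ "$1" -gt 0 ] 2>/dev/null && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}
is_pow2 4096  && echo "4096 is a power of two"
is_pow2 40960 || echo "40960 is not"
```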

Small typo in the example usage:

export DBREPAIR_PAGESIZE=65534

IMO 65536 is unlikely to be beneficial. You see a huge increase going from 1024 (the Plex default) to 4096 (the SQLite default). You see further increases going to 8192 and 16384. I doubt anybody will see material benefit going from 16384 to 32768 or 65536. But I’d love to see benchmarks!

Actually, wait. There’s only a few valid values. Probably easier just to check if the value is 1024 4096 8192 16384 32768 65536, right?

My real point is that it’s not “multiples of 1024”. 40960 is a multiple of 1024, but it’s not a valid value.
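A whitelist check like that is a one-liner case statement in sh. A sketch using the list above (SQLite itself also accepts 512 and 2048, which this list deliberately leaves out):

```shell
# Accept only the handful of page sizes listed in the post above.
valid_pagesize() {
    case "$1" in
        1024|4096|8192|16384|32768|65536) return 0 ;;
        *) return 1 ;;
    esac
}
valid_pagesize 4096  && echo "4096 accepted"
valid_pagesize 40960 || echo "40960 rejected"
```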

Plex’s default is 1024, which is crazy to me. I’ve been in bickering fights with devs about how much better 4096 - the SQLite default - performs.
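For anyone who wants to check this themselves, the sqlite3 command-line shell (assumed installed here) can report and change a database’s page size; the change only lands when VACUUM rebuilds the file, and a WAL-mode database has to leave WAL mode first. A sketch against a scratch database standing in for a copy of the library:

```shell
# Inspect and change a SQLite database's page size (sqlite3 CLI assumed).
# A scratch database is used here; on a real Plex DB, work on a copy.
db=$(mktemp -u).db
sqlite3 "$db" 'PRAGMA page_size = 1024; CREATE TABLE t(x); INSERT INTO t VALUES (1);'
sqlite3 "$db" 'PRAGMA page_size;'          # shows 1024
# The new page size takes effect only when VACUUM rebuilds the file,
# and WAL-mode databases must switch journal modes first.
sqlite3 "$db" 'PRAGMA journal_mode = DELETE; PRAGMA page_size = 4096; VACUUM;'
sqlite3 "$db" 'PRAGMA page_size;'          # shows 4096
```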

ZFS is where it benefits. EXT and XFS filesystems see no benefit
(See README.md)

I’ll fix the wording “Power of two”

:stuck_out_tongue:

LOL

You love making me rewrite stuff, don’t you!


I saw a significant drop in performance when pagesize exceeded a certain size. 4096 was an undeniable win. 8192 and 16384 were better only with tons of cache available to SQLite. 32768 and 65536 were worse.

That was all on a filesystem with the default ZFS recordsize of 128k.