Please don't delete library after file system read errors

My Plex server runs on a cheap PC, while my library is on a NAS.

Sometimes, the scanning seems to encounter errors. Unfortunately Plex then deletes the remainder of the library.

It should abort the scan instead.

 

Log extract:

Apr 14, 2014 00:52:21:547 [5884] DEBUG - Marking item 6511 as alive and well.
Apr 14, 2014 00:52:21:672 [5884] DEBUG - HTTP requesting to: http://127.0.0.1:32400/:/metadata/notify/toggleItemActivity?librarySectionID=1&metadataItemID=6511
Apr 14, 2014 00:52:21:688 [5884] DEBUG - Updating deletion state for metadata item 6505, is has a dead item count of 15.
Apr 14, 2014 00:52:21:688 [5884] DEBUG - Updating deletion state for metadata item 6504, is has a dead item count of 4.
Apr 14, 2014 00:52:21:844 [5884] DEBUG -     * Scanning directory Z:\Serien\c\c Season 2 Complete\Sesong 2 Episode 15 (parent: yes)
Apr 14, 2014 00:52:21:906 [5884] DEBUG - Adding file for scanner: Z:\Serien\c\c Season 2 Complete\Sesong 2 Episode 15\c.S02E15.VOSTFR.HDTV.XviD-0TV.avi
Apr 14, 2014 00:52:21:922 [5884] DEBUG - File 'Z:\Serien\c\c Season 2 Complete\Sesong 2 Episode 15\c.S02E15.VOSTFR.HDTV.XviD-0TV.avi' was marked as deleted, can't skip.
Apr 14, 2014 00:52:22:046 [5884] DEBUG -       * Scanning c Season 2 Episode 15
Apr 14, 2014 00:52:22:046 [5884] DEBUG - Looking for path match for [Z:\Serien\c\c Season 2 Complete\Sesong 2 Episode 15\c.S02E15.VOSTFR.HDTV.XviD-0TV.avi]
Apr 14, 2014 00:52:22:046 [5884] DEBUG - Path matched, we're reusing media item 6097
Apr 14, 2014 00:52:22:140 [5884] DEBUG - Bringing back media item 6097 to life.
Apr 14, 2014 00:52:22:249 [5884] DEBUG - Updating deletion state for metadata item 6512, is has a dead item count of 0.
Apr 14, 2014 00:52:22:249 [5884] DEBUG - Marking item 6512 as alive and well.
Apr 14, 2014 00:52:22:514 [5884] DEBUG - HTTP requesting to: http://127.0.0.1:32400/:/metadata/notify/toggleItemActivity?librarySectionID=1&metadataItemID=6512
Apr 14, 2014 00:52:22:530 [5884] DEBUG - Updating deletion state for metadata item 6505, is has a dead item count of 14.
Apr 14, 2014 00:52:22:530 [5884] DEBUG - Updating deletion state for metadata item 6504, is has a dead item count of 4.
Apr 14, 2014 00:52:22:561 [5884] DEBUG -     * Scanning directory Z:\Serien\c\c Season 2 Complete\Sesong 2 Episode 16 (parent: yes)
Apr 14, 2014 00:52:41:188 [5884] WARN - Error scanning directory, we'll skip and continue: boost::filesystem::last_write_time: The semaphore timeout period has expired: "Z:\Serien\c\c Season 2 Complete\Sesong 2 Episode 16"
Apr 14, 2014 00:52:41:188 [5884] DEBUG -     * Scanning directory Z:\Serien\c\c Season 2 Complete\Sesong 2 Episode 17 (parent: yes)
Apr 14, 2014 00:52:41:515 [5884] WARN - Error scanning directory, we'll skip and continue: boost::filesystem::last_write_time: The network path was not found: "Z:\Serien\c\c Season 2 Complete\Sesong 2 Episode 17"
[snip more errors]
Apr 14, 2014 18:40:24:872 [5884] DEBUG -     * Scanning directory Z:\Serien	 (parent: yes)
Apr 14, 2014 18:40:24:872 [5884] WARN - Error scanning directory, we'll skip and continue: boost::filesystem::last_write_time: The network path was not found: "Z:\Serien	"
Apr 14, 2014 18:40:25:699 [5884] DEBUG - Removing 391 media items that were left.
Apr 14, 2014 18:40:25:699 [5884] DEBUG - Soft-deleting media item 6098.
Apr 14, 2014 18:40:25:980 [5884] DEBUG - Soft-deleting media item 6099.
[soft deleting all the rest...]

 

In Preferences, under Library

Make sure that "Empty Trash after Scan" is not selected

/T

I did, but that just reduces the impact. It is still bad manners to delete the library if files could not be read due to file system errors, even if the files are just marked as deleted.

Library items should only be deleted if an actual "file not found" result is encountered.

It sounds like your Plex server can’t reach the files it’s trying to scan. In that case it has no way of telling whether the files have been deleted or they’re just temporarily unavailable. So the sane behaviour would be to assume the former, otherwise imagine how much of a pain in the ass it would be to actually delete files.


dane22’s recommendation is the best course of action for your situation. Otherwise, you should look into why the files on your NAS are sometimes unreachable.

I think using UNC paths instead of mapped drives for your libraries may solve your problem.

Use \\YOURNAS\Serien instead of mapped Z:\ for the same path.

It sounds like your Plex server can't reach the files it's trying to scan. In that case it has no way of telling whether the files have been deleted or they're just temporarily unavailable. So the sane behaviour would be to assume the former, otherwise imagine how much of a pain in the ass it would be to actually delete files.

I'm sorry, but that is just wrong. I do quite a bit of programming, and it is ridiculously easy to know why a file was not available. If the file is not there, you get a "File not found" error, while in the case shown in my log the errors / exceptions are different. The log itself shows that different errors were returned.

From a programmer's point of view, the behaviour is anything but sane. Not all errors are the same!

All that would be needed are about two lines of code that say "in case of file system read errors: abort the scan, try again next time".

What I am understanding from this is that you are not happy about this action:

Apr 14, 2014 18:40:25:699 [5884] DEBUG - Removing 391 media items that were left.

Apr 14, 2014 18:40:25:699 [5884] DEBUG - Soft-deleting media item 6098.
Apr 14, 2014 18:40:25:980 [5884] DEBUG - Soft-deleting media item 6099.
[soft deleting all the rest...]

which followed what looked like a better course of action initially:

Apr 14, 2014 00:52:41:515 [5884] WARN - Error scanning directory, we'll skip and continue: boost::filesystem::last_write_time: The network path was not found: "Z:\Serien\c\c Season 2 Complete\Sesong 2 Episode 17"
[snip more errors]

I believe you have a point: it would be better to just skip, leaving all metadata intact, rather than removing media metadata.

It is not actually deleting your media library - all it appears to be removing is the metadata information within Plex Media Server - not a big deal when compared to the real catastrophic impact of disk write / read errors that would compromise the integrity of any held information.

This would put a lot of burden on the developers.  While you may think it is very easy to determine why a file is not available, bear in mind that you have to consider all the different operating systems and then interpret why the file is unavailable.  There are many very final and irrecoverable errors that will not produce "file not found" due to anything from filesystem errors to physical disk issues.

It seems dane22 had the best advice.  If you frequently get these errors, time might be better spent in understanding why your files are failing so frequently.

Thank you sa2000 :). You are of course right, it just deletes the metadata, and I set this to soft-delete so the information is recovered in the next working scan.

@agregjones: I do not think that it is very easy; I know it is easy, because the abstraction of the operating systems is conveniently provided by the programming language or its standard library.

Examples:

POSIX: Use access() or check for ENOENT

Java: Use File.exists() or check for FileNotFoundException

PHP: Use file_exists()

My point is: All other final and irrecoverable errors should be written to the log, and then the scan should be aborted completely. It should not be assumed that the remaining files were deleted.
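To make the distinction concrete, here is a minimal Java sketch that separates "the file is gone" from "the file could not be read". The names PathProbe, State and probe are made up for this illustration and have nothing to do with the Plex source. It relies on java.nio.file reporting a missing file as NoSuchFileException, which is a distinct subclass of IOException. How a particular network failure (timeout, unreachable share) is reported still depends on the operating system and the protocol, so treat the UNREADABLE branch as a catch-all, not a precise diagnosis.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper for illustration only; not part of Plex.
final class PathProbe {
    enum State { PRESENT, MISSING, UNREADABLE }

    // Classifies a path by the kind of failure, not just success or failure.
    static State probe(Path path) {
        try {
            Files.readAttributes(path, BasicFileAttributes.class);
            return State.PRESENT;
        } catch (NoSuchFileException e) {
            // The file system answered and says nothing exists at this path.
            return State.MISSING;
        } catch (IOException e) {
            // Anything else: the path could not be checked, so its state is unknown.
            return State.UNREADABLE;
        }
    }
}
```

Under this scheme only MISSING would justify marking an item as deleted; UNREADABLE would mean "unknown, try again on the next scan".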

I am so glad I wasted 20 years managing development teams to be corrected so easily.  You can believe in that layer of file abstraction all you want, but the errors can differ wildly due to the way in which people access the data.  Many people do not house the content on the same machine as the PMS installation, meaning that you have to contend with all the possible exceptions due not only to a file system error but also the way in which they accessed the file (SMB, AFP, NFS) and whether the file is on a mounted volume or just using a reference.

So my suggestion is that you submit the code to handle the error processing yourself, proving just how simple it is.

I don't see the necessity of handling all different error codes. All I wish for is that the scan is aborted if any exception is encountered.

As per your suggestion, here's some Java pseudo-code, as I don't have access to the Plex source:

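import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// getDirFiles(), compareAndUpdateDatabase() and log are placeholders for Plex internals.
void updateDB(String[] directories) {
    List<String> foundFiles = new ArrayList<>();
    try {
        for (String dir : directories) {
            foundFiles.addAll(getDirFiles(dir)); // assumed to throw IOException on any read error
        }
        compareAndUpdateDatabase(foundFiles); // update the database only if the whole scan worked
    } catch (IOException e) {
        log.severe(e.getLocalizedMessage()); // log the exception but don't touch the database
    }
}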
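For anyone who wants something that actually runs, here is a self-contained sketch of the same idea using only the Java standard library. LibraryIndex, rescan and known are invented names, and the "database" is just an in-memory set, so this says nothing about how Plex stores or scans its library. The one design choice that matters is that the index is replaced only after every root has been walked without an I/O error.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a media library index; not Plex code.
final class LibraryIndex {
    private final Set<Path> known = new HashSet<>();

    Set<Path> known() {
        return new HashSet<>(known);
    }

    // Returns true if the index was updated, false if the scan was aborted.
    boolean rescan(List<Path> roots) {
        final Set<Path> found = new HashSet<>();
        try {
            for (Path root : roots) {
                // SimpleFileVisitor rethrows any I/O failure, including a missing root.
                Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
                    @Override
                    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                        if (attrs.isRegularFile()) {
                            found.add(file);
                        }
                        return FileVisitResult.CONTINUE;
                    }
                });
            }
        } catch (IOException e) {
            // Log and bail out; the previous index stays exactly as it was.
            System.err.println("Scan aborted, index left untouched: " + e);
            return false;
        }
        // Every root was read completely, so missing entries are treated as deleted.
        known.clear();
        known.addAll(found);
        return true;
    }
}
```

Note that a root directory that cannot be found counts as a failure here too, so an offline share aborts the scan instead of emptying the index.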

If you test your own source on all the supported operating systems against all the file access methods (AFP, SMB, NFS, etc.) and show that the results are as intended, that will be a start.

But what you seem not to be grasping is that "not there because the share is no longer available" is still as valid a reason to remove metadata as "the share is accessible but that file is not."  The current method seems to be working for many thousands of users.

that "not there because the share is no longer available" is still as valid a reason to remove metadata as "the share is accessible but that file is not."

I agree here

/T

[...]

But what you seem not to be grasping is that "not there because the share is no longer available" is still as valid a reason to remove metadata as "the share is accessible but that file is not."  The current method seems to be working for many thousands of users.

This is the point where I definitely do not agree. I configured all directories in Plex for a reason. If one of them is not available, I would very much prefer to be notified (in the log or in the interface) instead of having the metadata be deleted automatically.

early 2021 clean-up: implemented (by disabling automatic deletion of trash in combination with automatic/scheduled library scans)