Library scan crashes PMS and then DB becomes corrupted, effectively breaking installation

danemacmillan · January 15, 2025, 11:21pm

Server Version#: 1.41.3
Player Version#: – not relevant

I’m documenting this information for the unlikely chance this gets resolved. Either way, I’d rather this information occupy the public void than my personal backlog.

PMS Docker on MacOS

It works pretty well, especially when configured with a docker-compose.yml file, using the appropriate bridge setup, among other things. It’s worth noting this is operating on Apple Silicon (M2 Max). Given there are not many clear examples of this online, here’s my working file. This should work for pretty much anyone who wants to run PMS in Docker on a MacOS host.

The only changes needed are PLEX_CLAIM and replacing my host’s local IP address, 192.168.18.3, with your own. Of course, the host-side volume paths (the left side of each mapping) need to be updated as well. Other than that, the ADVERTISE_IP and ALLOWED_NETWORKS settings each list the Docker container’s address first and the host machine’s second; both are essential, especially for a validated remote connection. The only port needed is the one configured.

This works great. Media streams flawlessly, both locally and remotely.

services:
  plex:
    container_name: plex
    image: plexinc/pms-docker:latest
    restart: unless-stopped
    environment:
      - TZ=Etc/UTC
      - PLEX_CLAIM=XXXX
      - ADVERTISE_IP=https://172.19.0.2:32400/,https://192.168.18.3:32400/
      - ALLOWED_NETWORKS=172.19.0.0/16,192.168.18.0/24
    hostname: PlexServer
    ports:
      - 32400:32400/tcp
    volumes:
      - ${HOME}/.local/share/plexmediaserver/config:/config
      - ${TMPDIR}/plexmediaserver/transcode:/transcode
      - "${HOME}/Library/Mobile Documents/com~apple~CloudDocs/Media:/data"
networks:
  plex:
    driver: bridge

The problem

Some of the more observant may have taken note of the host side of the :/data volume. This is the path to iCloud Drive’s data. As long as none of the files have been “optimized” off the local device, PMS will scan all of these files and play them (their local copies) just fine.

The problem occurs when files have been “optimized.” In other words, they have either automatically been offloaded from the local device, or a user explicitly chose to use the “Remove Download” context option on a given directory or file, which “optimizes” local device storage by essentially replacing the full file with a simple empty reference that is maybe a few bytes. For the unwitting and GUI-reliant user, there’s really no distinction: the file is still there, but it’s just not occupying space on the local device. Magic. If later a user chooses to click on the file in Finder, MacOS’ deep filesystem integration with iCloud Drive will then trigger a download seamlessly behind the scenes, holding the current process until the clicked file is downloaded, at which point it then opens in its configured application. This is pretty great and works well for most native apps.

However, stack on a layer like Docker, which typically runs some GNU/Linux flavor in a container, and mount a volume that has this magical iCloud Drive functionality baked into it from the MacOS host, and that container’s OS could not care less what filesystem magic the host provides. It simply sees a bunch of inodes with various file extensions. This is totally understandable.

What’s also very understandable is that if this container running PMS then runs a typical library scan of a given directory, a reasonable expectation is that it contains a bunch of media-related files, with extensions like mkv and mp4 and maybe even avi for you old-hats out there. PMS’ scanner will see these files and eat them up, so to speak. Great. BUT, PMS does not expect those files that appear to be media files to actually be a bunch of near-zero-length byte files. PMS’ scanner can make enough assumptions about the file based on its filename, but it generally wants to know a lot more. For example, it probably wants to create a hash of the file for a number of reasons, like deduplication, fingerprinting and whatever else. It’s at this point that the PMS library scanner crashes and becomes incapable of recovering, because this sudden attempt to open up this byte-size file with no meaningful file headers inside also happens to corrupt the library DB, which then prevents the entire PMS instance from booting up ever again, until the DB is repaired or simply nuked.

I think we all understand the problem. I’m using fancy filesystem nonsense from Apple. Yeah, yeah. Get it out of your system.

But hear me out now.

A possible solution

Is it too wild an expectation that the PMS library scanner would at the very least not so catastrophically crash and burn when it opens up a zero-length file? Couldn’t it just catch this error/exception and cease attempting to build out a hash of the file, consider it an oddball, but at the very least scan the thing in with what it has, which is really just a name, year, and file extension, at minimum? Couldn’t these “placeholder” files essentially be treated the same as those files that have already been scanned in but their local device is unavailable, at which point one could choose to clear out any of these orphans, or simply re-attach the device?

I’m not asking for PMS’ library scanner to know or care about an iCloud Drive path from a Docker container (of course I would enthusiastically welcome it, though), but it should at the very least not treat these dud/placeholder/orphan files as anything more than benign; a zero-length file shouldn’t be a landmine that not only takes out your foot, but is literally a fatal blow.
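
To make the "catch it, keep the filename, move on" idea concrete, here is a minimal Python sketch. It is not how PMS works internally, which isn't public; the function names, the filename pattern, and the choice of hashing the first MiB with SHA-1 are all assumptions made for illustration.

```python
import hashlib
import re
from pathlib import Path

# Hypothetical pattern for "Title (Year)" or "Title.Year" style filenames.
NAME_YEAR = re.compile(r"^(?P<title>.+?)[ .(]+(?P<year>(?:19|20)\d{2})\)?$")


def guess_from_filename(path):
    """Derive title, year and extension from the filename alone."""
    p = Path(path)
    match = NAME_YEAR.match(p.stem)
    title = match.group("title") if match else p.stem
    year = int(match.group("year")) if match else None
    return {
        "title": title.replace(".", " ").strip(),
        "year": year,
        "ext": p.suffix.lstrip("."),
    }


def scan_entry(path, probe_bytes=1024 * 1024):
    """Hash the first chunk of a file; on any read error keep filename info only."""
    entry = guess_from_filename(path)
    try:
        with open(path, "rb") as handle:
            entry["hash"] = hashlib.sha1(handle.read(probe_bytes)).hexdigest()
        entry["readable"] = True
    except OSError as exc:
        # The file could not be opened or read: record why and carry on,
        # instead of letting the error abort the whole scan.
        entry["hash"] = None
        entry["readable"] = False
        entry["error"] = str(exc)
    return entry
```

An unreadable file still produces an entry with a title, year and extension, flagged with readable set to False, which is roughly the "orphan" treatment described above.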

Can this be fixed?

It’s been years since I used Plex, but I’ve had lifetime for what feels like a lifetime, and I recently found a need to fire it up again. If it’s possible for this error to be handled in a way that allows the PMS scanner to just skip and continue, I’d be so damn grateful.

Thanks!

Some debug log output

I don’t remember the name of the logs, but these were the highlights I came across when trying to figure out why PMS would just die and never recover. I replaced the name of the media with “Foobar Guys (2011),” which is just for demonstration purposes.

Jan 13, 2025 23:29:23.044 [281473042493496] DEBUG - Analyzing media parts for item 60 (Foobar Guys): 83
Jan 13, 2025 23:29:23.044 [281473042493496] ERROR - Exception computing file hashes: Error reading block from /data/movies/Foobar Guys (2011)/Foobar.Guys.2011.mkv

Jan 13, 2025 20:37:18.205 [2814734772553921] DEBUG - Scanner: Processing directory /data/movies/Foobar Guys (2011) (parent: yes)
Jan 13, 2025 20:37:18.206 [2814734269237441] ERROR - Exception computing file hashes: Error reading block from /data/movies/Foobar Guys (2011)/Foobar.Guys.2011.mkv
Jan 13, 2025 20:37:18.209 [2814734772553921] ERROR - SQLITE3:0xffffad780850, 11, database corruption at line 69165 of [a29f994989]
Jan 13, 2025 20:37:18.209 [2814734772553921] ERROR - SQLITE3:0xffffad780850, 11, statement aborts at 8: [select distinct media_parts.file, media_parts.size, media_parts.updated_at, media_parts.deleted_at from media_parts join media_items on media_items.id = media_parts.media_item_id where media_parts.directory_id = ? and media_items.section_location_id = ?] database disk image is malformed
Jan 13, 2025 20:37:18.209 [2814734772553921] ERROR - Exception caught determining whether we could skip 'Foobar Guys (2011)' ~ sqlite3_statement_backend::loadRS: database disk image is malformed

It’s worth noting that this is not a one-off. Given the benefits of Docker, I went through this process five times with a clean container each time, inching closer to understanding why, and then finally putting my thoughts together. I can reproduce this problem with total consistency.

BanzaiInstitute · January 15, 2025, 11:28pm

Quality report. I hope they will fix it.

pshanew · January 16, 2025, 12:51am

Yes, very nice report. I have a quick question (and some observations). Have you tested with the native PMS app for macOS to see if you experience the same behavior?

I ask because I know for a fact that it will ignore 0 byte files. And for very small files with garbage in them, it will add them to the library just as normal (though obviously no analysis can be performed; but metadata will be fetched, if available).

I just performed a couple of quick tests in a test library. The first was a 0 byte file created with touch "Foobar Guys (2011).mkv". A library scan resulted in nothing being added to the library, but also no database corruption.

The second was a 1 Kbyte file created with dd if=/dev/random of="Foobar Guys (2011).mkv" bs=1K count=1. It was added to the library (though without metadata because it’s not in Plex’s database). There were also no adverse effects with this one.

I wonder if the problem actually has anything to do with running PMS in a Docker container on macOS, or if it’s related to how the files are being presented when there are only placeholders present. Are they actually being presented as tiny files, or is the filesystem telling the requesting app the real file size? The scanner shouldn’t crash and burn in either case, not to the extent that database corruption occurs.

Just some thoughts. Also, this is the first I’ve heard of someone running PMS in a container on macOS. Does hardware accelerated transcoding work for you in that environment?

danemacmillan · January 16, 2025, 1:35am

Thanks guys!

@pshanew Since posting this, and naively expecting this problem to vacate my headspace so I can move on as a result, I jumped right back into testing other ideas.

I like those observations. Even as I was typing things out, it felt strange thinking that zero-length files could possibly get through the most basic testing. I can clearly picture a process where tonnes of zero-length files are used for development purposes. Thinking that, I started tinkering with a few other things, such as what might be happening from inside the container vs the host.

When running commands like head -c 1024 foobarguys.mkv on a file that is on the local device, it will properly show the first 1024 bytes of the file. An example of that output is just below. Regardless of this being run from the host or the container, the output is the same for a file on a local device. This is good.

EߣBBBBmatroskaBBSg
                  .MtMSIfSMSTkSvMSSkS   MSTgS
                                             *MSCpSO

When running the same command on a file that is not on the local device, in other words it has been “optimized,” the output differs from host to container.

On the host, being MacOS, the deep integration of iCloud Drive kicks in and the process is held until the file downloads, then the usual output is provided. Also good. Doing the same inside the Docker container results in different output, which is probably a helpful clue.

The output in the Docker container for the very same command shows the following:

$ head -c 1024 foobarguys.mkv
head: error reading 'foobarguys.mkv': Resource deadlock avoided

In fact, numerous commands that open up files, like tail, head, and cat show the same output: Resource deadlock avoided.
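
"Resource deadlock avoided" is the standard message for the EDEADLK errno, so an application reading such a file would see an ordinary I/O error it can catch. A minimal Python sketch of handling that case follows; read_head is a made-up helper name, and returning None for "not locally available" is just one possible convention.

```python
import errno


def read_head(path, count=1024):
    """Return the first `count` bytes, or None if the read hits EDEADLK."""
    try:
        with open(path, "rb") as handle:
            return handle.read(count)
    except OSError as exc:
        if exc.errno == errno.EDEADLK:
            # "Resource deadlock avoided": treat the file as not available locally.
            return None
        # Any other error is unexpected here, so let it propagate.
        raise
```

For a file that is present on the local device, this returns the same bytes that head -c 1024 prints.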

Based on this alone, I think my zero-length-file hypothesis goes out the window. The files are clearly something else, and combined with your own testing, zero-length files are not the cause.

Zero blocks

I tried thinking of other useful commands to get me more information about a file, so I ran a trusty stat command on the file. The output is interesting.

On a file that is local to the device, it provides output like this, both on the host and the container:

  File: foobarguys.mkv
  Size: 20511046260     Blocks: 40060640   IO Block: 4096   regular file
Device: 1,14    Inode: 125991320   Links: 1
Access: (0644/-rw-r--r--)  Uid: (  501/danemacmillan)   Gid: (   20/   staff)
Access: 2025-01-14 09:29:24.564701351 -0500
Modify: 2025-01-10 18:06:48.137549678 -0500
Change: 2025-01-10 19:07:40.439014854 -0500
 Birth: 2025-01-10 18:06:48.137549678 -0500

When running the same command on a file that has been optimized, the output is fortunately also the same on the host and in the container, so instead of returning some obscure error, the container can actually get some information about the file, even though it’s not on the local device.

  File: foobarguys.mkv
  Size: 20511046260     Blocks: 0   IO Block: 4096   regular file
Device: 1,14    Inode: 125991320   Links: 1
Access: (0644/-rw-r--r--)  Uid: (  501/danemacmillan)   Gid: (   20/   staff)
Access: 2025-01-14 09:29:24.564701351 -0500
Modify: 2025-01-10 18:06:48.137549678 -0500
Change: 2025-01-10 19:07:40.439014854 -0500
 Birth: 2025-01-10 18:06:48.137549678 -0500

In both cases, the Size is reported correctly at all times (even a basic ls -lah in the directory knows the size), but the Blocks value changes. This is where I reach the limits of my knowledge, but I believe this means the filesystem is not actually allocating any device space for the file.
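
The same Size versus Blocks comparison can be made from code without parsing stat output. This is a minimal Python sketch, assuming a Linux or macOS system where st_blocks is reported in 512-byte units; describe_allocation is a made-up helper name. As a sanity check against the local-file output above, 40060640 blocks × 512 bytes is 20511047680, which is the 20511046260-byte Size rounded up to the 4096-byte IO Block.

```python
import os


def describe_allocation(path):
    """Compare a file's apparent size with the space actually allocated to it."""
    info = os.stat(path)
    # st_blocks counts 512-byte units on Linux and macOS (not available on Windows).
    allocated = info.st_blocks * 512
    return {
        "apparent_bytes": info.st_size,
        "allocated_bytes": allocated,
        "blocks": info.st_blocks,
    }
```

For the optimized file shown above, this would report the full apparent size alongside zero allocated bytes.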

Local PMS

I haven’t installed it locally on MacOS, but I suspect this will be my next test. I’ll update when I have more. My hypothesis is that either PMS on MacOS will natively trigger the iCloud Drive download, in which case the crash and corruption won’t occur, or it will do the same thing as the container.

Does hardware accelerated transcoding work for you in that environment?

I haven’t tried this yet, but it was something I wanted to test when I got this sorted out. One thing I’ll say is that with Apple Silicon I pretty much never see the computer struggle (or make noise) ever, so it might not be as relevant on this hardware. The M2 Max would just eat up whatever’s thrown at it.

danemacmillan · January 17, 2025, 4:49pm

I think if this whole discussion could be distilled into a solution, it would have to be one that treats this type of file the same as placeholder or zero-length byte files, which @pshanew confirmed works perfectly fine. If found, capture information from just the filename alone, then move on.

Based on the information from my last post, an additional verification could be done before opening a file, much like there are probably a handful of others before attempting to get access to a file, like whether it exists at that path, has a size, is writable, etc.

TL;DR

PMS’ library scanner crashes and corrupts the database when attempting to open a sparse file, which is a file with an apparent size but no actual blocks.

The new condition

This new condition would need to verify whether the file has any blocks, or a block size.

This is because the size of a file is only its apparent size. When a file has an apparent size of, say, 24GB, but it has 0 blocks, this is a sparse file. Anyone who has ever torrented (not many 'round here, I presume) already has experience with these files: torrent clients typically allocate an apparent size to files on a disk very quickly before downloading the actual contents of the file, and they do this to avoid problems later on, like running out of disk space just as you hit 98% complete. In other words, these “empty” files are completely sparse of any content, because while they appear to have a size, in actuality they occupy zero block space on the physical disk.
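
For testing, a file with an apparent size and no written data can be created in a couple of lines. This is a minimal Python sketch; make_sparse_file is a made-up helper name, and whether the result really reports 0 blocks depends on the filesystem. Note that reads from a file created this way return zero bytes rather than an error, so it is useful for exercising a block-count check but may not trigger the same read failure seen in the container.

```python
import os


def make_sparse_file(path, apparent_size):
    """Create a file with the given apparent size without writing any data."""
    with open(path, "wb") as handle:
        # Extending the file with truncate() sets its size but writes no content.
        handle.truncate(apparent_size)
    return os.stat(path)
```

For example, make_sparse_file("Foobar Guys (2011).mkv", 24 * 1024**3) gives a file that ls -l lists as 24GB, and on filesystems that support sparse files it should occupy little or no disk space.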

While this issue stemmed originally from something seemingly unique to iCloud Drive files that have been optimized off disk, in reality they are just leveraging standard features of most filesystems by replacing optimized files with simple sparse files. It’s clever, actually.

Detecting that condition

There are a number of ways to determine if a file actually occupies any block space on the disk. Here are some I found. When a file is not sparse and actually occupies block space, that number will be non-zero.

`stat` with filters

$ stat --printf="%b\n" foobarguys.2011.mkv
0

`ls` with options

$ ls -s foobarguys.2011.mkv
0 foobarguys.2011.mkv
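
The same check can be run across a whole library directory before a scan. This is a minimal Python sketch, again assuming a Linux or macOS system where st_blocks is available; find_unallocated is a made-up helper name, and "size above zero with zero blocks" is the condition proposed in this post, not something PMS is known to check.

```python
import os


def find_unallocated(root):
    """Yield paths under root that report a size but occupy zero blocks."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            # Genuinely empty files (size 0) are excluded on purpose.
            if info.st_size > 0 and info.st_blocks == 0:
                yield path
```

Calling list(find_unallocated("/data/movies")) inside the container would list the files a scanner could skip or treat as unavailable.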

Proposing this as a solution

All this text aside, I think the problem has been clearly explained, as well as the conditions for verifying it.

It’s clear that this is not an iCloud Drive problem, given the type of file causing the problem is your run-of-the-mill sparse file, which is common to most filesystems used by GNU/Linux, Unix, and BSD.

I won’t speak to the complexity of adding this change to PMS’ library scanner, as I’ve been on the receiving end of “but you just put the thing there” too many times to count. I’m not looking to ruffle any feathers; I’m able to survive and already have workarounds in place for this situation, but damn would I feel grateful to any developer who can reproduce this and push up a fix.

Final words

I can’t imagine a scenario where PMS’ library scanner wouldn’t want to account for this, even if just for the added defenses alone, given how much demographic overlap probably exists between people who host PMS and people who might have a torrented sparse file on their disk.