One of my drives fell, and while some data is gone, I was able to recover most of it. However, the directory structure and filenames were lost. I still have everything indexed in Plex, though, so I thought I could just iterate over the http://localhost:32400/library/metadata/<number>/tree?X-Plex-Token=redacted API, whose response includes a file and a hash value, which would be perfect for me.
However, I couldn't figure out exactly how this hash is generated. Judging by its length it looks like SHA1, but it's probably not computed over the whole file. I only found this thread that mentions what it is, but if I hash just the first 4096 bytes, I get a completely different value, same as the guy in the original post.
If I could figure this out, it would help me a lot: I wouldn't have to manually redo my entire library…
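For reference, here is a rough sketch of how the file/hash pairs could be pulled out of that endpoint. It assumes the response is XML and that the file and hash show up as `file` and `hash` attributes on the same element; I did not rely on any particular element name, so the code just scans every element. `extract_file_hashes` and `fetch_tree` are names I made up for this sketch.

```python
import urllib.request
import xml.etree.ElementTree as ET

BASE_URL = "http://localhost:32400"  # same host/port as in the post


def extract_file_hashes(xml_text):
    """Return (file, hash) pairs from every element that carries both attributes.

    Assumption: the tree response is XML and exposes `file` and `hash`
    as attributes on the same element.
    """
    root = ET.fromstring(xml_text)
    pairs = []
    for elem in root.iter():
        file_path = elem.get("file")
        file_hash = elem.get("hash")
        if file_path and file_hash:
            pairs.append((file_path, file_hash))
    return pairs


def fetch_tree(item_id, token):
    """Download the raw tree response for one metadata id."""
    url = f"{BASE_URL}/library/metadata/{item_id}/tree?X-Plex-Token={token}"
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")
```

Calling `extract_file_hashes(fetch_tree(number, token))` for each metadata id would then give a list of original paths and hashes to match against the recovered files, once the hashing scheme is known.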
Some more research: I added some media to Plex, grabbed the hash, and wrote a quick program that reads the file one byte at a time and recalculates the hash after each byte. There was seemingly no match… So it can't simply be a plain SHA1 of the beginning of the file.
Then, to test the theory from the other post, where they claimed Plex only takes the first 4K block, I used dd on the original file to produce a file with the same beginning, just cut off after 10 MB, and the hash was different… So it can't be just that, because then the hash would have been the same.
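The byte-at-a-time check was roughly along these lines. This is a reconstruction rather than my exact program: `find_prefix_length` and the `max_bytes` cap are just illustrative, and it only tests plain SHA1 over a leading prefix of the file.

```python
import hashlib


def find_prefix_length(path, target_hex, max_bytes=1 << 20):
    """Return the smallest n where SHA1(first n bytes) equals target_hex, else None.

    Only the first `max_bytes` bytes are tried, to keep the search bounded.
    """
    target = target_hex.lower()
    h = hashlib.sha1()
    if h.hexdigest() == target:
        return 0  # hash of the empty prefix
    with open(path, "rb") as f:
        data = f.read(max_bytes)
    for n, byte in enumerate(data, start=1):
        # Feed one more byte; hexdigest() can be called repeatedly
        # without disturbing the running hash state.
        h.update(bytes([byte]))
        if h.hexdigest() == target:
            return n
    return None
```

A `None` result only rules out "SHA1 of the first n bytes" for n up to `max_bytes`; it says nothing about schemes that mix in other data.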
I did that when I wrote the post, but forgot to mention it. Thanks for pointing that out. Unfortunately the hash changes every time.
I even went as far as reading 10 MB from the beginning and 10 MB from the end of the file and hashing the two together, but unfortunately that also produced a different hash. So they must rely on something else too. I assume they also hash the duration or the file size (or both), or something along those lines.
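The head-plus-tail attempt can be sketched like this. It is only one guess at a scheme, not how Plex actually does it: `head_tail_sha1` is a made-up name, the 10 MB chunk size is just what I tried, and for files shorter than two chunks the tail is trimmed so no byte is hashed twice.

```python
import hashlib
import os


def head_tail_sha1(path, chunk=10 * 1024 * 1024):
    """SHA1 over the first `chunk` bytes followed by the last `chunk` bytes."""
    size = os.path.getsize(path)
    h = hashlib.sha1()
    with open(path, "rb") as f:
        h.update(f.read(chunk))  # head
        if size > chunk:
            # Start the tail no earlier than where the head ended,
            # so short files are not partially hashed twice.
            f.seek(max(size - chunk, chunk))
            h.update(f.read())  # tail
    return h.hexdigest()
```

Comparing `head_tail_sha1(path)` against the hash Plex reports for the same file is a quick way to rule this candidate in or out.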