Copying 18 gig metadata folder projected to take at least 1 day?

Server Version#:
Player Version#:

I am trying to move my data folder from C drive to an external drive, and after stopping the services the instructions tell me to copy the folder to the new location. It’s about 18 gigs, but the estimate for copying it fluctuates wildly, telling me it will take over a day because it’s literally transferring mere kilobytes of data at a time.

There is nothing wrong with either drive - copying other items takes no time at all. But this operation takes seemingly forever, and opening Properties on the copied folder takes ages to report the storage amount. Even then, it seems to take up the full 18 gigs before it has even finished writing, and under the Properties menu the “size” and “size on disk” amounts are wildly different.

Is there something preventing full copy speed here? I’m not about to spend 2 days copying a mere 18 gigs of data.

First, it sounds like you don’t understand how filesystems and copying work. I just moved a 286GB Plex metadata folder from one SSD array to another in around an hour - but those are high-performance SSDs, and even then the speed fluctuated wildly. So let’s look at what is in the metadata folder.

  • A few large multi-MB files - usually the Plex .db files. These are probably 100+ MB.

  • Tiny .xml files - these might be 100-500 bytes each.

  • Images (.jpg files) - around 20-50KB each.

If we use a car/highway analogy: imagine you have to travel 1 mile, but you have two choices. One is to get in a car and just drive the mile. The other is that you’ve got 30 cars spaced evenly along the route, and you have to get in one, drive a short distance, get out, get in the next car, drive a short distance, get out, and so on. Which do you think will go faster?

Just because you have a drive that can read/write at 100+MB/s doesn’t mean it always will. Tiny files go slow, large files go fast, and the more tiny files you have, the slower it goes. If you do an rsync with --progress, you can see this (example below): large files transfer at MB/s speeds, tiny files crawl along at single-digit KB/s. The metadata folder has thousands of TINY files, so it slows down; when it gets to a folder with larger files, it speeds up. You have to perform 3 actions on every file - open, read, close - so the more files you have, the more operations you have to do. Make sense?
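For example, a rough sketch (the paths here are made up, and rsync is a Linux/macOS tool - TeraCopy or robocopy fill a similar role on Windows):

rsync -a --progress "/mnt/source/Plex Media Server/Metadata/" "/mnt/dest/Plex Media Server/Metadata/"

Watch the per-file speed it prints: it sits in the KB/s range while it churns through the thousands of tiny .xml and .jpg files, then jumps to MB/s when it hits the big .db files.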

I had to move over a billion files once between two NFS filers in our datacenter, and these files were less than 50KB each. Even over a 10Gbps network, after 6 months of 24x7 copying I wasn’t even half done. Tiny files will never transfer at 10Gbps, only large files will.

Hope that helps. Considering you didn’t provide ANY information about your drives or system, my guess is you have a regular spinning hard drive, which is going to have terrible IOPS numbers.

2 Likes

Well, what about if you use COPYING SOFTWARE that creates many threads at a time? Would it be faster?

@evanrich’s analogy of repeatedly getting in and out of the car is a good one - I’m going to steal it for myself. Starting and stopping the car for each passenger vs. driving on the highway.

The operating system is “careful” with files, which means the quantity of files is more important than the volume of bytes to be transferred.

There are some utilities that are faster and give better progress updates. I haven’t used TeraCopy in years but it’s always a popular choice. I have more experience with rsync but it’s not as Windows-friendly.
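If you want something built into Windows, robocopy is another option, and its multi-threaded switch partly addresses the question above about running many threads. A rough sketch - the paths are made up, and the thread count is just a starting point:

robocopy "C:\Plex Media Server" "E:\Plex Media Server" /E /MT:8 /R:1 /W:1

/E copies all subfolders (including empty ones), /MT:8 uses 8 copy threads, and /R:1 /W:1 stop it from retrying endlessly on any file it can’t read. As noted further down, though, more threads won’t magically fix a drive that can only do a handful of small operations per second.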

That’s pretty normal too. It can go both directions. You’re probably encountering the first:

  • Bookkeeping for files uses space. Bookkeeping for small files can use more space than the contents of the files. Some filesystems are better than others at minimizing this.

  • Compression can cause files to use less space on disk than their “apparent” size.

USB HDDs are notoriously slow at “zillions of files” operations, too. If you’re going from an internal SSD to a USB HDD, you might find that Plex performance gets worse. This sloooow copy behavior might be an indicator that you should revisit your plan.

HDDs that can transfer 100MB/s sequentially (big files) can often only do a relative handful of small random operations a second - on the order of 100-200 IOPS.

Small USB3 SSDs are affordable and a great choice for the Plex metadata folders. They can do gobs of small-file operations every second.
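If you want to see that difference on your own drives, Windows has a built-in benchmark you can run from an elevated command prompt. A sketch, assuming the external drive is E: (adjust the letter):

winsat disk -seq -read -drive e
winsat disk -ran -read -drive e

The sequential number will look healthy on a decent HDD; the random (small-operation) number is where HDDs fall off a cliff and SSDs shine.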

You were also doing something wrong. :slight_smile:

No, because your drives can only do so many operations at once. Would spinning up 2 threads work? Probably, but you aren’t going to scale with the number of copy processes; in fact, you’ll end up slowing things down if you try too many.

Get a faster hard drive if you want the process to speed up… SSDs can do 200-1000x the IOPS of a hard drive.

1 Like

Wasn’t me - this 200TB NFS box was literally a metadata warehouse: hundreds of millions, if not billions, of 100-300 byte files, being copied between California and Texas. Absolutely the worst-case scenario for storage. We gave up and just shipped the filer to the datacenter lol.


“Never underestimate the bandwidth of a station wagon full of magnetic tapes hurtling down the highway”.

:smiley:

Re: Size on disk vs size…again, sounds like you don’t understand basic filesystem concepts.

Your drive is divided up into sections called sectors and blocks. Sectors will usually be either 512 bytes or, on newer drives, 4096 bytes, and blocks are groups of these sectors. The filesystem then allocates space in its own units (clusters) on top of that, in multiples of 4096 bytes. If you have a 4000 byte file, it will fit nicely in one of these blocks. However, if your file is 5000 bytes, it will span 2 of them. You can’t “half-use” a block, so the remainder is wasted space, and your 5000 byte file takes up 8192 bytes on disk, wasting 3192 bytes. That’s why “Size” and “Size on disk” can vary wildly. There are upsides and downsides to larger or smaller block sizes: larger blocks give faster access to large files but waste space on small files, while smaller blocks help with small files but hurt large files because you have to do more reads. NTFS on Windows defaults to 4KB clusters for most volume sizes, ext4 on Linux also defaults to 4KB, and exFAT on a big external drive typically defaults to much larger clusters (32KB or even 128KB) - on a 32KB-cluster drive, that 5000 byte file occupies a full 32KB on disk. Make sense?
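If you’re curious what your own drives actually use, Windows can tell you from an elevated command prompt. A sketch, assuming C: is NTFS and the external drive is E: (adjust the letters to match your system):

fsutil fsinfo ntfsinfo C:
fsutil fsinfo sectorinfo E:

The first command reports “Bytes Per Cluster” for an NTFS volume; the second reports the logical and physical sector sizes for whatever volume you point it at.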

I am about to go from 4TB to 8TB drives in my Plex storage system of eight (8) drives. Can someone tell me roughly how long it will take to CLONE the drives using a stand-alone (off-line) WavLink or Unitek dock? I am anxious to start this operation, but hesitate because I don’t know how long it will take for each drive. Any help would be appreciated.

An alternative method, which can be faster, is to zip up the folder, copy over the zip file, and unzip it at the new location.

1 Like

4TB drives, 8TB drives… of what? NVMe? SSD? HDD? Floppy disks? RAID 5? RAID 6? RAID 0?

If you’re talking about cloning each 4TB drive to an 8TB drive, it should go at “line” rate - whatever the sequential speed of the disk is (probably 150-200MB/s) - because you’re doing a block-level copy. You could do the same thing in Linux using dd:

dd if=/dev/sda of=/dev/sdb bs=1M

This will do a block-level copy of drive A to drive B and will be fast.
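If you want to watch the throughput while it runs, GNU dd (the version on most Linux distributions) accepts a progress flag - a sketch, using the same placeholder device names as above:

dd if=/dev/sda of=/dev/sdb bs=1M status=progress

Double-check which device is if= (the source) and which is of= (the destination) before running it; dd will happily overwrite the wrong disk.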

If you were to do

rsync -a /mnt/4TBdrive/ /mnt/8TBdrive/

it would go really slowly, because it’s a file-based copy.

Assuming your 4TB drive is full, and ignoring the transfer fall-off (sequential speed drops as the heads move toward the inner tracks of the platters), the basic math would be:

150MB/s = 1GB / ~6 seconds
1TB = ~6,000 seconds = ~100 minutes
4TB = ~400 minutes = ~6.5 hours
8 x 6.5 hours = 52 hours = ~2.1 days

If you have the slots, I would recommend building the target array and just migrating the 4TB array (assuming you have an array) to it… copying single drives one by one to larger drives is incredibly inefficient.

1 Like

Not true. You still have to open/read every tiny file to build the zip, and then read the zip and write every tiny file back out on extraction. You aren’t really going to gain much by zipping the folder; in fact, it might take even longer.

Straight copy:
Open / Read / Close -> Open / Write / Close

Zip:
Open / Read / Close / Write zip / Close -> Open zip / Read / Open / Write / Close

1 Like

Cloning drives in a hardware, block-based cloner is super fast, typically nearing the sequential transfer speed of the drives. Cloners are awesome for that purpose.

… but I have some instant questions about your plan. I see warning red flags …

Are these standalone drives, or part of a storage array of any sort? RAID, redundancy group, ZFS, etc? Are they in a NAS?

With a single drive, it’s easy to clone to a bigger drive and then expand the filesystem to use all of the space.

With members of a group/array/pool, it’s an entirely different story. I wouldn’t assume that you could clone, expand, and then re-join a group/array/pool without explicit support from your storage provider.

Yes, but since the commands are all local rather than being issued over the network, it goes a lot faster. It could probably zip up the folder in 1 to 2 hours at most, then unzip a little faster, so it would be quicker overall than waiting for the copy.

To be clear, when I say zip, I just mean creating a zip file with the store (no compression) option. Actually trying to compress these files would be slow.
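For example, a sketch with made-up paths - on Linux/macOS with Info-ZIP, or on Windows with 7-Zip installed:

zip -r -0 metadata.zip "Plex Media Server/Metadata"
7z a -mx=0 metadata.zip "C:\PlexData\Plex Media Server\Metadata"

In both cases the 0 means “store only”: the files are packed into one big archive without spending any CPU time on compression.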

1 Like

Using a .zip is sometimes amazing when you ARE going over the network, because the “small file operations are slow” truth applies to network operations.

For similar reasons as @evanrich, I would expect it to hurt, rather than help, in this case.

I’m assuming that the slowest part of this process is on the destination disk, creating tons of small files. I don’t think that adding intermediate steps and additional I/O operations (creating .zip, copying .zip, extracting .zip) will improve that.


@Mattardo, one EASY thing you can try is enabling write caching on the external drive.

Windows default media removal policy - Windows Client Management | Microsoft Learn

https://www.lacie.com/support/kb/how-to-improve-performance-of-an-external-drive-in-windows/

https://www.tenforums.com/tutorials/21904-enable-disable-disk-write-caching-windows-10-a.html

Windows has made this slower-by-default recently, and maybe Server has always been slow-by-default.


Edit: Something else we didn’t ask was: if you copy a single larger file to the drive, does it go “fast” as expected? If not, perhaps there’s an underlying hardware issue.
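A quick way to test that on Windows - a sketch, with a made-up path and size (this creates a zero-filled 1GB file):

fsutil file createnew C:\temp\bigtest.bin 1073741824

Then copy bigtest.bin to the external drive in File Explorer and watch the transfer speed graph. If that single large file sustains something near the drive’s rated sequential speed, the hardware is fine and the slowness really is small-file overhead.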

If you’re talking about something like rsync compression, that only really helps with text files. The one benefit of zipping would be that you copy one large file somewhere else, but whatever you gain in transfer speed you’d probably lose in the extra time to zip on the front end and then extract on the remote end.

Well Good Lord that’s a lot of replies!

Thank you for the quick education - I had no idea that the smaller file sizes would contribute to the slow speed of the operation.

Teracopy was mentioned in a post as possibly being faster - but that’s one of the programs I was using, and it was giving me an estimate of days.

An SSD would be faster, and I may pick one up eventually to see if that improves the matter: both in copying speeds and database access. The assumption is correct: I’m using an SSD for C and a regular old HDD for the external.

I know the ZIP method was mentioned with some pros and cons given. A network was mentioned as being problematic, but I’m not transferring any of this over a network - strictly on the same PC, so I don’t know if that helps the matter.

The Write Cache option was not enabled - I will try it. I assume since the drive is exFAT there might be less corruption in case the drive suddenly loses power (as opposed to NTFS)?

Lots of good information in this thread. I appreciate it!

Large files copy over just fine - there seem to be no issues with the health or speed of the drive.

1 Like

No - NTFS is fundamentally a more reliable filesystem. It uses something called “journaling” so that partially-completed writes don’t corrupt the filesystem, in the case of a sudden disconnect or power loss.

exFAT can be trivially faster for some (but not all) operations, and it’s slightly more compatible with weird/old/crappy systems - though that compatibility gap has shrunk significantly in the last 5-10 years.

I would use NTFS unless I had a specific known compatibility reason to use ExFat.
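If you do switch the external drive to NTFS, remember that reformatting erases everything on it, so move the data off first. A sketch, assuming the external drive is E: (double-check the letter before running it):

format E: /FS:NTFS /Q

/FS:NTFS sets the filesystem and /Q performs a quick format; all existing data on E: is lost.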

All of this is why the current metadata folder/file structure is WACK.

Store all that crap in a DATABASE~~~~~

Or in a virtualized/compressed zip or image file - anything other than relying on 39847389473289473298479 tiny files and folders in a local file system.

(this complaint is to plex, not users)