- 03-02-2009 #1 Just Joined!
- Join Date
- Mar 2009
- Posts
- 1
Where is the bottleneck in copying files
I am currently in the middle of a project where I need to move about 3 million files in the 32KB to 10MB range to a new server.
I was planning to just copy these files to a USB attached hard drive or an EIDE drive installed to the IDE adapter on the existing server's motherboard.
I am able to copy a 1GB file from the RAID5 array to either the USB drive or the EIDE drive at an acceptable 30MB/sec. However, when I start copying the smaller files, my throughput drops to about 3.3MB/sec.
I have tried tar, star, rsync, cpio, and cp, and none of them gets above that 3.3MB/sec barrier to either the USB drive or the EIDE drive.
I wrote a program yesterday to copy everything from one directory to another and I didn't attempt to preserve access times, user rights, or anything else. Just readdir, open, read/write, and close. No stat call. I was still only able to get a throughput of 3.5MB/sec doing that.
Can anyone tell me where the overhead comes from when opening each file and then reading its data? I know from my 1GB test that there is definitely enough bandwidth to reach 30MB/sec, but I have to get past the per-file overhead.
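One way to narrow this down is to time each phase of the copy separately. The following is only a minimal Python sketch of the readdir/open/read/write/close loop described above, not the original program. The function name `timed_copy` is hypothetical, and the `os.fsync` call is an assumption added so that write time reflects the target device rather than the page cache.

```python
import os
import time
from collections import defaultdict


def timed_copy(src_dir, dst_dir):
    """Copy regular files from src_dir to dst_dir, timing each phase.

    Returns (seconds_per_phase, total_bytes). Hypothetical helper for
    diagnosis only; it preserves no ownership, modes, or timestamps.
    """
    timings = defaultdict(float)
    total_bytes = 0
    os.makedirs(dst_dir, exist_ok=True)

    for entry in os.scandir(src_dir):
        # May fall back to a stat call on filesystems without d_type.
        if not entry.is_file(follow_symlinks=False):
            continue

        t0 = time.perf_counter()
        fin = open(entry.path, "rb")
        t1 = time.perf_counter()
        data = fin.read()  # files here are at most ~10MB, so this fits in RAM
        fin.close()
        t2 = time.perf_counter()
        fout = open(os.path.join(dst_dir, entry.name), "wb")
        t3 = time.perf_counter()
        fout.write(data)
        fout.flush()
        os.fsync(fout.fileno())  # assumption: force data to the device
        fout.close()
        t4 = time.perf_counter()

        timings["open_src"] += t1 - t0
        timings["read"] += t2 - t1
        timings["open_dst"] += t3 - t2
        timings["write_sync"] += t4 - t3
        total_bytes += len(data)

    return dict(timings), total_bytes
```

If `open_src` and `read` dominate, the source array is spending its time seeking. If `open_dst` and `write_sync` dominate, the cost is on the target drive's side.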
Would it be of any benefit to have the main program loop read the directory and then create threads to open and copy the files?
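For reference, the threaded approach could look roughly like the Python sketch below. The function name `copy_flat_threaded` and the worker count of 8 are hypothetical choices, and whether it helps at all depends on where the time goes: parallel requests can keep a RAID array busier, but a single USB or EIDE target may still be the limit.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


def copy_flat_threaded(src_dir, dst_dir, workers=8):
    """Copy the regular files in src_dir to dst_dir using a thread pool.

    The main thread reads the directory; worker threads open and copy.
    Returns the list of copied file names. Contents only, no metadata.
    """
    os.makedirs(dst_dir, exist_ok=True)

    def copy_one(name):
        shutil.copyfile(os.path.join(src_dir, name),
                        os.path.join(dst_dir, name))
        return name

    names = [e.name for e in os.scandir(src_dir)
             if e.is_file(follow_symlinks=False)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(copy_one, names))
```

Trying a few values of `workers` (for example 1, 4, and 16) on a sample directory would show whether concurrency moves the 3.3MB/sec figure at all.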
Thanks for any direction.
Darrell
- 03-26-2009 #2 Linux Enthusiast
- Join Date
- Aug 2006
- Location
- Portsmouth, UK
- Posts
- 539
I've run into this problem in the past.
I/O performance drops as the number of files in a directory grows. Copying a single 1GB file will also always be quicker than copying 1,024 1MB files, because one big file needs far fewer operations.
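To put rough numbers on the per-file cost, a simple model treats each file as a fixed overhead plus its transfer time at full bandwidth. This Python sketch is only a back-of-the-envelope model: the function names are hypothetical, and the 0.1MB average file size in the usage note is an assumption, since the thread never states one. The 30MB/sec and 3.3MB/sec figures come from post #1.

```python
def effective_throughput(file_mb, bandwidth_mb_s, overhead_s):
    """Throughput in MB/s when each file costs a fixed overhead_s seconds
    (open, seek, metadata update) plus file_mb / bandwidth_mb_s to transfer."""
    return file_mb / (overhead_s + file_mb / bandwidth_mb_s)


def implied_overhead(file_mb, bandwidth_mb_s, observed_mb_s):
    """Per-file overhead in seconds that would explain observed_mb_s
    under the same model."""
    return file_mb / observed_mb_s - file_mb / bandwidth_mb_s
```

For example, `implied_overhead(0.1, 30.0, 3.3)` gives the per-file overhead that would account for the observed rate if the files averaged 0.1MB. Under this model, the fixed cost dominates once files get small, whatever tool does the copying.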
I didn't find a quick method to copy the files. One option may be to "dd" the drive or partition from one server to the other, as this doesn't require the OS to look up each file in the filesystem tables.
RHCE #100-015-395
Please don't PM me with questions as no reply may offend, that's what the forums are for.
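To illustrate what a dd-style copy does, here is a minimal Python sketch that streams a source to a destination in large sequential blocks without touching individual files. The function name `raw_copy` and the 4MB block size are hypothetical. In real use the paths would be block devices, the source filesystem should be unmounted or mounted read-only, and the destination must be at least as large as the source.

```python
def raw_copy(src_path, dst_path, block_size=4 * 1024 * 1024):
    """Stream src_path to dst_path in large sequential blocks, in the
    spirit of dd. Returns the number of bytes copied.

    Overwrites dst_path without asking; point it at the wrong device
    and that device's contents are gone.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:  # empty read means end of source
                break
            dst.write(block)
            copied += len(block)
    return copied
```

Because the reads are purely sequential, this kind of copy should run near the rate seen for the single 1GB file, at the price of also copying free space.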
- 03-27-2009 #3
It doesn't matter which way you do it (other than dd), because every method has to access every file. Also, are you tar'ing directly to the USB device? That won't help. The quickest way, if you have the space on the local drive, is to create the tarball on a local partition and then copy that single file across the USB link. There's no easy way out of it, unfortunately...
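The two-step approach can be sketched in Python as follows. The function name `archive_then_copy` and all paths are hypothetical, and the archive is left uncompressed on the assumption that disk I/O rather than archive size is the constraint.

```python
import os
import shutil
import tarfile


def archive_then_copy(src_dir, staging_tar, dest_dir):
    """Build an uncompressed tarball of src_dir on local disk, then copy
    that one file to dest_dir (e.g. the mounted USB drive).

    Returns the path of the copied tarball.
    """
    top = os.path.basename(os.path.normpath(src_dir))
    with tarfile.open(staging_tar, "w") as tar:  # "w" = no compression
        tar.add(src_dir, arcname=top)
    os.makedirs(dest_dir, exist_ok=True)
    return shutil.copy(staging_tar, dest_dir)
```

Building the tarball still has to read every small file once, so that step runs at the slow rate. Only the transfer to the external drive becomes one large sequential write.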

