- 07-29-2010 #1 Just Joined!
- Join Date
- Jul 2010
- Posts
- 4
backing up 300,000 files
Hello all:
Long-time Linux user here.
I started a new job, and they have a Linux box (CentOS 5.3, 32-bit) with a single disk for everything (it looks like a RAID 1) that holds over 300,000 files in one directory. I am working on backups (dump to local disk, then copy the dump files to a remote server), and no other disk on a separate controller is available. If I try to count the files (ls */*/*/*|wc -l), it chokes and errors out. Any suggestions?
Alex DeWolf
San Diego, California
- 07-29-2010 #2
Code:
find /DIRECTORY -type f | wc -l
will count the files.
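The thread does not quote the actual error from the ls attempt, but a likely cause is that the shell expands */*/*/* into one enormous argument list before ls even starts, which can exceed the kernel's argument-length limit (getconf ARG_MAX reports it). find walks the tree itself and streams the names, so it avoids that limit. A minimal sketch, using a made-up temporary tree in place of the real directory:

```shell
# Hypothetical demo tree; point find at the real directory instead of "$demo".
demo=$(mktemp -d)
mkdir -p "$demo/a/b"
touch "$demo/one" "$demo/a/two" "$demo/a/b/three"

# find streams file names one by one rather than building a single
# huge argument list, so the shell never has to expand a glob.
# -print0 plus counting NUL bytes stays correct even when a file
# name contains a newline, which would fool "wc -l".
count=$(find "$demo" -type f -print0 | tr -cd '\0' | wc -c)
echo "regular files: $count"

rm -rf "$demo"
```

For ordinary file names the plain find ... | wc -l form above gives the same number; the NUL-counting variant only matters when names may contain newlines.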
On backup:
That depends on how big you plan to go.
Is rsync enough?
For a network-wide backup I usually recommend Bacula.
You must always face the curtain with a bow.
- 07-29-2010 #3 Just Joined!
- Join Date
- Jul 2010
- Posts
- 4
Got that. I am not sure of the best way to back this up with the least I/O impact on the server. This is production.
- 07-29-2010 #4
300,000 doesn't sound too scary.
My largest file count on a single server is 40 million (although spread across a directory structure, not all in one directory).
- 07-29-2010 #5 Just Joined!
- Join Date
- Jul 2010
- Posts
- 4
How did you back it up? Was it a production server where I/O was an issue? Did you use an external (USB) device (disk drive or tape drive)?
- 07-29-2010 #6 Linux Guru
- Join Date
- Nov 2007
- Posts
- 1,695
When you get into the millions-of-files volume, snapshots and "raw disk" (imaging) backups during off hours will get the best throughput.
The storage location is irrelevant as long as it can write at the speeds you require.
- 07-30-2010 #7
That is of course correct.
But depending on your hardware's capabilities, you might be able to run "regular" backups as well.
A RAID 10 over 24x 146 GB 15k hard disks with a decent controller surely performs better than a RAID 5 over 3x 750 GB 7.2k disks with a consumer-grade controller.
So, in my case, we opted for the "kill the problem with hardware" option,
just to have a consistent way of backing up our network.
Yes, this is a production server (one of two redundant ones, actually),
and yes, it is backed up during operation.
Bottom line:
If your hardware can take it, rsync or, better, Bacula are viable options for backing up these 300,000 files.
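For the "least I/O impact" concern raised earlier in the thread, one common approach is to run rsync at low CPU and I/O priority and cap its transfer rate. The following is only a sketch under assumptions: the source and destination are throwaway temp directories, the 5000 KB/s cap is an arbitrary number, and the idle I/O class set by ionice -c3 only has an effect when the kernel's I/O scheduler honors priorities (CFQ does, as far as I know).

```shell
# Hypothetical source and destination; replace with the real paths
# (the destination could also be a remote host:path).
src=$(mktemp -d)
dst=$(mktemp -d)
echo "sample" > "$src/file.txt"

# Lowest CPU priority always; idle I/O class only where ionice works.
throttle="nice -n 19"
if ionice -c3 true >/dev/null 2>&1; then
    throttle="$throttle ionice -c3"
fi

if command -v rsync >/dev/null 2>&1; then
    # -a preserves permissions, ownership and timestamps;
    # --bwlimit caps the transfer rate in KB/s (5000 is an arbitrary example).
    $throttle rsync -a --bwlimit=5000 "$src/" "$dst/"
    synced=yes
else
    echo "rsync is not installed" >&2
    synced=no
fi
```

After the first full run, later runs only transfer files that changed, although rsync still has to walk all 300,000 entries to find them.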
- 07-30-2010 #8 Just Joined!
- Join Date
- Jul 2010
- Posts
- 4
These machines are in a colocation facility and have only one disk controller each, so the one-pipe analogy applies here. If I ran the hardware, I would make sure there were two RAID controllers and one fibre controller in each.
I have been checking into rsync; I will check into Bacula.
Thanks