- 06-19-2012 #1
- Join Date
- Jun 2012
RAID5 woes, time to send to a commercial data recovery expert?
I work in an academic research lab. We recently discovered that the 5-disk RAID5 array in our main workstation was not being monitored by the cron job that was supposed to be monitoring it. We discovered it when the second drive failed in some way and the whole filesystem disappeared.
No problem, I thought. We've got everything backed up on a second external RAID. I replaced the two failed drives, and a third that smartctl revealed was having a few read errors, and built a new md0 in the workstation.
I realized that I could copy data onto the new array even while it was still recalculating parity. I started doing that, then realized that the script that had supposedly been backing up the main array to the external backup ALSO had not been working, for the last seven months. I immediately stopped the rebuild, about 10-15% of the way through I'd guess, and cursed my fate and stupidity.
I'm currently backing up all eight drives involved (the original 5, plus the 3 new ones that may now carry some useful information) using dd conv=noerror. Once I'm done with that, I'm sort of at a loss. I've been reading voraciously over the past week to try to figure out the best course of action. It seems there might be some way to get back at least some of the data that I care about. It also seems that the more I play with things, the harder it will be to recover anything. Paying for a professional recovery ($5000-$10000) is a colossal strain on the lab's budget, but is it the only way to go? Or are there other resources I'm unaware of, or things I could try without risking making things worse?
Many many thanks for any advice anybody can give.
- 06-19-2012 #2
I agree, in theory some of the data should be recoverable.
As a rule of thumb: roughly 85-90%, minus the damaged sectors of the three drives.
As a start, you could create images of the two working drives with dd
and of the three damaged ones with ddrescue.
Now you can decide what to do.
Send the images to a professional data rescue service, or try it yourself.
If you try it yourself, you should create working copies.
That means you need at least twice the combined capacity of the drives.
The original images are not to be touched; you only modify the working copies.
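To put a number on the space requirement, here is a minimal sketch of the arithmetic. The 1000 GB drive size is purely hypothetical, since the thread never states the real one, and `scratch_space_needed` is just an illustrative helper name.

```python
def scratch_space_needed(drive_sizes_gb, copies=2):
    """Space for one untouched master image plus one working copy per drive.

    copies=2 means: the original image that is never modified, plus the
    working copy that experiments are run against.
    """
    return copies * sum(drive_sizes_gb)


# Hypothetical example: five 1000 GB array members (size is an assumption).
print(scratch_space_needed([1000] * 5))  # 10000
```

If the three replacement drives are imaged as well, pass eight sizes instead of five.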
On the working copies, you could try to recreate the meta structures.
This is far from being a trivial task.
It requires solid understanding of the raid logic and structure.
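For readers unfamiliar with that logic, the core idea of RAID5 parity can be sketched in a few lines. This is only an illustration of the XOR principle on a single stripe with made-up chunk contents. It ignores everything that makes a real md array hard to reassemble by hand (chunk size, rotating parity placement, member order, superblock offsets).

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical 5-member array: four data chunks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)

# Lose any single chunk: XOR of the surviving chunks and the parity restores it.
missing = 2
survivors = [chunk for i, chunk in enumerate(data) if i != missing]
rebuilt = xor_blocks(survivors + [parity])
assert rebuilt == data[missing]
```

With two chunks of the same stripe missing, there is one equation and two unknowns, which is why the second drive failure took the whole array down.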
Personally, I would hesitate to do this, as I don't have that deep knowledge.
And *if* I did attempt it, it would probably take me several weeks.
ymmv.
You must always face the curtain with a bow.
- 06-19-2012 #3
- Join Date
- Jun 2012
Thanks, Irithori. A quick followup question:
Is ddrescue any different from dd conv=noerror? If they are different, it's not clear to me how.
- 06-19-2012 #4
The assumptions for these two tools are different.
dd expects a working device, ddrescue a broken one.
Hence ddrescue has options for that, e.g. retrying a damaged sector X times, or even indefinitely.
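The practical difference can be illustrated with a toy simulation. This is not how either tool is implemented; it is a sketch under stated assumptions. As far as I understand GNU dd, `conv=noerror` without `sync` writes nothing for an unreadable block, so everything after it shifts position in the image, while `conv=noerror,sync` pads the gap with zero bytes. The retry loop stands in for ddrescue's retry passes. The block size, disk contents, and fault table below are all made up.

```python
BLOCK = 4  # tiny block size so the effect is easy to see


def read_block(disk, bad, index, attempts=1):
    """Return the block, or None if every attempt hits a read error.

    `bad` maps a block index to how many reads fail before one succeeds.
    """
    for _ in range(attempts):
        if bad.get(index, 0) > 0:
            bad[index] -= 1
            continue
        return disk[index * BLOCK:(index + 1) * BLOCK]
    return None


def image(disk, bad, pad_on_error, attempts=1):
    """Copy the disk block by block, skipping or zero-padding unreadable blocks."""
    bad = dict(bad)  # private copy so each run sees the same faults
    out = bytearray()
    for i in range(len(disk) // BLOCK):
        block = read_block(disk, bad, i, attempts)
        if block is None:
            if pad_on_error:
                out += b"\0" * BLOCK
            continue
        out += block
    return bytes(out)


disk = b"AAAABBBBCCCCDDDD"
faults = {1: 2}  # hypothetical: block 1 fails twice, then reads fine

skipped = image(disk, faults, pad_on_error=False)             # like conv=noerror alone
padded = image(disk, faults, pad_on_error=True)               # like conv=noerror,sync
retried = image(disk, faults, pad_on_error=True, attempts=3)  # the retry idea

print(len(skipped), len(padded), retried == disk)  # 12 16 True
```

The shortened `skipped` image is the dangerous case for RAID reconstruction, because every chunk after the bad block sits at the wrong offset. It may be worth checking that each image made with plain `conv=noerror` has exactly the same size as its source drive.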