Salvaging hard drives with ddrescue
DISCLAIMER: Below I describe my experiences and solutions that worked in my case. If you follow the suggestions, you should understand what you are doing and how ddrescue works. If you are a tiny bit unsure and critical data is at stake, go consult a professional data recovery company. They are surprisingly cheap.
Recently I was faced with a failing hard drive. It was an external 320 GB USB drive that was only 6 months old and had always been handled carefully. But now it refused to mount and S.M.A.R.T. was reporting bad sectors. And let's not get stuck on discussing timely backups...
Old drives can have some bad sectors and that is all right, but a new drive with bad sectors, or a drive with an increasing number of bad sectors, should trigger "a data evacuation". The idea is to read out the good parts as quickly as possible, because every time the drive spins up might be the last. For the evacuation you need a healthy drive as the target. After the data is copied, you can use data recovery software such as iRecover to dig the files out of the healthy drive.
Reading a drive with bad sectors is not easy. Every attempt to read a bad sector causes a wait due to retries and timeouts. Even if the ratio of bad sectors is minimal, you still have a Huge Number of them because modern hard drives are huge. And when you multiply a short wait time by that Huge Number, you get A Very Long Time to read through the drive. In my case, a whole night of churning the disk did not result in any measurable progress.
So the normal data moving tools, like cp or dd, do not work. What does work is a very nice open source utility called ddrescue. It is basically a version of dd that can handle partly functional drives.
For information on ddrescue, see the manual:
http://www.gnu.org/s/ddrescue/manual/ddrescue_manual.html
Before you start, go and get the latest version of ddrescue. At least on my Ubuntu, the default version was a bit old and lacked some good features.
The first step is to run ddrescue with the -n flag, so that it skips over all problematic areas, making the first pass quick and easy on the failing drive. In my case it recovered 77 GB of data, which was already a lot better than nothing.
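As a rough sketch of that first pass, assuming the failing drive is /dev/sda, the healthy target is /dev/sdb and the log is called logfile (the names used in the script in this post; yours will likely differ), the command could be assembled like this. The helper name first_pass_cmd is made up for illustration, and it only prints the command so that you can check the devices before running it yourself.

```shell
#!/bin/bash
# Assumed names; check them against your own system first.
SRC=/dev/sda   # failing drive, only read from
DST=/dev/sdb   # healthy target drive, will be overwritten
LOG=logfile    # ddrescue records its progress here

# Hypothetical helper: prints the first-pass command instead of running it.
#   -f  allow overwriting the output device
#   -n  leave the problem areas for later (only the easy parts are tried)
first_pass_cmd() {
    echo "ddrescue -f -n $SRC $DST $LOG"
}

first_pass_cmd
```

Printing the command first is a deliberate choice here: mixing up source and target would overwrite the drive you are trying to save.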
The next part was trickier. I could not come up with a parameter combination that would make ddrescue read through the troubled areas efficiently. Without -n, it always did some sort of linear read that eventually got stuck. However, I found that I could manually start the read from some good location and stop it when it got stuck. This way I was able to recover more data, but it was a lot of work.
Fortunately that process could be automated by writing a bit of script code. This is what I came up with and it worked very well for my case.
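#!/usr/bin/perl
use strict;
use warnings;

# Status characters to look for in the ddrescue logfile.
# my $status = '[*?/]';  # try also other types of blocks
my $status = '[?]';      # untried blocks only

my @array = `grep '$status' logfile`;
while (@array) {
    # Pick a random matching line; its first field is the start offset.
    my @randomline = split ' ', $array[rand @array];
    # -i sets the start offset; -E makes ddrescue give up quickly on bad areas.
    my $cmd = "ddrescue -f -E 10000 -i $randomline[0] /dev/sda /dev/sdb logfile";
    print "$cmd\n";
    system($cmd);
    @array = `grep '$status' logfile`;
}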
The script uses the ddrescue logfile to come up with good candidates for where to look for readable data. It picks a random untried block and starts ddrescue from there. The key parameter is -i, which gives the start offset. Another important parameter is -E, which defines the maximum error increase, so that ddrescue quickly gives up and we pick another random location. A value of 10000 bytes worked well for me, but you might want to tune it to your needs.
Now I have recovered 317 GB, which is about 99% of the drive. It will probably go even higher if I keep running the script. It would be nice to hear about your experiences rescuing hard drives and ways to get the best out of ddrescue. This was my first and hopefully last serious hard drive rescue operation, so my approach is by no means optimal.
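The block-picking step can also be sketched in plain shell. This is only an illustration, not the script I ran: it assumes the logfile layout where comment lines start with '#' and each block line has three fields (start offset, size, status), and it assumes shuf from GNU coreutils is installed. The function names are invented.

```shell
#!/bin/bash
# Print the start offset of every block that has the given status character.
# Assumed logfile layout: comment lines start with '#', and block lines
# have three fields: start offset, size, status.
blocks_with_status() {
    local status="$1" logfile="$2"
    awk -v s="$status" '
        /^#/ { next }                         # skip comment lines
        $2 ~ /^0x/ && $3 == s { print $1 }    # block lines only
    ' "$logfile"
}

# Pick one of those offsets at random (shuf is part of GNU coreutils).
pick_random_block() {
    blocks_with_status "$1" "$2" | shuf -n 1
}

# Example: print a candidate start offset for the next ddrescue -i run.
# pick_random_block '?' logfile
```

Requiring the second field to look like a hex number skips the current-position line near the top of the logfile, which the plain grep in the script would also match.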
A couple of things I learned in the process: ALWAYS double check the source and destination devices when issuing ddrescue commands. You will be working in multiple sessions and external drives will not always have the same device names.
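One way to make that check less error-prone is to refer to the drives through the persistent names that Linux keeps under /dev/disk/by-id, and to resolve them to the current device node just before each run. A minimal sketch, assuming a Linux system with readlink -f available; the function names and the by-id names in the usage comment are placeholders, not real device IDs.

```shell
#!/bin/bash
# Resolve a persistent name (a symlink) to the device node it points at now.
resolve_device() {
    readlink -f "$1"
}

# Succeed only if both names resolve to existing, different paths.
devices_differ() {
    local src dst
    src="$(resolve_device "$1")"
    dst="$(resolve_device "$2")"
    [ -e "$src" ] && [ -e "$dst" ] && [ "$src" != "$dst" ]
}

# Usage (placeholder names; list yours with: ls -l /dev/disk/by-id):
# devices_differ /dev/disk/by-id/FAILING_DRIVE /dev/disk/by-id/HEALTHY_DRIVE \
#     && echo "source and target are different devices"
```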
The common advice is that you should open USB drive boxes and connect the drives directly, not via USB. I could not be bothered to do this and it did not seem to matter. The screw in the drive box was not compatible with my screwdrivers and a trip to the hardware store would have been required. Mind you, I am a programmer, not a hardware guy...
Can you give instructions on how to run your script?
One more thing... If I left the cloned drive with 2MB of bad data, would I be able to use this drive in the system again? It came from a Windows XP box formatted in NTFS. I planned on running chkdsk /v /f on the drive after ddrescue finishes.
Thanks,
Permission denied
root@sysresccd /root/Documents % ./ddrescue_script.sh
zsh: permission denied: ./ddrescue_script.sh
Can you tell me what version of perl is required to run your script?
Thanks,
Using the script
chmod u+x ddrescue_script.sh
./ddrescue_script.sh
Shutting down the script is a bit more challenging, because it keeps starting new ddrescue processes. Pressing Ctrl+C only stops the current ddrescue run, so you need to find the script with "ps" and stop it with "kill".
It is also a good idea to run "ddrescue -n ..." after the script has been running for a while and splitting bad areas. The option -n makes ddrescue try only the easy parts. Combining my script and "ddrescue -n ..." seemed to be the most efficient way to pull out data.
If you have 2MB of bad data, the drive might work quite well, or it might not. It depends on where the bad areas are. You can probably mount it, but the filesystem might look like a mess. In my case, I had 10MB of bad data and the drive was mountable, but looked mostly like a pile of crap when accessed directly through the file explorer. I strongly recommend using a tool such as iRecover to dig out the files.
I have not used chkdsk for this, but typically you are advised not to use it when recovering drives. I would copy important files out with iRecover or another recovery tool, and then recreate the drive from scratch. If it was a system drive, I would install the OS again.
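A gentler alternative to hunting for the process, sketched here as an assumption rather than something the original script does, is to let the loop check for a "stop file" before each new run. The function name run_until_stopped and the file name are invented for illustration.

```shell
#!/bin/bash
# Run the given command over and over until the stop file appears.
# The check happens between runs, so the current run is never cut short.
run_until_stopped() {
    local stop_file="$1"
    shift
    while [ ! -e "$stop_file" ]; do
        "$@"
    done
}

# Usage sketch (rescue_one_block stands for whatever starts one ddrescue run):
# run_until_stopped /tmp/stop_rescue rescue_one_block
# Then, from another terminal:  touch /tmp/stop_rescue
```

With this arrangement the loop ends once the current run has finished, so no ddrescue process is killed in the middle of its work.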
permission denied: ./ddrescue_script.sh
Thanks
#!/usr/bin
#$status = "[*\?/]"; # try also other types of blocks
$status = "[\?]";
@array=`grep $status ddrescue_logfile.log`;
while (scalar(@array) > 0) {
@randomline=split(/ /, $array[rand @array]);
$cmd = "ddrescue -f -E 10000 -i " . @randomline[0] . " /dev/sda /dev/sdb ddrescue_logfile.log\n";
print($cmd);
@array=`grep $status ddrescue_logfile.log`;
}
Seems correct
http://www.gnu.org/s/ddrescue/manual/ddrescue_manual.html
And indeed, if you leave out the system command line as you did, then nothing is actually run; the commands are only echoed to the screen.
How Long did you take to copy out 320GB of data?
I would like to know how long it took you to copy out the whole HDD.
Mine is a 500GB drive. I have run it for >2 weeks, yet only <60GB has been copied...
The current rate is hovering around 50-2048 B/s.
The error count has already reached 1265.
Any way of speeding it up?
Running time
You could try running plain ddrescue again with the -n parameter. Now that the random access script has been splitting the failed areas, "ddrescue -n" can find new data to recover and will do it quickly.
If the read rate is low, you can also stop the ddrescue process with Ctrl+C, which makes it stop cleanly (this takes a while). The script will then continue iterating and start ddrescue from some other location.
I found that there were no bombproof settings, but running ddrescue with different settings at different phases was the way to get data out. The good thing is that your already rescued data will not go anywhere and your progress is stored to the log file all the time, so you can try and experiment with different settings. If the read rate drops low and stays there for more than a minute, then it might be a good idea to kill it (Ctrl+C) and try different settings or rescue location.
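If you would rather not watch the read rate by hand, a cruder stand-in is to cap how long each run may last. This sketch assumes that timeout from GNU coreutils is available. It limits wall-clock time instead of measuring the read rate, so it is not the same as the rule of thumb above. The function name and the 600-second figure are arbitrary examples.

```shell
#!/bin/bash
# Run a command, but send it SIGINT (the same signal as Ctrl+C) if it is
# still running after the given number of seconds.
run_with_limit() {
    local seconds="$1"
    shift
    timeout -s INT "$seconds" "$@"
}

# Usage sketch (device names as in the post; verify yours first):
# run_with_limit 600 ddrescue -f -n /dev/sda /dev/sdb logfile
```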
Follow-up
The status at that point was very good:
rescued: 320062 MB, errsize: 10281 kB, errors: 1421
Only about 10 MB of data was lost. That means a recovery ratio of 0.99997.
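For reference, the arithmetic behind that figure can be reproduced as below. It assumes that the MB and kB in the status line are decimal units (10^6 and 10^3 bytes) and that rescued plus errsize covers the whole drive, meaning no untried areas remain. The function name is made up.

```shell
#!/bin/bash
# Fraction of the drive recovered, given rescued size in MB and errsize in kB.
recovery_fraction() {
    awk -v rescued_mb="$1" -v err_kb="$2" 'BEGIN {
        err_mb = err_kb / 1000                 # kB to MB (decimal units)
        printf "%.5f\n", rescued_mb / (rescued_mb + err_mb)
    }'
}

recovery_fraction 320062 10281   # prints 0.99997
```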