Salvaging hard drives with ddrescue
DISCLAIMER: Below I describe my experiences and solutions that worked in my case. If you follow the suggestions, you should understand what you are doing and how ddrescue works. If you are a tiny bit unsure and critical data is at stake, go consult a professional data recovery company. They are surprisingly cheap.
Recently I was faced with a failing hard drive. It was an external 320 GB USB drive that was only 6 months old and was always handled carefully. But now it refused to mount and S.M.A.R.T. was reporting bad sectors. And let's not get stuck with discussing timely backups...
Old drives can have some bad sectors and that is all right, but a new drive with bad sectors, or a drive with an increasing number of bad sectors, should trigger "a data evacuation". The idea is that you should read out the good parts as quickly as possible, because every time the drive spins up might be the last. For the evacuation you need a healthy drive as the target. After the data is copied, you can use data recovery software such as iRecover to dig out the files from the healthy drive.
Reading a drive with bad sectors is not easy. Every attempt to read a bad sector causes a wait due to retries and timeouts. Even if the ratio of bad sectors is minimal, you still have a Huge Number of them because modern hard drives are huge. And when you multiply a short wait time by that Huge Number, you get A Very Long Time to read through the drive. In my case, a whole night of churning the disk did not result in any measurable progress.
So the normal data moving tools, like cp or dd, do not work. What does work is a very nice open source utility called ddrescue. It is basically a version of dd that can handle partly functional drives.
For information on ddrescue, see the manual:
http://www.gnu.org/s/ddrescue/manual/ddrescue_manual.html
Before you start, go and get the latest version of ddrescue. At least with my Ubuntu, the default version was a bit old and missed some good features.
The first step is to run ddrescue with the -n flag, so that it skips over all problematic areas, making the first pass quick and easy on the failing drive. In my case it recovered 77 GB of data, which was already a lot better than nothing.
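As a rough sketch of that first pass, the helper below only prints the command line instead of running it, so it can be checked before anything touches a disk. The device names /dev/sda and /dev/sdb and the log name "logfile" are the ones used in the script further down; they are placeholders and will differ on your system. The -f flag is there because the destination is a device rather than a regular file.

```shell
#!/bin/sh
# Sketch only: build and print the first-pass command, do not execute it.
# The arguments are placeholders; verify source and destination yourself.
first_pass_cmd() {
    src=$1
    dst=$2
    log=$3
    # -n: do not split or retry the failed areas on this pass
    # -f: allow writing to an existing device as the destination
    printf 'ddrescue -f -n %s %s %s\n' "$src" "$dst" "$log"
}

first_pass_cmd /dev/sda /dev/sdb logfile
```

Once the printed line looks right, it can be copied to the prompt and run as root.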
The next part was trickier. I could not come up with a parameter combination that would have made ddrescue read through the troubled areas efficiently. Without -n, it always did some sort of linear read that eventually got stuck. However, I found that I could manually start the read from some good location and stop it when it got stuck. This way I was able to recover more data, but it was a lot of work.
Fortunately that process could be automated by writing a bit of script code. This is what I came up with and it worked very well for my case.
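#!/bin/perl
# Adjust the interpreter path above if perl lives elsewhere (commonly /usr/bin/perl).
#$status = "[*\?/]"; # try also other types of blocks
$status = "[\?]"; # "?" marks blocks that have not been tried yet
# Collect the logfile lines for blocks with the wanted status.
@array = `grep '$status' logfile`;
while (scalar(@array) > 0) {
    # Pick a random block; the first field of its line is the start offset.
    @randomline = split(/ /, $array[rand @array]);
    $cmd = "ddrescue -f -E 10000 -i " . $randomline[0] . " /dev/sda /dev/sdb logfile\n";
    print($cmd);
    system($cmd);
    # Re-read the logfile, because ddrescue has updated it.
    @array = `grep '$status' logfile`;
}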
The script uses the ddrescue logfile to come up with good candidates for where to look for readable data. It picks a random untried block and starts ddrescue from there. The key parameter is -i, which gives the start offset. Another important parameter is -E, which sets the maximum allowed error rate in bytes per second, so that ddrescue quickly gives up in a bad area and we pick another random location. A value of 10000 worked well for me, but you might want to tune it to your needs.
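The selection step on its own can also be sketched in shell. This is not the script above, just an illustration of the same idea. It assumes the logfile layout described in the ddrescue manual: comment lines starting with "#", one status line, and then lines of the form "position size status", where "?" means non-tried. The function name random_untried_offset is made up for this example.

```shell
#!/bin/sh
# Sketch: print the start offset of one randomly chosen non-tried block.
# Assumes block lines look like: "0x00100000  0x00200000  ?"
random_untried_offset() {
    awk '
        BEGIN { srand(); n = 0 }            # seed from the clock
        /^#/ { next }                       # skip comment lines
        NF == 3 && $3 == "?" { pos[n++] = $1 }
        END { if (n > 0) print pos[int(rand() * n)] }
    ' "$1"
}

# Usage (nothing is printed when no non-tried block is left):
#   random_untried_offset logfile
```

The printed offset is what would be handed to the -i option. Matching on the third field, rather than grepping the whole line, keeps comment lines and the status line out of the candidate list.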
Now I have recovered 317 GB, which is about 99% of the drive. It will probably go even higher if I keep running that script. It would be nice to hear about your experiences rescuing hard drives and ways to get the best out of ddrescue. This was my first and hopefully last serious hard drive rescue operation, so my approach is by no means an optimal one.
A couple of things I learned in the process: ALWAYS double check the source and destination devices when issuing ddrescue commands. You will be working in multiple sessions and external drives will not always have the same device names.
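One way to make that double check harder to forget is a small guard that refuses obviously wrong input before any ddrescue command is built. This is only a sketch: the function name confirm_devices is made up, and it cannot tell which physical disk is which, so comparing sizes and serial numbers by eye (for example with lsblk) is still needed.

```shell
#!/bin/sh
# Sketch: reject missing or identical source/destination names.
confirm_devices() {
    src=$1
    dst=$2
    if [ -z "$src" ] || [ -z "$dst" ]; then
        echo "usage: confirm_devices SOURCE DEST" >&2
        return 2
    fi
    if [ "$src" = "$dst" ]; then
        echo "source and destination are the same: $src" >&2
        return 1
    fi
    echo "source=$src destination=$dst"
}

# Before each session, list the attached disks and compare them by eye:
#   lsblk -o NAME,SIZE,MODEL,SERIAL
```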
The common advice is that you should open USB drive boxes and connect them directly, not via USB. I could not be bothered about this and it did not seem to matter. The screw in the drive box was not compatible with my screw drivers and a trip to hardware store would have been required. Mind you, I am a programmer, not a hardware guy...
Can you give instructions on how to run your script?
One more thing... If I left the cloned drive with 2MB of bad data, would I be able to use this drive in the system again? It came from a Windows XP box formatted in NTFS. I planned on running chkdsk /v /f on the drive after ddrescue finishes.
Thanks,
Permission denied
root@sysresccd /root/Documents % ./ddrescue_script.sh
zsh: permission denied: ./ddrescue_script.sh
Can you tell me what version of perl is required to run your script?
Thanks,
Using the script
chmod u+x ddrescue_script.sh
./ddrescue_script.sh
Shutting down the script is a bit more challenging, because it keeps starting new ddrescue processes. Just pressing Ctrl+C is not enough; instead, you need to find the script among your processes with "ps" and stop it with "kill".
It is also a good idea to run "ddrescue -n ..." after the script has been running for a while and has been splitting the bad areas. The option -n makes ddrescue try only the easy parts. Combining my script and "ddrescue -n ..." seemed to be the most efficient way to pull out data.
If you have 2MB of bad data, the drive might work quite well, or it might not. It depends on where the bad areas are. You can probably mount it, but the filesystem might look like a mess. In my case, I had 10MB of bad data and the drive was mountable, but looked mostly like a pile of crap when accessing it directly through the file explorer. I strongly recommend using a tool such as iRecover to dig out the files.
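An alternative to hunting for the process, not something the original script does, is to have the loop check for a stop file between ddrescue runs. The sketch below shows only that loop skeleton; the names stop_requested, rescue_loop and the stop file are made up for this example, and the command passed in stands for one ddrescue invocation. A real loop should also end when no untried blocks remain.

```shell
#!/bin/sh
# Sketch: a loop that ends once a stop file appears.
stop_requested() {
    [ -e "${1:-stop_rescue}" ]
}

rescue_loop() {
    stopfile=$1
    shift
    until stop_requested "$stopfile"; do
        "$@" || true    # exit status of each pass is ignored in this sketch
    done
}

# Usage idea: in one terminal
#   rescue_loop stop_rescue ./one_pass.sh
# and in another, when you want it to finish after the current pass:
#   touch stop_rescue
```

With this, Ctrl+C ends the current ddrescue run cleanly and the loop exits on its own at the next check.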
I have not used chkdsk for this, but typically you are advised not to use it when recovering drives. I would copy important files out with iRecover or other recovery tool, and then recreate the drive from scratch. If it was a system drive, I would install the OS again.
permission denied: ./ddrescue_script.sh
Thanks
#!/usr/bin
#$status = "[*\?/]"; # try also other types of blocks
$status = "[\?]";
@array=`grep $status ddrescue_logfile.log`;
while (scalar(@array) > 0) {
@randomline=split(/ /, $array[rand @array]);
$cmd = "ddrescue -f -E 10000 -i " . @randomline[0] . " /dev/sda /dev/sdb ddrescue_logfile.log\n";
print($cmd);
@array=`grep $status ddrescue_logfile.log`;
}
Seems correct
http://www.gnu.org/s/ddrescue/manual/ddrescue_manual.html
And indeed, if you leave out the system-command line as you did, then nothing will be done but only commands are echoed to the screen.
How Long did you take to copy out 320GB of data?
I would like to know how long it took you to copy out the whole HDD.
Mine is a 500GB drive. I have run it for >2 weeks, yet only <60GB has been copied...
The current rate is hovering around 50-2048 B/s.
The error count has already reached 1265.
Any way of speeding it up?
Running time
You could try running plain ddrescue again with the -n parameter. Now that the random access script has been splitting the failed areas, "ddrescue -n" can find new data to recover and will do it quickly.
If the read rate is low, you can also stop the ddrescue process with Ctrl+C, which makes it stop cleanly (takes a while). Then the script will continue iterating and starts ddrescue from some other location.
I found that there were no bombproof settings, but running ddrescue with different settings at different phases was the way to get data out. The good thing is that your already rescued data will not go anywhere and your progress is stored to the log file all the time, so you can try and experiment with different settings. If the read rate drops low and stays there for more than a minute, then it might be a good idea to kill it (Ctrl+C) and try different settings or rescue location.
Awesome
Now I'll try what you describe. Thanks in advance.
Don't use dd_rescue, use ddrescue!
Debian still insanely distributes the old dd_rescue under the package name "ddrescue", while the new GNU ddrescue is distributed as gddrescue. Don't ask me why, but it's nonsense to me.
I already used it without the -n option, I'm going to try with your script. The results are quite good, it seems.
GNU ddrescue 1.16
Press Ctrl-C to interrupt
rescued: 14834 MB, errsize: 0 B, current rate: 32440 kB/s
ipos: 14834 MB, errors: 0, average rate: 34259 kB/s
opos: 14834 MB, time since last successful read: 0 s
Finished
GNU ddrescue 1.16
Press Ctrl-C to interrupt
rescued: 86035 MB, errsize: 7379 kB, current rate: 0 B/s
ipos: 3680 MB, errors: 1621, average rate: 1119 kB/s
opos: 3680 MB, time since last successful read: 2.1 h
Finished
Unable to run the script...
I'm able to do it manually, I wonder why this happens :(
Would need more info
error message when running the script
I have a Western Digital HD that went wrong. I am trying to salvage data from it with your script. The first round with the -n switch (ddrescue -f -n /dev/sda /dev/sdb rescue.log) was short:
rescued: 59162 kB, errsize: 750 GB, current rate: 0 B/s
ipos: 59179 kB, errors: 1, average rate: 65736 B/s
opos: 59179 kB, time since last successful read: 1 s
Finished
Now I am trying the second round with the script. I used chmod, but when running it I got this message:
./ddrescue_script.sh: bad interpreter: /bin/perl: no such file or directory
What is the trouble? I use Terminal after startx to get graphical screen for System Rescue CD.
thank you
Sandor
error message when running the script - 2nd
You need to install Perl
To install Perl interpreter, do one of these, depending on your Linux distribution:
sudo aptitude install perl
sudo yum install perl
perl
Anyway, I found clues on PCB company site about my problem. It looks like an internal hardware error :-(.
Thanks for your help
Result
You can still use ddrescue's normal features for trying to read those bad blocks.
bad interpreter...
I was following your instructions
#chmod u+x ddrescue_script.sh
#./ddrescue_script.sh
The error message I got is
zsh: ./ddrescue_script.sh: bad interpreter: /usr/bin^M: no such file or directory
I found the perl in /usr/bin and I could not run
#sudo aptitude install perl
nor
#sudo yum install perl
I am quite a newbie in the gnu/linux world as many others so I need some hint.
..
You should really consult someone who is more experienced with Unix. Disk recovery with command line tools is quite advanced and I would not recommend it to newbies, at least if the contents of the disk are important.
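For what it is worth, the "^M" in the error message above is a carriage return, which usually means the script file was saved with Windows-style (CRLF) line endings. A minimal sketch for writing a cleaned copy is below; the function name strip_cr and the file names are placeholders. Note also that the interpreter line has to name the perl binary itself (for example /usr/bin/perl), not just the directory it lives in.

```shell
#!/bin/sh
# Sketch: write a copy of a file with all carriage returns removed.
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}

# Example (file names are placeholders):
#   strip_cr ddrescue_script.sh ddrescue_script_unix.sh
#   chmod u+x ddrescue_script_unix.sh
```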
ddrescue : E parameter and speed
Very useful script, but I get much better results with "-E 400".
The fact is that E is the maximum rate of errors in bytes per second.
Hence in your script, if you have areas that are read at the very low rate of 512 B/s but are read successfully, "-E 10000" does not make ddrescue skip those ultra-slow areas.
In my case, setting "-E 400" forced ddrescue to run above 10000 kB/s, as it makes it very intolerant to errors. For the next passes, it's of course possible to progressively increase the value of E.
Automating ddrescue : setting a better loop
I did not check what "$array[rand @array]" does exactly, but there is an issue that is related to ddrescue itself.
"-i offset" never works if specified offset is smaller than current ipos.
To be able to set the "-i" offset, you have to run a command like "ddrescue -i 320GB -f /dev/sdb /dev/sdc /mnt/logfile.txt", assuming here that your disk has capacity of 320GB.
On the other hand, the "-i" value is taken into account if it is higher than the current ipos.
I would like your script to make a much bigger jump forward (and only forward) when a bad area is encountered. The problem is that ddrescue stays in damaged areas where the speed is low, although I can observe the script correctly looping and performing the calls to ddrescue. The ipos does not change as much as I would like.
I also observed that using "-E" does not prevent very slow reading speeds.
I tried using "-a" instead but without success.
It is a mystery to me whether and how "-a" works.
Cannot exit script that calls ddrescue
I also tried several "Ctrl+C" before calling "ps", but it doesn't change anything.
In my case, running your script, ddrescue stayed about 12 hours in the same gigabyte (and is still there). The read speed is 0 B/s most of the time. Excessive error rates are detected, but as there are many blocks in the same gigabyte, ddrescue remains a prisoner of it.
I ran your script with "-E 400". The first pass gave very good results, but the second one did not. I'm not sure how the random part of your script works exactly.
ddrescue : what are un-trimmed blocks ?
I understand this as something that could not be read because of some failure, with the possibility to re-trim such blocks. But what is the difference from blocks that failed because of bad sectors? In which cases does the "non-trimmed" case happen?
ddrescue
"Non-trimmed" is ddrescue vocabulary for a block that contains a bad area. When trimming, ddrescue marks the bad sector as bad and the rest as non-split. Later on, it reads and splits the non-split blocks. For more info on the ddrescue algorithm, have a look at the ddrescue man page.
Follow-up
The status at that point was very good:
rescued: 320062 MB, errsize: 10281 kB, errors: 1421
Only 10 MB of data was lost. That means a recovery ratio of 0.99997.
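The ratio can be reproduced from the two figures in the status line. The sketch below assumes that ddrescue's MB and kB are decimal units (1 MB = 1000 kB) and that the drive's total size is the rescued amount plus errsize; the function name recovery_ratio is made up for this example.

```shell
#!/bin/sh
# Sketch: recovery ratio = rescued / (rescued + errsize).
# Arguments: rescued size in MB, errsize in kB (decimal units assumed).
recovery_ratio() {
    awk -v rescued_mb="$1" -v err_kb="$2" 'BEGIN {
        err_mb = err_kb / 1000
        printf "%.5f\n", rescued_mb / (rescued_mb + err_mb)
    }'
}

recovery_ratio 320062 10281    # prints 0.99997
```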