/software/badblocks/

A friend mentioned that taking the word of the hard drive manufacturer's test software may not be such a hot idea. After 2 hard drives were tested with Maxtor's MaxBlast software, it gave both drives a clean bill of health. But the badblocks program (part of the e2fsprogs package) found that one drive was indeed OK while the other did have bad blocks on it. So do a check using both the manufacturer's software and (if that passes) badblocks. You can test as many hard drives at a time as you want. I would suggest just popping in a Knoppix CD and testing. Some examples and explanations of the commands are below.

Badblocks is used to search for bad blocks on a device (usually a disk partition).

The "-c" option sets how many blocks badblocks tests at a time, and larger values need more memory. You NEVER want to use all of the RAM in the system. If you do, kswapd will try to free up memory in real time and use 90% of the CPU. To make sure you don't starve the system of resources use the following calculation:

    TotalRamMB * 3
   -----------------   =  Maximum "-c" value
         32

So, if you have 1gig of ram (1024 MB)...

     1024*3
    -------  = 96 MB
       32

... you can use a 96MB count size.

Using a 96MB count will use a total of 768MB of RAM. This is less than the total amount of RAM we have in the machine, but you want to give the rest of the system adequate memory, or else your test will run out of RAM. Running out of RAM will either crash the machine, crash the badblocks test or increase your test time by a factor of 10. Not good.

Since "-b" is given in bytes (a 4K block is 4096) and "-c" is going to be 96MB, both values must be multiplied by 1024 for the badblocks command line: 4 becomes 4096 and 96 becomes 98304.
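The calculation above can be sketched as a small shell function. This is only an illustration of the rule of thumb in this article: the name max_c_blocks is made up here, and the optional second argument (number of drives tested at once) assumes the memory budget is split evenly between the drives.

```shell
#!/bin/sh
# max_c_blocks TOTAL_RAM_MB [DRIVES]
# Hypothetical helper: applies (TotalRamMB * 3 / 32), splits the result
# evenly across DRIVES (default 1) and multiplies by 1024 to get the
# number to pass to "badblocks -c".
max_c_blocks() {
    ram_mb=$1
    drives=${2:-1}
    echo $(( ram_mb * 3 * 1024 / (32 * drives) ))
}

max_c_blocks 1024      # one drive, 1 GB of RAM  -> prints 98304
max_c_blocks 1024 2    # two drives, 1 GB of RAM -> prints 49152
```

The multiplication is done before the division so that integer arithmetic in the shell does not round the result down early.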

Destructive test (one test drive with 1Gig ram) *Best Test*

badblocks -b 4096 -c 98304 -w -s /dev/hda1

NON-Destructive test (one test drive with 1Gig ram)

badblocks -b 4096 -c 98304 -s /dev/hda1

Also, if you want to test 2 drives in parallel you need to divide the max "-c" number by 2.

     1024*3
    -------  = 48 MB
       64

I would suggest putting each drive on its own IDE controller because this will speed up the testing a bit and reduce the load significantly. By putting each drive on its own chain you will not lose both drives if one drive dies in the middle of the test. You also need to keep the output of the tests segregated so you know which drive tested with or without errors. Redirecting the output to a log file is the easiest way.

Destructive test (Two test drives with 1Gig ram) *Best Test*

badblocks -b 4096 -c 49152 -w -s /dev/hda1 >> loghda1 &
badblocks -b 4096 -c 49152 -w -s /dev/hdc1 >> loghdc1 &

NON-Destructive test (two test drives with 1Gig ram)

badblocks -b 4096 -c 49152 -s /dev/hda1 >> loghda1 &
badblocks -b 4096 -c 49152 -s /dev/hdc1 >> loghdc1 &
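If you test more than a couple of drives, the same pattern can be wrapped in a loop. The sketch below is not part of badblocks itself. The function name run_parallel and the log naming (log plus the device name) are assumptions that simply mirror the commands above. It runs the read-only (non-destructive) test and waits until every background job has finished.

```shell
#!/bin/sh
# run_parallel COUNT DEVICE...
# Hypothetical wrapper: starts one read-only badblocks test per device
# in the background, each appending to its own log file, then waits
# for all of them to finish.
run_parallel() {
    count=$1
    shift
    for dev in "$@"; do
        log="log$(basename "$dev")"    # /dev/hda1 -> loghda1
        badblocks -b 4096 -c "$count" -s "$dev" >> "$log" &
    done
    wait    # returns once every background test has exited
}

# Example (commented out so nothing runs by accident):
# run_parallel 49152 /dev/hda1 /dev/hdc1
```

Only the list of bad blocks goes to the log files. The "-s" progress display is written to the terminal, so the progress lines of the drives will be mixed together on screen.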

Lastly, test time. This depends greatly on the speed of your system, the speed of the drives and how many drives you are testing. Note that drives from the same manufacturer and with the same size do not necessarily have the same number of blocks. We are testing blocks here, so a 250G drive with 40 million blocks will take less time to test than a 250G drive with 60 million blocks.

My times are:

P4 3.0GHz with PATA 133 250G 1 Drive  == 8 hours with a load of ~1
P4 3.0GHz with PATA 133 250G 2 Drives == 14 hours with a load of ~2.2

OS and Software. Use a bootable CD-ROM distribution like SystemRescue CD or Knoppix. Most of these distributions will have "badblocks" on them.

1. Boot the cd distro.
2. Verify the hard drives are found.
3. Use fdisk or parted to make a partition on each drive to be tested. One
   large partition on each disk is the best for the destructive test.
4. Run the badblocks line of your choice.
5. Look at the output. If you just see a line with testing and reading on it
   with done at the end then the drive is fine. If you see any numbers in the
   output, then those numbers are the addresses of the bad blocks. If you have bad
   blocks you can have the file system mark them depending on the filesystem you
   are using. RTFM for that info.
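When the output was redirected to a log file as in the two-drive examples, step 5 can be checked with a short script instead of reading the file by eye. This is a minimal sketch that assumes the log holds nothing but the bad block numbers, one per line. The function names count_bad_blocks and report_drive are invented for this example.

```shell
#!/bin/sh
# count_bad_blocks LOGFILE
# Prints how many lines of LOGFILE consist only of digits, i.e. how
# many bad block addresses were logged.
count_bad_blocks() {
    # grep -c exits non-zero when the count is 0, so ignore its status
    grep -c '^[0-9][0-9]*$' "$1" || true
}

# report_drive LOGFILE
# Prints a one-line verdict for the drive that wrote LOGFILE.
report_drive() {
    n=$(count_bad_blocks "$1")
    if [ "$n" -eq 0 ]; then
        echo "$1: no bad blocks logged"
    else
        echo "$1: $n bad block(s) logged"
    fi
}

# Example (log names as used in the two-drive commands):
# report_drive loghda1
# report_drive loghdc1
```

Because the logs are opened with ">>", remove any old log files before a new run, or results from earlier tests will be counted again.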
