Bad Blocks & IOWAIT
CyberSEAL
I've noticed high iowait on our box for the last few months and put in a ticket w/ The Planet to take a look. They ran a test to check for bad blocks and came back and told us the disk was fine.

My question is this: Do bad blocks cause iowait?
Jeff
What does the following look like?

smartctl -t long /dev/sda

smartctl -a /dev/sda
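
The full `smartctl -a` report is long, and the sector-related attributes are usually the ones worth reading first. Below is a minimal sketch that pulls them out of the attribute table. The function name `smart_sector_summary` is made up for illustration, and the attribute names are the common ones; some vendors label them differently or report a raw value that is not a plain number.

```shell
# Print the raw values of the sector-health attributes from `smartctl -A` output.
# Reads the attribute table on stdin, so it also works on saved output.
smart_sector_summary() {
    awk '
        $2 == "Reallocated_Sector_Ct" ||
        $2 == "Current_Pending_Sector" ||
        $2 == "Offline_Uncorrectable" {
            # Column 2 is the attribute name, column 10 the raw value.
            printf "%s=%s\n", $2, $10
        }
    '
}

# Example (needs root and smartmontools):
#   smartctl -A /dev/sda | smart_sector_summary
```

Non-zero and growing counts here are a stronger hint than the overall PASSED/FAILED line, though still not proof on their own.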
eth00
A badblocks test is one of the better ways to see if a disk is failing or otherwise having trouble. During the test you will usually see one of three things: a very high load, errors on the screen, or a smooth run. If your server is already busy it may already have a high load, so load alone is not always a good indication. "Ideally" on a failing drive it reports bad blocks.

You have to know what iowait is before you can really say whether bad blocks can cause it. The iowait you see in top is basically the share of time the CPU sits idle while the system waits on a response from the disk. Going on that, if a bad block on the OS disk forces the drive to re-read or skip over it, you are going to have a delay in the IO operations, which may cause iowait to rise. So the short answer is YES. A slow disk is a good indication that something is up.
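
To make the iowait figure concrete, here is a rough sketch of how a percentage like the one in top can be derived from two samples of the aggregate "cpu" line in /proc/stat. The function name `iowait_pct` is hypothetical, and the field order (user, nice, system, idle, iowait, ...) is assumed to match a reasonably modern Linux kernel.

```shell
# Percentage of CPU time spent in iowait between two samples of the
# aggregate "cpu" line from /proc/stat.
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total[NR] = 0
            # Sum user, nice, system, idle, iowait, irq, softirq, steal.
            # Guest time is already counted inside user and nice.
            for (i = 2; i <= NF && i <= 9; i++) total[NR] += $i
            iowait[NR] = $6
        }
        END {
            dt = total[2] - total[1]
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (iowait[2] - iowait[1]) / dt
        }
    '
}

# Example:
#   a=$(grep "^cpu " /proc/stat); sleep 5; b=$(grep "^cpu " /proc/stat)
#   iowait_pct "$a" "$b"
```

A high number only says the CPU was idle with disk requests outstanding; it does not say whether the disk is failing or simply busy.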

Smartctl is a useful test but not all that trustworthy: even when it says a disk is not in the best condition, the disk is often fine for years. I also do not believe a bad smartctl result alone is enough for a hardware swap.

What does

iostat

show when you have the high load? If your disk is in fact doing a lot of IO the disk may be fine and just busy.
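
One way to tell "busy" from "slow" is to look at how much of the time the device actually had I/O in flight. The sketch below computes that from two /proc/diskstats lines for the same device, which is roughly what the %util column of `iostat -x` is based on. The function name `disk_busy_pct` is made up, and it assumes the 13th field is the "milliseconds spent doing I/O" counter, as on standard Linux kernels.

```shell
# Percent of wall-clock time a device had I/O in flight between two
# /proc/diskstats lines for the same device.
#   $1, $2: diskstats lines (before, after)
#   $3:     time between the two samples, in milliseconds
disk_busy_pct() {
    before_ms=$(printf '%s\n' "$1" | awk '{ print $13 }')
    after_ms=$(printf '%s\n' "$2" | awk '{ print $13 }')
    awk -v b="$before_ms" -v a="$after_ms" -v t="$3" \
        'BEGIN { if (t <= 0) exit 1; printf "%.1f\n", 100 * (a - b) / t }'
}

# Example:
#   s1=$(grep " sda " /proc/diskstats); sleep 5; s2=$(grep " sda " /proc/diskstats)
#   disk_busy_pct "$s1" "$s2" 5000
```

A disk that is near 100% busy while moving very little data is more suspicious than one that is busy and pushing a lot of throughput.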
Tomy Durden
QUOTE (eth00 @ Aug 22 2007, 07:26 PM)

Also, check your swap usage. Might be an indicator that you need to consider more memory.
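
A quick way to check this is to read the swap counters from /proc/meminfo. The following is a small sketch; the function name `swap_used` is hypothetical, and it assumes the usual `SwapTotal:` and `SwapFree:` lines reported in kB.

```shell
# Report swap in use, in kB and as a percentage, from
# /proc/meminfo-style input on stdin.
swap_used() {
    awk '
        $1 == "SwapTotal:" { total = $2 }
        $1 == "SwapFree:"  { free = $2 }
        END {
            if (total == 0) { print "no swap configured"; exit }
            used = total - free
            printf "%d kB used of %d kB (%.1f%%)\n", used, total, 100 * used / total
        }
    '
}

# Example:
#   swap_used < /proc/meminfo
```

Swap being in use is not a problem by itself; what hurts is the box actively swapping in and out under load, which the si/so columns of vmstat show.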

As far as smart goes, it's an OK indicator. I have a 20GB drive at home that's been pending failure for 4 years now according to smart. An offline (manufacturer) test is the best indicator of whether the drive is failing, since it takes out factors such as load. I've personally authorized HDD replacements based on smartctl and iowait alone.

In any case.. back your data up!
CyberSEAL
QUOTE (eth00 @ Aug 23 2007, 12:26 AM)

The iostat command is what I've been using to determine there's an iowait issue. It gets up to 90% at times when the server is busy. I'm extremely doubtful their techs ran a badblocks test and monitored the system using the troubleshooting method you explained above. Given the response to our ticket, I'm certain they simply ran a badblocks test, and then came back and said all is well.

I appreciate the responses. I have recently disabled mailscanner and spamassassin, which were sources of high load, and replaced them w/ milters. I'll keep an eye on the box and see if that helps any. Also, our data is backed up several times a week.
Jeff
The one big (and possibly obvious, by the name) drawback of the offline test is that it takes the server totally offline for several hours in order to perform it (at least when I had one done this spring.)
James Jhurani
Well, if the system is attempting to write to or read from a bad sector, it will cause i/o wait. If badblocks is run through e2fsck -c, it will mark bad blocks as... well... bad... And the filesystem will not use those blocks. So if after a badblocks scan, the iowait is gone.. that was your problem. The only drawback is if you are experiencing high iowait, a badblocks test could take a LONG time.

As Tomy said, drive replacements have been made based solely on smartctl, as iffy as its results are... It is just better to be safe than sorry. So if you are getting bad smartctl responses, you might as well back up your data, and get the drive replaced.

good luck,
-James
jbyers
I tend to recommend checking for badblocks BEFORE creating the filesystem if possible with mke2fs -c /dev/sdbX

Optionally, you may also want to boot from a Linux LIVE CD and run a 'non destructive' read-write test with 'e2fsck -cc /dev/sdbX'
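
The reason for the live CD is that the filesystem should not be mounted while the read-write test runs. As a small safety net, a check along these lines could be done first. This is only a sketch: `is_mounted` is a made-up helper, and it only matches the exact device name in the mounts table, so a root filesystem listed as /dev/root or by UUID would be missed.

```shell
# Succeed (exit 0) if the given device appears as a mount source in a
# /proc/mounts-style file. The second argument defaults to /proc/mounts.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Example (sdbX is the placeholder from the post, not a real device name):
#   if is_mounted /dev/sdbX; then
#       echo "unmount /dev/sdbX first"
#   else
#       e2fsck -cc /dev/sdbX
#   fi
```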

I hope this helps!