Full Version: Logrotation killing server + causing h/d failure msgs
The Planet Forums > Operating Systems > Red Hat Linux
ramstar
Every night at 4am during log rotation on my Ensim Red Hat server, httpd dies and my monitor script restarts it right away. When it comes back up, and only during these restarts, I get the error messages associated with hard drive failure:

secure kernel: end_request: I/O error, dev 03:03 (hda), sector 40689720
secure kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
secure kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=42987035, sector=40689728
secure kernel: end_request: I/O error, dev 03:03 (hda), sector 40689728
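
To see whether the same sectors keep failing night after night, the log lines above can be boiled down to a list of distinct sectors. This is only a rough sketch: the function name is made up for illustration, and /var/log/messages as the place these kernel lines end up is an assumption about this box.

```shell
#!/bin/sh
# failing_sectors LOGFILE: print each distinct sector named in an
# "end_request: I/O error" line, prefixed by how often it was reported.
# (Hypothetical helper name; not part of any standard tool.)
failing_sectors() {
    grep 'end_request: I/O error' "$1" \
        | sed -n 's/.*sector \([0-9][0-9]*\).*/\1/p' \
        | sort -n | uniq -c
}

# Demonstration using the four lines quoted in this post.
sample=$(mktemp)
cat > "$sample" <<'EOF'
secure kernel: end_request: I/O error, dev 03:03 (hda), sector 40689720
secure kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
secure kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=42987035, sector=40689728
secure kernel: end_request: I/O error, dev 03:03 (hda), sector 40689728
EOF
failing_sectors "$sample"
```

On the real server the call would be something like failing_sectors /var/log/messages (path assumed). The same sectors showing up on different nights would suggest a fixed bad spot on the disk rather than a load problem.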

My questions: can someone please help me make my log files smaller, rotate more often, or disappear entirely? I don't care about logging, I just need this to stop happening. I would also like to know whether these hard drive failure notices are as serious as I think they are, considering they appear only during restarts of httpd, last a few minutes, and then stop. I have 40 sites on this server, and none of my users know the logins for their domain registrars, so they kind of go down with the ship if we can't figure this out.

Thanks in advance. I'm sure you guys can give me some 3-part code to just erase my log files and be done with this. Otherwise, if this hard drive dies, there go 40 hosting customers, lord have mercy. I'm going to the beach.
eth00
QUOTE (ramstar @ Mar 6 2007, 06:56 AM)
Thanks in advance. I'm sure you guys can give me some 3-part code to just erase my log files and be done with this. Otherwise, if this hard drive dies, there go 40 hosting customers, lord have mercy. I'm going to the beach.


May I suggest packing your bags now then?


You would have to run a few tests to be sure, but almost certainly some piece of hardware is going out. You may be lucky and it is only the IDE cable, but it is probably the drive. If you do not have backups, MAKE THEM NOW. You could link those files to /dev/null, but if logrotate is killing your server, chances are the drive is on its last legs. You really need to do something about this soon to avoid losing the data.
ramstar
I'm willing to erase the logs, I just need info on how. I don't think it's that serious, since it never shows that issue at any other time: only about 3 messages, and only at httpd restart. Is there anything else I can run to check the disk without unmounting active sites? Also, I don't want to alert The Planet techs to this yet, since by default they are brainwashed to auto-reply to all tickets with "you need to do a restore".

Suggestions on fixing this here server would be appreciated. I've had this thing since RackShack opened, lol.
eth00
A drive test:
badblocks -v -v -v /dev/hda

If you get even one bad block, that is enough to get the drive replaced, which will entail a restore.

I would REALLY not suggest it, but if you really want to disable logging: stop the syslog service, remove the logwatch cron job, and in httpd.conf set the log file for each site to /dev/null. Alternatively, you can just do ln -s /dev/null /some/log/file
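
A minimal sketch of the symlink approach, in case it helps. The helper name and the scratch file are made up for illustration; the one real difference from the bare command is -f, since a plain ln -s refuses to overwrite a log file that already exists.

```shell
#!/bin/sh
# null_log PATH: replace PATH with a symlink to /dev/null.
# -f removes an existing file at PATH first; without it, ln -s fails
# when the log file is already there. (Hypothetical helper name.)
null_log() {
    ln -sf /dev/null "$1"
}

# Demonstration on a scratch file, not on a real log.
scratch_dir=$(mktemp -d)
demo_log="$scratch_dir/access_log"
echo "old log data" > "$demo_log"
null_log "$demo_log"
ls -l "$demo_log"
```

httpd keeps its log files open, so it would need a restart (or a graceful restart) before it starts writing through the new link.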

I would really suggest just running badblocks though. Oh yeah, and a backup never hurts :)
gbock
Logging to /dev/null will work but if you don't want to log simply remove the log directives as needed in httpd.conf. Logging to /dev/null still incurs a performance hit even though it is not writing to a growing log file.
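
For the original request (smaller logs, rotated more often), a middle ground short of turning logging off is a tighter logrotate stanza. The sketch below only writes a sample stanza to a scratch file. The log path, the retention count and the helper name are all assumptions, since an Ensim box may keep per-site logs elsewhere, and none of this does anything about the I/O errors themselves.

```shell
#!/bin/sh
# write_rotate_conf DEST: write a sample logrotate stanza to DEST.
# The log path and the numbers below are assumptions; adjust them to
# wherever the httpd logs really live. (Hypothetical helper name.)
write_rotate_conf() {
    cat > "$1" <<'EOF'
/var/log/httpd/*log {
    daily
    rotate 3
    compress
    missingok
    notifempty
    sharedscripts
    postrotate
        /usr/sbin/apachectl graceful > /dev/null 2>&1 || true
    endscript
}
EOF
}

# Write to a scratch file rather than /etc/logrotate.d, then show it.
conf=$(mktemp)
write_rotate_conf "$conf"
cat "$conf"
```

Running logrotate -d against the scratch file does a dry run (debug mode, nothing is changed), which is a cheap way to check the stanza before putting it anywhere real. The graceful restart in postrotate is a design choice here, meant to avoid a hard stop of httpd during rotation.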
ramstar
Well, I ran badblocks. It's been running for about an hour; here are some of the numbers it churned up:

[root@secure sbin]# ./badblocks -v -v -v /dev/hda
Checking for bad blocks in read-only mode
From block 0 to 78150744
Checking for bad blocks (read-only test): 858688/ 78150744
858688
858712712/ 78150744
858713713/ 78150744
858714714/ 78150744
858715715/ 78150744
214935166/ 78150744
214935177/ 78150744
214935188/ 78150744
214935199/ 78150744
21496561/ 78150744

214965966/ 78150744
214965977/ 78150744
214965988/ 78150744
214965999/ 78150744
21498489/ 78150744

It ended with that.

Does the above mean bad blocks, or is that from me mashing the Enter key in the console?
It's so odd. I looked at messages all over the server and I only get that h/d failure message when the server restarts Apache. I also noticed it's not every night, just weekends and maybe a few busy days when the server is slammed. I guess more logs = more failed h/d, lol.

Thanks for your info so far. At least I know I have someone to talk to while my server dies and all my clients flee in horror... Mmmm, Ovaltine.

Is there anything else I can try other than backup, restore, and moving sites to another server?! These sites have SSL certs and have been on this server for years; moving them is gonna be like a year's work for me. ...snore.
eth00
If you want, try re-running the test for a bit and see if you get the exact same bad blocks (each one in the list is bad). Chances are very low, but if the numbers are different it may be the cable. I would not hold your breath though; most likely it's the drive.
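
Comparing two runs by eye is error-prone when the console output gets mangled, so here is a rough sketch of doing it with files instead. It assumes each run was saved with the -o option of badblocks (for example badblocks -v -o run1.txt /dev/hda), which writes one bad block number per line; the function name and file names are made up for illustration.

```shell
#!/bin/sh
# compare_runs FILE1 FILE2: print the block numbers that appear in BOTH
# bad block lists. Both lists are sorted the same (lexical) way first,
# because comm needs its inputs sorted consistently.
# (Hypothetical helper name.)
compare_runs() {
    a=$(mktemp)
    b=$(mktemp)
    sort -u "$1" > "$a"
    sort -u "$2" > "$b"
    comm -12 "$a" "$b"
    rm -f "$a" "$b"
}

# Usage, once two saved lists exist:
#   compare_runs run1.txt run2.txt
```

Blocks that show up in both runs point at the platters; blocks that move around between runs fit the cable theory better.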

You ARE going to need a new drive for it, no real way around that. It is possible you can use disksync to do a backup, though I am not sure how well it works with older versions of Ensim like the one you have.
ramstar
I ran it again just now. The first 5 or so error blocks are the same, but the ones at the end were different. I also noticed, running top in the console just after running that command, that the load average (30.30, 29.68, 19.77) was insane. Usually my server is under .01 .01 .01. Pages wouldn't even load during my scan.

I do see other posts in the forums where people got the same messages I'm getting, kept swapping out hard drives, and ended up getting the same messages again on the new drives, rinse, repeat. I guess it happened a few times to one guy, and he ended up opening a trouble ticket, and RackShack told him to ignore it as a Duron/Celeron-related issue or something S.M.A.R.T.-related. I will take your advice and make backups of each site, but the ones with certs and years' worth of gigs of files are gonna make me shoot my foot.

This topic gave me some hope that I have at least a week or a few weeks to back my stuff up:
http://forums.theplanet.com/index.php?show...mp;hl=badblocks

Any suggestions for the smoothest, fastest way to move all these sites with the least downtime? A lot of my customers don't even know their registrar logins, so I'm kinda hurting in the simple-move department.
Can't they just image my current hard drive onto a new hard drive and call it done? Or slave this crappy drive and let me drag it all over to the new drive? Seriously, backing all this stuff up to my home PC and then re-uploading it makes me want to sell my kneecaps on eBay.
ramstar
Update:

The issue has corrected itself and no longer happens. It was just I/O errors from way too much stuff on one server. I moved my big-traffic sites off, and this server with its 50 websites is running like a diamond. (Well, an old crappy Ensim 3.11 diamond, so that's more like a lump of coal, but you know.)

Hugz to the helpful!
Jeff
QUOTE (ramstar @ Mar 7 2007, 02:41 PM)
Can't they just image my current hard drive onto a new hard drive and call it done?

This is easy to do if you have physical access to the machine, but alas, they do not offer this service. (One reason might be that a dying hard drive is likely to corrupt what's on it, so they could end up with unhappy customers if they imaged the data to a new drive and the OS still included some corrupt files.)
QUOTE
or slave this crappy drive and let me drag it all over to the other new drive?
They will do this for you.
QUOTE
Seriously, backing all this stuff up to my home PC and then re-uploading it makes me want to sell my kneecaps on eBay.

Better yet, purchase some NAS backup FTP space, tar up everything you want/need, and move the tar.gz file(s) over to the backup space, off-server. Then move them back once the new drive is installed.
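
A rough sketch of the tar step, one site at a time. The helper name and the scratch paths are made up for illustration, and the real site directories on an Ensim box will be somewhere else. Listing the archive back after creating it is a cheap check that the file is readable end to end before it leaves the server.

```shell
#!/bin/sh
# make_backup SRC_DIR ARCHIVE: tar+gzip SRC_DIR into ARCHIVE, then list
# the archive back as a basic integrity check. Returns non-zero if
# either step fails. (Hypothetical helper name.)
make_backup() {
    src="$1"
    archive="$2"
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" \
        && tar -tzf "$archive" > /dev/null
}

# Demonstration on scratch data, not on a real site.
work=$(mktemp -d)
mkdir -p "$work/site1/html"
echo "hello" > "$work/site1/html/index.html"
make_backup "$work/site1" "$work/site1.tar.gz" && echo "archive OK"
```

On a drive that is throwing I/O errors, tar may fail partway through on an unreadable file, so the exit status is worth checking per site before trusting an archive. The upload itself depends on what the backup space supports, so it is left out here.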