> RAID Monitoring
Rob Boudrie
post Sep 7 2007, 02:25 PM
Post #1


Enlightened
*

Group: Members
Posts: 61
Joined: 10-August 04
Member No.: 14,159



I recently rented a dual Xeon server with SCSI RAID 5 (3 drives). When I got it, I was told that ThePlanet monitored the RAID for drive failures. I filed a ticket asking (figuring the techs would know more than sales) and was informed that they do not monitor it, but would attend to a RAID alarm if they noticed one while walking by the server. They suggested I use afacli to monitor the RAID array.

Do you have any documentation on how to monitor with AFACLI or, better yet, a monitoring script for a cron job? RAID 5 isn't that useful if I don't learn about a failed drive promptly so I can have it replaced before there is a second failure.
jbyers
post Sep 8 2007, 06:12 PM
Post #2


Enlightened

Group: The Planet Staff
Posts: 68
Joined: 17-March 05
From: Houston, Texas
Member No.: 16,174



I tend to recommend this script from Dell's Linux website:

http://linux.dell.com/files/aacraid/aacrai...ring_script.txt

Here's the actual documentation that comes with the Dell PERC RAID controllers:

http://support.dell.com/docs/storage/57kgr/cli/en/index.htm


--------------------
Houston DataCenter Operations, Level 2 Technician
Rob Boudrie
post Sep 8 2007, 06:17 PM
Post #3


Enlightened
*

Group: Members
Posts: 61
Joined: 10-August 04
Member No.: 14,159



I already found that script and downloaded afacli; however, from what I can see, afacli may not be the correct program for the Adaptec controller in my box. I see a "megaraid" device but no "afa" device in /proc/devices. Adaptec has RAID management tools on its website, but you need the serial number of the controller to download them.
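
For anyone repeating this check, here is a minimal sketch of listing the driver names registered in /proc/devices and reporting whether a "megaraid" or "afa" entry is present. The function name and the sample text are made up for illustration; the real file's contents depend on the kernel and the drivers loaded.

```python
def device_names(proc_devices_text):
    """Return the set of driver names listed in /proc/devices-style text."""
    names = set()
    for line in proc_devices_text.splitlines():
        parts = line.split()
        # Entries look like "<major> <name>"; section headers and blank
        # lines do not start with a number and are skipped.
        if len(parts) == 2 and parts[0].isdigit():
            names.add(parts[1])
    return names


# Illustrative sample only, NOT output captured from a real server.
SAMPLE = """Character devices:
  1 mem
  4 tty
253 megaraid

Block devices:
  8 sd
"""

if __name__ == "__main__":
    try:
        with open("/proc/devices") as fh:
            text = fh.read()
    except OSError:
        text = SAMPLE  # fall back to the sample on systems without /proc
    found = device_names(text)
    for wanted in ("megaraid", "afa"):
        print(wanted, "present" if wanted in found else "absent")
```

The check only shows which drivers have registered a device; it does not by itself identify the controller model.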

I have a ticket in to ThePlanet and I expect I'll have the info shortly after the weekend is over.

I'm a bit confused about the mixed messages regarding RAID: sales told me that ThePlanet would monitor the RAID, while the ticket support people told me they do not do any monitoring but would respond to a RAID alarm if they happened to notice one. Once again, I expect that will be straightened out shortly as well.

Planet folks - how about a "RAID user's guide" for buyers who order this feature?
James Erickson
post Sep 8 2007, 07:23 PM
Post #4


Computer Chip

Group: Admin
Posts: 731
Joined: 19-November 03
From: Dallas, Texas
Member No.: 38,683



For the most part, we don't monitor the RAID array unless it alarms in the datacenter; the alarm will then be picked up on the patrol that goes by once every few hours. Can you PM me your ticket number? I'll see what I can do to get it updated for you.


--------------------
James Erickson
Red Hat Certified Datacenter Specialist
Senior Unix Systems Engineer
The Planet Internet Services
https://orbit.theplanet.com
Rob Boudrie
post Sep 9 2007, 07:04 PM
Post #5


Enlightened
*

Group: Members
Posts: 61
Joined: 10-August 04
Member No.: 14,159



ThePlanet did a great job of assisting once it got to the correct tech.

However....

ThePlanet's handling of RAID is seriously lacking.

1. Customers are not told RAID is unmonitored or given tools to monitor it (unless they ask).

2. Customers are given incorrect information by sales. Sales told me that "we will monitor the raid for you."

3. Tech support, in my case at least, gave out factually incorrect information, telling me that AFACLI was the CLI for my RAID controller even though it is not compatible with the Adaptec controller that they (in the same email) told me I had.

4. Another tech support person installed "megarc" for me and I now have access to the controller. All I need to do now is figure out how to set up a cron job to monitor the RAID status so I will find out if a drive goes bad.
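
A cron job for this can be quite small. The following is a rough sketch only, under stated assumptions: the status command is a hypothetical placeholder (not a real tool, and not megarc's actual syntax), and the trouble keywords are guesses that need checking against what your controller's CLI really prints. The script stays silent when everything looks healthy and prints a warning otherwise.

```python
import subprocess

# ASSUMPTION: replace with the command that prints your controller's array
# status. This path is a hypothetical placeholder, not a real tool.
STATUS_COMMAND = ["/usr/local/sbin/raid-status-placeholder"]

# ASSUMPTION: words that signal trouble. The real wording depends on the
# vendor's CLI, so compare against actual output from your own system.
BAD_WORDS = ("degraded", "failed", "offline")


def find_problems(status_text, bad_words=BAD_WORDS):
    """Return the lines of status_text that contain any of bad_words."""
    hits = []
    for line in status_text.splitlines():
        lowered = line.lower()
        if any(word in lowered for word in bad_words):
            hits.append(line.strip())
    return hits


def check(command=STATUS_COMMAND):
    """Run the status command and return a list of warning strings."""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=60
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Not being able to ask the controller is itself worth reporting.
        return [f"status command could not run: {exc}"]
    warnings = find_problems(result.stdout)
    if result.returncode != 0:
        warnings.append(f"status command exited with {result.returncode}")
    return warnings


if __name__ == "__main__":
    for warning in check():
        print(f"RAID WARNING: {warning}")
```

Printing only on trouble fits cron well: cron normally mails any output a job produces to the job's owner (or to the MAILTO address in the crontab), so a quiet run sends nothing.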

RAID5 is not all that useful if there is not a mechanism to detect and replace bad drives (which is why really high end RAID systems actually submit their own service calls when a drive goes bad).
Jeff
post Sep 9 2007, 07:49 PM
Post #6


SuperGeek
****

Group: Members
Posts: 1,484
Joined: 18-November 05
From: Lake Michigan
Member No.: 18,911



I haven't dealt with sales in a while now, but I agree with you (in general, for every company) that it's very important that all the sales people know the systems they are selling inside and out. A good sales person should be curious about what type of RAID card is in the server and how it works, and should either be able to tell the customer or research it by asking a coworker and then get the customer the right information. Sales people who don't know the product they're selling are a big turnoff. I remember a couple of years ago I got someone at another company who couldn't even tell me the brand of motherboard they were using, so I moved on; no matter how much per hour that guy was being paid, it was too much, since I bet he turned away more long-term business than he signed up.

QUOTE
RAID5 is not all that useful if there is not a mechanism to detect and replace bad drives (which is why really high end RAID systems actually submit their own service calls when a drive goes bad).

It sounds like all the raid controllers have audible alarms, so when a drive fails a physical buzzer goes off that is then detected when they walk through the datacenter every few hours. So you have a few hours more risk than if you had software monitoring, but with patrols every few hours your odds would have to be very bad to have a second drive in the array fail in those few hours.
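
To put a rough number on those odds, here is a back-of-the-envelope sketch. It assumes each remaining drive fails independently at a constant rate, and the 3% annual failure rate is purely an illustrative assumption, not a measured figure for any particular drive. Drives in the same array may well fail in a correlated way, so treat the result as an optimistic estimate.

```python
import math


def second_failure_probability(remaining_drives, window_hours,
                               annual_failure_rate):
    """Probability that at least one remaining drive fails in the window.

    Assumes independent drives with a constant (exponential) failure rate.
    """
    hours_per_year = 24 * 365
    # Convert the annual failure probability into a per-hour hazard rate.
    rate_per_hour = -math.log(1.0 - annual_failure_rate) / hours_per_year
    return 1.0 - math.exp(-remaining_drives * rate_per_hour * window_hours)


if __name__ == "__main__":
    # A 3-drive RAID 5 array has 2 drives left after the first failure.
    # Compare a few-hour window with a failure unnoticed for three days.
    for hours in (4, 72):
        p = second_failure_probability(2, hours, 0.03)  # 0.03 is assumed
        print(f"{hours:>3} h unnoticed: {p:.6f}")
```

Under these assumptions the risk grows roughly in proportion to how long the failure goes unnoticed, which is why the detection delay matters more than the exact failure rate.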


eth00
post Sep 9 2007, 08:20 PM
Post #7


SuperGeek
****

Group: Members
Posts: 4,856
Joined: 23-May 03
Member No.: 7,754



QUOTE (Jeff @ Sep 9 2007, 09:49 PM)
It sounds like all the raid controllers have audible alarms, so when a drive fails a physical buzzer goes off that is then detected when they walk through the datacenter every few hours. So you have a few hours more risk than if you had software monitoring, but with patrols every few hours your odds would have to be very bad to have a second drive in the array fail in those few hours.


I don't know which cards do or do not have a buzzer, but I know that not ALL do. I have had drives fail, and even though a drive had been failed for a few days, EV1/TP never contacted us.

On the flip side, we have also had a ticket created because the machine was making some beeping.

I don't know the official word on whether all or only some have alarms; maybe the times we have had it happen were anomalies.


--------------------
John W
My personal website with many free security and linux how-to's!
Tss -- Live Support! Tweaking, Securing, 24x7 Service Monitoring, Monthly Management, Migrations, Restores, Optimization, LoadBalancer Configuration, Mysql Clusters, Custom Configurations, Consulting. English And Spanish Support!
We do it all @ TotalServerSolutions
Rob Boudrie
post Sep 10 2007, 11:16 AM
Post #8


Enlightened
*

Group: Members
Posts: 61
Joined: 10-August 04
Member No.: 14,159



QUOTE
but with patrols every few hours your odds would have to be very bad to have a second drive in the array fail in those few hours


It was basically an "if we notice", not "we patrol for raid alarms". I suspect a "patrol" is just someone walking through the room looking for signs of obvious physical problems, not someone watchinzeeblinkinzeelights.
Rob Boudrie
post Sep 11 2007, 09:42 AM
Post #9


Enlightened
*

Group: Members
Posts: 61
Joined: 10-August 04
Member No.: 14,159



One of The Planet techs located a script that drives megarc and can be easily modified to do auto notifications - fantastic and competent help (just took a few iterations to get the ticket referred to the right person).
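
For readers without that script, the notification half can be sketched with Python's standard library. This is not the script the tech provided; the function names, addresses, and hostname are hypothetical, and it assumes a mail server is reachable from the box.

```python
import smtplib
from email.message import EmailMessage


def build_alert(problem_lines, sender, recipient, hostname):
    """Build an email describing the RAID problems found on hostname."""
    msg = EmailMessage()
    msg["Subject"] = f"RAID problem on {hostname}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(problem_lines))
    return msg


def send_alert(msg, smtp_host="localhost"):
    """Send the message through an SMTP server.

    ASSUMPTION: a mail server accepts connections on smtp_host, port 25.
    """
    with smtplib.SMTP(smtp_host) as server:
        server.send_message(msg)


if __name__ == "__main__":
    # Hypothetical values for illustration; nothing is sent here.
    alert = build_alert(
        ["Physical Drive 2: FAILED"],
        sender="root@example.com",
        recipient="admin@example.com",
        hostname="server1",
    )
    print(alert)
```

Keeping message construction separate from sending makes it easy to test the alert text without a mail server.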

They should provide this as "basic info" whenever someone signs up for RAID.
markcausa
post Sep 11 2007, 12:02 PM
Post #10


SuperGeek
****

Group: Members
Posts: 3,025
Joined: 8-July 06
From: Los Angeles, CA
Member No.: 22,425



Yeah, The Planet definitely needs a knowledge base incorporated into their forums or something...

PhireFast at one time had a whole separate CMS on a subdomain as our knowledge base; then we started using a sub forum on our IPB for that.

Now that we have Kayako, it's all there too.

Maybe some type of blog-like system for knowledge can be put into Orbit?


--------------------
Mark A. Mutti
W: www.phirefast.com
P: (866) 350-4456 Ext. 100
E: Mark.mutti@phirefast.com
Chris Interrante
post Sep 24 2007, 11:12 AM
Post #11


Newbie

Group: The Planet Staff
Posts: 4
Joined: 24-September 07
Member No.: 49,411



Agreed. I am working on getting a notice out to customers to assist them with monitoring the health of their RAID arrays. I have noticed that this is an area of opportunity for The Planet. Thanks for all the posts on this topic!
XGhozt
post Sep 25 2007, 06:32 AM
Post #12


The Friendly Ghozt
***

Group: Members
Posts: 389
Joined: 17-April 06
From: California
Member No.: 44,490



QUOTE (markcausa @ Sep 11 2007, 11:02 AM)
Yeah, The Planet definitely needs a knowledge base incorporated into their forums or something...

PhireFast at one time had a whole separate CMS on a subdomain as our knowledge base; then we started using a sub forum on our IPB for that.

Now that we have Kayako, it's all there too.

Maybe some type of blog-like system for knowledge can be put into Orbit?


Wasn't there a huge knowledge base a long time ago at TP? I've got tickets linking me to some of the articles.


--------------------



www.XGhozt.com
. [Because I'm Awesome] .
James Erickson
post Sep 25 2007, 07:34 AM
Post #13


Computer Chip

Group: Admin
Posts: 731
Joined: 19-November 03
From: Dallas, Texas
Member No.: 38,683



You may be referring to this one:
http://support.theplanet.com/knowledgebase/users/search.php


--------------------
James Erickson
Red Hat Certified Datacenter Specialist
Senior Unix Systems Engineer
The Planet Internet Services
https://orbit.theplanet.com
Creed3020
post Sep 25 2007, 09:27 AM
Post #14


SuperGeek
****

Group: Members
Posts: 1,004
Joined: 11-June 05
From: Toronto, Canada
Member No.: 43,162



QUOTE (jerickson @ Sep 25 2007, 07:34 AM)


Where has that been hiding? I couldn't find an external link to that anywhere on the TP website.

Going to peruse that now.


--------------------
R.I.P Insomnia365
R.I.P Cortex
ajz4221
post Sep 25 2007, 11:23 PM
Post #15


Computer Chip
***

Group: Members
Posts: 813
Joined: 19-July 05
Member No.: 43,347



QUOTE (jerickson @ Sep 25 2007, 08:34 AM)


They started that a while back, but the link "disappeared" one day when everything started changing fast and often (mainly the site design).
(It may still be in Orbit and I just haven't seen it...)
And of course, I didn't save a link back then.
alden
post Jan 15 2008, 12:09 PM
Post #16


Enlightened
*

Group: Members
Posts: 79
Joined: 27-March 07
Member No.: 47,903



I just ran across this thread, as I was curious as to how my RAID array is being monitored. I thought it might be nice to have a belt-and-suspenders approach to monitoring, in that I would be notified along with TP techs when a drive failed.

I submitted a ticket and found that TP still does "monitoring by walking around"! I was told that I could "install a third-party monitoring tool". But of course, I don't know what my hw raid controller is! I'm willing to dig a little, but I don't know where to start.

A kick in the right direction would be appreciated!
Tomy Durden
post Jan 15 2008, 02:07 PM
Post #17


SuperGeek

Group: Admin
Posts: 1,268
Joined: 18-May 07
From: Dallas, Tx
Member No.: 48,459



One of the DC Ops techs in each data center does a walk of the floor every four hours. On Dells, a failed array will usually trigger an audible and visual alert. Once the alert is noticed, a ticket is created for it, in which we request permission to investigate.
Most of our customers on Dell servers use Dell's OpenManage to monitor the RAID. You can set it to pop off an email upon a degradation. IPAlert doesn't have the capability of monitoring RAID status, or at least I haven't found it yet.


--------------------
Tomy Durden
Manager - Office of Change Management
Austin P
post Jan 24 2008, 06:42 PM
Post #18


Celery

Group: The Planet Staff
Posts: 28
Joined: 21-November 07
From: Houston,TX
Member No.: 49,726



QUOTE (TP-TDurden @ Jan 15 2008, 02:07 PM)
Most of our customers on Dell servers use Dell's OpenManage to monitor the RAID. You can set it to pop off an email upon a degradation. IPAlert doesn't have the capability of monitoring RAID status, or at least I haven't found it yet.

Anything that will support 3-4 drives is going to be a Dell chassis, so you can bet it will make a loud alarm that will be caught on prem check. As I understand it, we will install OpenManage for free for the customer.


--------------------
Austin Poff
Solution Sales
Direct: 281.714.3266
Check out our promotions.
Are YOU an affiliate?
CyberSEAL
post Jun 29 2008, 01:45 AM
Post #19


Master
***

Group: Members
Posts: 369
Joined: 12-March 02
Member No.: 1,620



Under "Hardware Details" in Orbit I can see I have the following RAID card:

Adaptec \ 4-channel SAS/SATA RAID PCI Express \ 3405

I have not yet been able to locate monitoring software for it. I have an inquiry in with Adaptec. In the meantime, has anyone on here found a solution?
CyberSEAL
post Jun 29 2008, 02:54 AM
Post #20


Master
***

Group: Members
Posts: 369
Joined: 12-March 02
Member No.: 1,620



UPDATED:

I'm using Adaptec Storage Manager's CLI tool, included w/ their release for Linux...woot:

http://www.adaptec.com/en-US/downloads/sto...SI+RAID+2130SLP