Is one of my drives in my RAID failing?

Discussion in 'Mac Pro' started by alphaod, Nov 4, 2009.

  1. alphaod macrumors Core

    Joined:
    Feb 9, 2008
    Location:
    NYC
    #1
    Okay, I'm looking at the logs for my RAID card and this is what it says:
    Code:
    2009-11-04 15:21:25 	Mac Pro RAID 	Complete Rebuild 	001:52:28 	 
    2009-11-04 13:28:56 	Mac Pro RAID 	Start Rebuilding 	  	 
    2009-11-04 13:28:54 	RAID Set 	Rebuild RaidSet 	  	 
    2009-11-04 13:28:54 	Enc#1 Slot#4 	Device Inserted 	  	 
    2009-11-04 13:28:52 	Enc#1 Slot#4 	Device Removed 	  	 
    2009-11-04 13:28:52 	RAID Set 	RaidSet Degraded 	  	 
    2009-11-04 13:28:50 	Mac Pro RAID 	Volume Degraded
    
    Is the 4th drive failing? Or am I being paranoid? I wasn't home when this occurred.
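One way to read the log above is to measure how long Slot#4 was actually gone. The sketch below is only an illustration: it assumes the log fields are tab-separated as pasted, and the helper names `parse` and `dropout_gap` are made up for this example. A gap of a couple of seconds with nobody at the machine would point toward a momentary drop-out rather than a drive being physically pulled, though it cannot prove the cause.

```python
from datetime import datetime

# Log lines copied from the post above; fields assumed to be tab-separated.
LOG = """\
2009-11-04 15:21:25\tMac Pro RAID\tComplete Rebuild\t001:52:28
2009-11-04 13:28:56\tMac Pro RAID\tStart Rebuilding
2009-11-04 13:28:54\tRAID Set\tRebuild RaidSet
2009-11-04 13:28:54\tEnc#1 Slot#4\tDevice Inserted
2009-11-04 13:28:52\tEnc#1 Slot#4\tDevice Removed
2009-11-04 13:28:52\tRAID Set\tRaidSet Degraded
2009-11-04 13:28:50\tMac Pro RAID\tVolume Degraded
"""

def parse(log):
    """Return (timestamp, device, event) tuples, oldest first."""
    events = []
    for line in log.strip().splitlines():
        fields = [f.strip() for f in line.split("\t")]
        when = datetime.strptime(fields[0], "%Y-%m-%d %H:%M:%S")
        events.append((when, fields[1], fields[2]))
    return sorted(events, key=lambda e: e[0])

def dropout_gap(events, device):
    """Time between a device's removal and its next insertion."""
    removed = next(t for t, dev, ev in events
                   if dev == device and ev == "Device Removed")
    inserted = next(t for t, dev, ev in events
                    if dev == device and ev == "Device Inserted" and t >= removed)
    return inserted - removed

gap = dropout_gap(parse(LOG), "Enc#1 Slot#4")
print(f"Slot#4 was missing for {gap.total_seconds():.0f} s")
```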
     
  2. nanofrog macrumors G4

    Joined:
    May 6, 2008
    #2
    Run a SMART test on it, and see what you get. It might also be worth checking whether the drive's firmware is the revision on the HDD Compatibility List, or newer. Firmware older than the listed revision can be unstable and result in drop-outs.

    Best I can do for now, with the information available.
     
  3. alphaod thread starter macrumors Core

    Joined:
    Feb 9, 2008
    Location:
    NYC
    #3
    The SMART test shows that the drive is normal. No health issues.

    The HDD is on the approved firmware list. Never had any problems until today, if it can even be called a problem. I have since checked, and the cable, sled, and everything else are secure.

    Code:
    Device Type 	SATA(5001B4D40588F013)
    Device Location 	Enclosure#1 Slot#4
    Model Name 	WDC WD3000HLFS-01G6U0
    Serial Number 	WD-WXL#########
    Firmware Rev. 	04.04V01
    Disk Capacity 	300.1GB
    Current SATA Mode 	SATA300+NCQ(Depth32)
    Supported SATA Mode 	SATA300+NCQ(Depth32)
    Device State 	Normal
    Timeout Count 	0
    Media Error Count 	0
    Device Temperature 	30 ºC
    SMART Read Error Rate 	200(51)
    SMART Spinup Time 	218(21)
    SMART Reallocation Count 	200(140)
    SMART Seek Error Rate 	200(0)
    SMART Spinup Retries 	100(0)
    SMART Calibration Retries 	100(0)
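
To compare the SMART lines above at a glance, here is a rough sketch that computes how far each attribute sits above its failure threshold. It assumes the controller prints each attribute as `value(threshold)`, which is my reading of the output rather than something stated in the thread, and `smart_margins` is a made-up helper name. By the usual SMART convention, a normalized value at or below its threshold indicates a failing attribute.

```python
import re

# SMART lines copied from the report above; format assumed: name, then value(threshold).
REPORT = """\
SMART Read Error Rate\t200(51)
SMART Spinup Time\t218(21)
SMART Reallocation Count\t200(140)
SMART Seek Error Rate\t200(0)
SMART Spinup Retries\t100(0)
SMART Calibration Retries\t100(0)
"""

PATTERN = re.compile(r"^(SMART .+?)\s+(\d+)\((\d+)\)$")

def smart_margins(report):
    """Map each attribute name to (normalized value - threshold)."""
    margins = {}
    for line in report.splitlines():
        m = PATTERN.match(line.strip())
        if m:
            margins[m.group(1)] = int(m.group(2)) - int(m.group(3))
    return margins

margins = smart_margins(REPORT)
for name, margin in sorted(margins.items(), key=lambda kv: kv[1]):
    status = "FAILING" if margin <= 0 else "ok"
    print(f"{name:28s} margin {margin:4d}  {status}")
```

Under that assumption, every attribute here is above its threshold. Saving this output for each drive gives a baseline to compare against later.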
    
     
  4. nanofrog macrumors G4

    Joined:
    May 6, 2008
    #4
    I've seen better.

    1. Back up the data.
    2. Run a test for Bad Blocks, and remap if needed.

    Also, what is the controller (IIRC, it's the RR2644, but I want to make sure)?
    What is the RAID level (0/1/10/5)?
    Are you using a UPS? And if so, what kind?
    How old are the drives?

    If it's RAID 5, change it, whether there's an actual problem or not: that card can't deal with the write hole issue associated with parity-based arrays. (You do get failure reports when there's nothing actually wrong, for example after some sort of power glitch like a low-voltage condition.)
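
For anyone unfamiliar with the write hole, here is a toy illustration using XOR parity, the scheme RAID 5 is built on. It is only a sketch: the block contents and the `xor_blocks` helper are invented for the example, and real controllers work on stripes across physical disks. The point is that if power is lost after a data block is rewritten but before the parity is updated, a later rebuild silently produces wrong data.

```python
from functools import reduce

def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe: three data blocks plus their parity (hypothetical contents).
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d0, d1, d2)

# With consistent parity, a lost block can be rebuilt from the others.
assert xor_blocks(d0, d2, parity) == d1

# Simulated power loss: the new d0 reaches the disk, the parity update does not.
d0_new = b"ZZZZ"
stale_parity = parity

# Later the drive holding d1 drops out and is rebuilt from what is on disk.
rebuilt_d1 = xor_blocks(d0_new, d2, stale_parity)
print("original d1:", d1, "rebuilt d1:", rebuilt_d1)
```

The rebuilt block no longer matches the original, and nothing in the array flags it. A controller that can preserve in-flight writes across a power loss avoids leaving the stripe in that half-written state.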
     
  5. alphaod thread starter macrumors Core

    Joined:
    Feb 9, 2008
    Location:
    NYC
    #5
    Yikes, that sounds like a lot of work, but I guess I'll have to do it. The SMART reports for all 4 of my drives yield the same results (practically the only difference is the 'Spinup Time' values).

    Controller is an Areca ARC-1212 running RAID-5 with a backup battery (all of which I believe you recommended to me ;)). All drives are less than 3 months old (since that's how old the computer is). I have no UPS… but I'm working on getting a new one. Threw out my last one because the battery died.
     
  6. nanofrog macrumors G4

    Joined:
    May 6, 2008
    #6
    I'd forgotten you ended up with the Areca. But it's a proper card with an NVRAM solution, so nothing to panic about by any means.

    It was likely a power glitch, and a UPS can help with this substantially. Units that always run off the batteries are better to have, as there's no switching involved. With the switch-type units, the changeover to battery during a low wall-voltage condition is itself seen as a short brown-out. But as cost is always a factor, what you get is up to you. Having one isn't really optional when using RAID, though.

    Also, it's good practice to run full SMART tests and check for bad blocks on ALL disks before creating the array. It can help you locate a suspect drive before you waste any time on it, and the data can be used as a reference later on as well, especially if errors start piling up in a short period of time. It really does save you worry, and in some cases, major headaches.

    You can get flaky drives that have never been used. Just look at Newegg's DOA rate (e.g. poor packaging in some cases, or the shipping gorillas at UPS getting hold of it).
     
