RAID weirdness: HD fails then resurrects itself!

Discussion in 'Mac Basics and Help' started by cb911, May 25, 2006.

  1. cb911 macrumors 601

    Joined:
    Mar 12, 2002
    Location:
    BrisVegas, Australia
    #1
    I was awoken by a very loud beeping at 1:00 last night. It was the RAID controller card in my PM. I had a look and saw this message...

    Wed, 24 May 2006 00:58:07 EST:
    Disk 'Maxtor 6V300F0' at Controller1-Channel4 failed.

    So I shut down right away to make sure that no other drives failed. I wanted to get the drive replaced before using the array again, because I really didn't want to risk losing any data.

    Now to add to the mystery, I decided to boot up just now and check that the drive really was bust before I took it in for warranty repair, and I see "Rebuilding 3%"? :eek: I checked the "failed" drive and its S.M.A.R.T. status checks out 100% OK, so I don't know what the problem is. :confused:

    Has anyone come across something like this before? It's really weird. Maybe it was a power spike or something? (Although I am connected through a surge-protecting power board...)

    And how do I stop that darn beeping next time? :eek:
    (Besides shutting down...)
     
  2. Eniregnat macrumors 68000

    Joined:
    Jan 22, 2003
    Location:
    In your head.
    #2
    Drives just sometimes fall out of the RAID. This is just my experience, and I am not running a Mac-based RAID.

    Why? There is almost always a reason, but it is sometimes hard to figure out. A drive can check 100% SMART OK and still be headed for failure. I trust the SMART status, but other things can cause a drive to fall out of a RAID. How are the cables? What about power (as you noted) or the computer's internal power supply? Did it die at the same time your fridge or AC kicked in? I have had a drive fail under thermal stress and never report a read or read/write error in the SMART data. It had something to do with the electronics on the drive.

    I would look at the logs, and if it is under warranty, have it checked out. At worst it is the controller card. At best it was a random occurrence.
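
    As a rough illustration of going through the logs, here is a minimal Python sketch that pulls "failed" events out of text in the two-line layout quoted in post #1 (a timestamp line followed by a "Disk ... failed." line). The layout is assumed from that single message, and the function name find_failures is hypothetical; a real controller log may be formatted differently.

```python
import re

# Assumed entry layout, copied from the message quoted in post #1:
#   Wed, 24 May 2006 00:58:07 EST:
#   Disk 'Maxtor 6V300F0' at Controller1-Channel4 failed.
STAMP = re.compile(
    r"^\s*(\w{3}, \d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2} \w+):\s*$"
)
FAILED = re.compile(
    r"^\s*Disk '(?P<model>[^']+)' at "
    r"Controller(?P<controller>\d+)-Channel(?P<channel>\d+) failed\."
)


def find_failures(log_text):
    """Return one dict per 'failed' line, tagged with the preceding timestamp."""
    failures = []
    last_stamp = None  # most recent timestamp line seen so far
    for line in log_text.splitlines():
        stamp = STAMP.match(line)
        if stamp:
            last_stamp = stamp.group(1)
            continue
        failed = FAILED.match(line)
        if failed:
            failures.append({
                "when": last_stamp,
                "model": failed.group("model"),
                "controller": int(failed.group("controller")),
                "channel": int(failed.group("channel")),
            })
    return failures


if __name__ == "__main__":
    sample = (
        "Wed, 24 May 2006 00:58:07 EST:\n"
        "Disk 'Maxtor 6V300F0' at Controller1-Channel4 failed.\n"
    )
    for event in find_failures(sample):
        print(event)
```

    If the same controller and channel keep turning up, that points at one drive, cable, or port. Timestamps that line up with other equipment switching on would point more toward power.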

    As for power, I really would get a good UPS (APC is a company I trust). I have a $3k power conditioner/UPS on order to keep our RAIDs in tip-top shape. The two 150 lb UPSs are good, but they aren't providing a true sine wave output and aren't catching the low-percentage dropouts quickly enough.
     
  3. cb911 thread starter macrumors 601

    #3
    I could understand a drive failing, that's going to happen to all drives at one point or another. But then a previously 'failed' drive going back to completely normal operation? That's the thing that's got me.
     
  4. Eniregnat macrumors 68000

    #4
    Drives sometimes just fall out of RAIDs. The fault threshold at which a RAID controller drops a drive is pretty low. If you have the logs, I would really look at them. Was it one of the middle drives?

    Generally the housing on the HDs helps conduct heat away. The plastic might be trapping heat, and without knowing what the error was, it is only a guess that it might have something to do with a breakdown in communication to the drive or in the drive's own controller.

    There is one other option: interference caused by induction of some sort. The signals on the ribbon cables can be interfered with, and the cables can also fail over time. That kind of failure would not result in a SMART error, as the information was not lost or corrupted within the drive. If the problem is internal, between the drive's internal controller and the platters, it is likely that a SMART error would have been noted. If the signal was lost or became incoherent for a time between the drive and the RAID controller, then a SMART error would not be generated. Check the cables.
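
    The reasoning above can be written down as a tiny Python sketch. It is only a rule of thumb: the function name and the wording of the results are made up for illustration, and as noted in post #2, a clean SMART report does not prove that the drive is healthy.

```python
def likely_fault_location(dropped_from_array, smart_reports_errors):
    """Rough triage following the rule of thumb in this post (not a diagnosis)."""
    if not dropped_from_array:
        return "no fault observed"
    if smart_reports_errors:
        # Trouble between the drive's own controller and the platters
        # would likely have been recorded by SMART.
        return "inside the drive"
    # The drive dropped out but SMART is clean: the signal was probably
    # lost somewhere between the drive and the RAID controller.
    return "between drive and RAID controller (cables, power, interference)"


if __name__ == "__main__":
    # The situation from post #1: the drive dropped out, SMART checks out OK.
    print(likely_fault_location(True, False))
```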

    RAIDs generate a lot of heat; mine are like blow-dryers running 24/7. Then again, they are not Apple RAIDs.
     
