
big_malk

macrumors 6502a
Original poster
Aug 7, 2005
557
1
Scotland
I'm getting a lot of short glitches on my Mac Pro, during which iTunes might pause for a few seconds, text stops appearing as I type, beach balls etc.
I've looked in the console as these happen, and things like this appear a lot just after the hang is over.
02/10/2009 15:10:18 kernel AppleRAID::completeRAIDRequest - error 0xe00002ca detected for set "RAID Set" (8AFBEB6D-BFB4-487D-91FD-776E5B71C067), member 053DE1B9-D869-4845-BABA-386A855E96DF, set byte offset = 108442537984.
02/10/2009 15:10:18 kernel disk4: I/O error.

My disk "RAID Set" is a two-disk striped RAID array in Bays 3 and 4.
Disk Utility says both disks' SMART status is verified, but I've heard that doesn't always mean there aren't any problems. I started getting this problem while playing videos, but it seems to be happening a lot more often now, even during times of less demand on the disk, which is worrying :/
Updating my backup as I type this!
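
For anyone wanting to pull the useful fields out of a kernel line like the one quoted above, here is a minimal Python sketch. It assumes the message keeps exactly the layout shown in this post (other OS X versions may word it differently), and the function name `parse_raid_error` is made up for illustration. The member UUID is the part that says which member of the set reported the error; matching it to a physical bay still has to be done by hand.

```python
import re

# Pattern for the AppleRAID kernel message quoted in this thread.
# Assumption: the layout matches the quoted line; other releases may differ.
RAID_ERROR = re.compile(
    r'error (?P<code>0x[0-9a-fA-F]+) detected for set "(?P<set_name>[^"]+)" '
    r'\((?P<set_uuid>[0-9A-Fa-f-]+)\), member (?P<member_uuid>[0-9A-Fa-f-]+), '
    r'set byte offset = (?P<offset>\d+)'
)


def parse_raid_error(line):
    """Return the fields of an AppleRAID error line as a dict, or None if no match."""
    match = RAID_ERROR.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["offset"] = int(info["offset"])  # byte offset within the RAID set
    return info


line = (
    '02/10/2009 15:10:18 kernel AppleRAID::completeRAIDRequest - error 0xe00002ca '
    'detected for set "RAID Set" (8AFBEB6D-BFB4-487D-91FD-776E5B71C067), '
    'member 053DE1B9-D869-4845-BABA-386A855E96DF, set byte offset = 108442537984.'
)
info = parse_raid_error(line)
print(info["set_name"], info["member_uuid"], info["offset"])
```

Running this over a saved console log and counting how often each member UUID appears would show whether the errors all come from one disk.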
 

rowsdower

macrumors 6502
Jun 2, 2009
269
1
It's definitely possible that it's a failing drive. If SMART reports a problem then there probably is one, but if SMART reports good it doesn't mean that there isn't a problem. Many drive manufacturers have testing utilities. For example, Hitachi's is here. You could try running that if you can find it for the manufacturer of your drive. Keep your backups up to date.
 

big_malk

macrumors 6502a
Original poster
Aug 7, 2005
557
1
Scotland
I think the failing drive is Western Digital, as far as I can tell they just recommend using Disk Utility.

I'm running CarbonCopyCloner now. It's taking forever, and every app fluctuates between the beachball and the regular cursor when I put the mouse over its window, so I think I'll just leave it and hope the backup completes successfully!
I tried logging in over my network to copy some work files onto my MBP and it keeps failing to even log in, but it worked fine just a couple of hours ago.
I guess I should just be thankful this didn't happen on a Monday morning :(
 

rowsdower

macrumors 6502
Jun 2, 2009
269
1
It sounds like you'll know soon in any case if the drive is failing. At least you have backups. It seems like most of these threads end with important work being on the drive and not having any way to recover it.

As far as I know Disk Utility only checks for filesystem errors and permissions problems. Filesystem errors might be caused by a failing drive, but I don't think that Disk Utility will tell you plainly that the disk is failing. Western Digital does provide this ISO image for their drives; however, it is a DOS formatted bootable CD and I don't know if that will boot on a Mac.
 

big_malk

macrumors 6502a
Original poster
Aug 7, 2005
557
1
Scotland
I'm trying to check the warranty and I'm a bit confused.
The console error said "02/10/2009 15:10:18 kernel disk4: I/O error."
According to system profiler there is no 'disk4', the BSD names are disk0 - disk3 for my internal disks, 'disk5' is a USB pen drive, and the external volumes I'm backing up to are 'disk7s2' and 'disk7s3'.

Does it mean the drive in Bay 4? Or am I being a bit stupid here? :confused:
 

rowsdower

macrumors 6502
Jun 2, 2009
269
1
What shows up in Disk Utility? disk4 might be a virtual drive related to the RAID (i.e. the full volume of disk0+disk1+disk2+disk3). I have never looked at Disk Utility or System Profiler on a Mac Pro with a RAID so I'm not sure.
 

big_malk

macrumors 6502a
Original poster
Aug 7, 2005
557
1
Scotland
What shows up in Disk Utility? disk4 might be a virtual drive related to the RAID (i.e. the full volume of disk0+disk1+disk2+disk3). I have never looked at Disk Utility or System Profiler on a Mac Pro with a RAID so I'm not sure.

Good point, you're right. In the info window of the RAID set it says 'disk4'.
I wonder if it's worth ordering a replacement right away or reformatting in some hope of it not being a hardware fail, maybe a corrupt partition or something? :/
 

rowsdower

macrumors 6502
Jun 2, 2009
269
1
I wonder if it's worth ordering a replacement right away or reformatting in some hope of it not being a hardware fail, maybe a corrupt partition or something? :/

If that's the case, I think Disk Utility would find it, as suggested by Western Digital. It's worth a shot, but it's probably a hardware failure somewhere.
 

nanofrog

macrumors G4
May 6, 2008
11,719
3
If you don't already have one, get a spare drive to replace it with ASAP. When using RAID, you always want to keep at least one spare drive handy for such problems, as they do fail. It's just a matter of when, not if.
 

nanofrog

macrumors G4
May 6, 2008
11,719
3
i'd have a look at smart utility - think there is a time limited demo. it will provide much more smart parameters on those discs
The trick is, not everyone can really interpret SMART data, assuming it doesn't just produce a FAIL indication.

IIRC, some of the Windows based SMART utils do better (i.e. they present the data in a manner that's easier to understand), but that depends on whether or not the OP has a copy of Windows installed on the system, or is willing to install one if not.
 

big_malk

macrumors 6502a
Original poster
Aug 7, 2005
557
1
Scotland
I've got TechTool Pro 5, so I've booted that up and run some of the tests.

Volume Structures were fine on the RAID volume.
SMART check has at least one indicator verging on fail on every drive I have! :eek:
Of the two disks in the RAID, one had two indicators almost halfway to fail; they were temperature related (can't remember exactly). The other disk's indicator was ECC related (error correction) and was about halfway to fail. I don't know much about the intricacies of SMART, but that one sounds like it could be causing my problems!

I've performed a surface scan on one of the RAID disks and it was fine. I'm running the check on the other and so far it says 17 bad blocks encountered.

Overall, I am not impressed with the state of my hard drives :(
 

nanofrog

macrumors G4
May 6, 2008
11,719
3
...I've performed a surface scan on one of the RAID disks and it was fine. I'm running the check on the other and so far it says 17 bad blocks encountered.

Overall, I am not impressed with the state of my hard drives :(
What drives are you using?

I'm not a big fan of consumer grade models for RAID, even for software implementations.
 

big_malk

macrumors 6502a
Original poster
Aug 7, 2005
557
1
Scotland
The surface scan of my other RAID disk has so far found 264 bad blocks, having scanned 66,519,051 blocks, which seems pretty high to me.
TechTool says 'The Surface Scan test checks a hard drive for physical bad blocks'. I've been reading around, and advice for bad blocks is often to format the HD and zero-out data. But this wouldn't help if these bad blocks are physical, would it?
The more this test scans the longer it's predicted to take, 6 hours remaining and still rising, I think I'll call it quits.
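
To put the figures above in proportion, here is a small Python sketch of the arithmetic. It uses only the two numbers quoted in this post; the variable names are just for illustration. The ratio itself is tiny, but a drive in good health would normally be expected to show no unreadable blocks at all in a surface scan, which is why a count in the hundreds looks worrying.

```python
# Figures quoted from the partial surface scan in this post.
bad_blocks = 264
blocks_scanned = 66_519_051

rate = bad_blocks / blocks_scanned       # fraction of scanned blocks that were bad
per_million = rate * 1_000_000           # same rate, expressed per million blocks
one_in = blocks_scanned / bad_blocks     # roughly one bad block per this many scanned

print(f"{per_million:.2f} bad blocks per million scanned (about 1 in {one_in:,.0f})")
```

Since the scan was stopped early, this only describes the part of the disk that was actually read.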
 

nanofrog

macrumors G4
May 6, 2008
11,719
3
smart utility has a summary based on green/amber/red and also identifies the smart parameter(s) that generate the amber or red status.

if the op wants he can post the results here and get others more experienced to comment on the specific parameters.
I haven't had an MP in over a year now, and had forgotten it even did that. I'm more familiar with the data being presented numerically, and most software doesn't take the drive's age into account (to obtain any idea of bad blocks/unit time).

So I'm used to having to interpret the data myself, particularly when it's in a RAID (though I usually use a hardware controller, not the system's SATA ports).

The surface scan of my other RAID disk has so far found 264 bad blocks, having scanned 66,519,051 blocks, which seems pretty high to me.
TechTool says 'The Surface Scan test checks a hard drive for physical bad blocks'. I've been reading around, and advice for bad blocks is often to format the HD and zero-out data. But this wouldn't help if these bad blocks are physical, would it?
The more this test scans the longer it's predicted to take, 6 hours remaining and still rising, I think I'll call it quits.
That is bad to me.

You could load up the drive maker's disk utility software (not OS X's Disk Util), and do a low level format, as that will remap the bad blocks. Follow it up with a high level format and re-create the array, and restore the data.

It's time consuming, but may help. That said, it looks like you'd want a replacement drive, or better yet, a set.
 

big_malk

macrumors 6502a
Original poster
Aug 7, 2005
557
1
Scotland
smart utility has a summary based on green/amber/red and also identifies the smart parameter(s) that generate the amber or red status.

if the op wants he can post the results here and get others more experienced to comment on the specific parameters.

I downloaded SMART utility, and things have either gotten worse, SMART utility decides things are 'failing' with a lower threshold, or there's an error in the SMART data somewhere.

disk3, a member of the RAID set, is reported as failing, which it wasn’t before. It has 757 'pending bad sectors', 0 removed or reallocated. 5737 'total errors', recent ones including 'Uncorrectable Error' and 'Unknown'.
disk2, the other member of the RAID set, passed perfectly! Huzzah!
However, my downloads/scratch/other disk is reported as failing, but as far as I can tell only because of 1 'reallocated bad block', which should be avoided by the system so the disk would work fine, right? Even if it is an indication of more bad blocks to come?
Incidentally, SMART utility regularly reported 'An error occurred attempting to read SMART data.'.

I had planned to replace the two RAID drives with 250GB 10,000rpm drives, assuming the cost of them would have come down since I last looked, but they seem even less common and more expensive than before? :(

I guess I could just keep this a cheap repair and only replace the 1 failing drive with the same as before (except not Western Digital, every one of them I've had has failed!!).

I'd love to use RAID 5 instead of 0, but as far as I can tell this still isn't possible in a Mac Pro without a hardware controller? Which, as far as I can tell, aren't all that cheap?

Thanks for all your help guys! :)
 

big_malk

macrumors 6502a
Original poster
Aug 7, 2005
557
1
Scotland
sounds like you have serious issues with disk 3 and minor (at least for now) with disk 2.

re 'getting worse' i think smart utility will show errors where other software will show a 'green' status. for example it flags a single reallocated bad sector on one of my disks where other s/w shows nothing.

if they're still under warranty i'd rma both.

if not then i'd get rid of 3 and maybe repurpose 2 to something low priority and use smart utility to keep an eye on it.

i do the same thing with the disk i mentioned above. you can set smart utility to ignore that single error. so you see a status of green but if it increases to 2 it'll go amber again
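
The "ignore that single error" idea described above can be sketched in a few lines of Python. This is only an illustration of the logic as described in this post, not how SMART Utility is actually implemented; the function name `status` and the `acknowledged` parameter are hypothetical.

```python
def status(reallocated_count, acknowledged=0):
    """Report 'green' while the count has not grown past the acknowledged baseline.

    Hypothetical logic: 'acknowledged' is the number of reallocated sectors
    the user has already seen and chosen to ignore.
    """
    return "green" if reallocated_count <= acknowledged else "amber"


# One known bad sector, acknowledged by the user: stays green.
print(status(1, acknowledged=1))
# The count rises to 2: the status goes amber again.
print(status(2, acknowledged=1))
```

The point of a baseline like this is that the warning comes back as soon as the count grows, so a drive that is slowly getting worse is not hidden.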

disk3 is out of warranty, so I've ordered a replacement and it should arrive on Tuesday :)
disk2 is still under warranty, so I might start copying everything off it and RMA it, but wouldn't one bad block be considered acceptable? Like a couple of dead pixels often are?

Thanks again for all your advice everyone :)
 

nanofrog

macrumors G4
May 6, 2008
11,719
3
disk3 is out of warranty, so I've ordered a replacement and it should arrive on Tuesday :)
disk2 is still under warranty, so I might start copying everything off it and RMA it, but wouldn't one bad block be considered acceptable? Like a couple of dead pixels often are?

Thanks again for all your advice everyone :)
Yes, one bad block would be considered acceptable.

You'd want to get the drive diagnostics utility from the drive manufacturer, and remap the bad sectors on it. But make sure you've a backup of the data first, as it will wipe the drive, and the array will need to be re-created, then restored.

It's a fair bit of work, but worth it IMO. Leaving Bad Sectors is always a bad idea in RAID.
 