Hard Crashes

Discussion in 'Mac Pro' started by haravikk, Mar 8, 2013.

  1. haravikk macrumors 65816

    Joined:
    May 1, 2005
    #1
    Okay, not the most descriptive title, but I've been experiencing hard-crashes lately where my computer simply switches off unexpectedly when otherwise seeming to run normally.

    I've tried the steps for Resetting the System Management Controller, and I've also tried zapping the PRAM when the problem occurs; both solve the problem for a while (unlike simply restarting) but it seems to resurface again after a few days. The issue always occurs not long after starting up (within 15-20 minutes or so).


    Nothing gets logged to the console at the time of the issue that gives any clues as to what caused it, though there are some oddities. Firstly I'm noticing:
    Code:
    kernel: No interval found for . Using 8000000
    Which I've never seen before, but it repeats around 20 times. The only page I found in a search for this message suggests a possible EFI problem, but I'm not sure if that would show up in my log or not; the page is in German and for the OSX86 project so I'm not sure how likely it is to be useful, though they're getting the same error as I'm getting on my early 2008 Mac Pro.

    Is it worth trying to reinstall the Mac Pro EFI firmware? I'm most suspicious of this message as it also repeats right before the hard crash actually occurred.

    The other message I've been noticing that is perplexing me is:
    Code:
    kernel: hfs: Removed 0 orphaned / unlinked files and 13 directories
    The actual numbers vary but it gives no indication of what items were found to be orphaned; is this normal for HFS or am I somehow generating bad file entries? None of my volumes are failing Disk Utility's verification, which seems weird.

    Anyway, in case there are any other clues, here is my system log up until launchd starts:
    Code:
    08/03/2013 09:36:47.000 bootlog: BOOT_TIME 1362735407 0
    08/03/2013 09:36:48.000 kernel: Darwin Kernel Version 11.4.2: Thu Aug 23 16:25:48 PDT 2012; root:xnu-1699.32.7~1/RELEASE_X86_64
    08/03/2013 09:36:48.000 kernel: vm_page_bootstrap: 2538137 free pages and 66919 wired pages
    08/03/2013 09:36:48.000 kernel: kext submap [0xffffff7f80736000 - 0xffffff8000000000], kernel text [0xffffff8000200000 - 0xffffff8000736000]
    08/03/2013 09:36:48.000 kernel: zone leak detection enabled
    08/03/2013 09:36:48.000 kernel: standard timeslicing quantum is 10000 us
    08/03/2013 09:36:48.000 kernel: mig_table_max_displ = 73
    08/03/2013 09:36:48.000 kernel: AppleACPICPU: ProcessorId=0 LocalApicId=0 Enabled
    08/03/2013 09:36:48.000 kernel: AppleACPICPU: ProcessorId=1 LocalApicId=1 Enabled
    08/03/2013 09:36:48.000 kernel: AppleACPICPU: ProcessorId=2 LocalApicId=2 Enabled
    08/03/2013 09:36:48.000 kernel: AppleACPICPU: ProcessorId=3 LocalApicId=3 Enabled
    08/03/2013 09:36:48.000 kernel: AppleACPICPU: ProcessorId=4 LocalApicId=4 Enabled
    08/03/2013 09:36:48.000 kernel: AppleACPICPU: ProcessorId=5 LocalApicId=5 Enabled
    08/03/2013 09:36:48.000 kernel: AppleACPICPU: ProcessorId=6 LocalApicId=7 Enabled
    08/03/2013 09:36:48.000 kernel: AppleACPICPU: ProcessorId=7 LocalApicId=6 Enabled
    08/03/2013 09:36:48.000 kernel: calling mpo_policy_init for TMSafetyNet
    08/03/2013 09:36:48.000 kernel: Security policy loaded: Safety net for Time Machine (TMSafetyNet)
    08/03/2013 09:36:48.000 kernel: calling mpo_policy_init for Sandbox
    08/03/2013 09:36:48.000 kernel: Security policy loaded: Seatbelt sandbox policy (Sandbox)
    08/03/2013 09:36:48.000 kernel: calling mpo_policy_init for Quarantine
    08/03/2013 09:36:48.000 kernel: Security policy loaded: Quarantine policy (Quarantine)
    08/03/2013 09:36:48.000 kernel: Copyright (c) 1982, 1986, 1989, 1991, 1993
    08/03/2013 09:36:48.000 kernel: The Regents of the University of California. All rights reserved.
    08/03/2013 09:36:48.000 kernel: MAC Framework successfully initialized
    08/03/2013 09:36:48.000 kernel: using 16384 buffer headers and 10240 cluster IO buffer headers
    08/03/2013 09:36:48.000 kernel: IOAPIC: Version 0x20 Vectors 64:87
    08/03/2013 09:36:48.000 kernel: ACPI: System State [S0 S3 S4 S5]
    08/03/2013 09:36:48.000 kernel: PFM64 (38 cpu) 0x3f10000000, 0xf0000000
    08/03/2013 09:36:48.000 kernel: [ PCI configuration begin ]
    08/03/2013 09:36:48.000 kernel: AppleIntelCPUPowerManagement: (built 16:32:09 Aug 23 2012) initialization complete
    08/03/2013 09:36:48.000 kernel: console relocated to 0x3f10060000
    08/03/2013 09:36:48.000 kernel: PCI configuration changed (bridge=3 device=2 cardbus=0)
    08/03/2013 09:36:48.000 kernel: [ PCI configuration end, bridges 15 devices 27 ]
    08/03/2013 09:36:48.000 kernel: FireWire runtime power conservation disabled. (2)
    08/03/2013 09:36:48.000 kernel: mbinit: done [96 MB total pool size, (64/32) split]
    08/03/2013 09:36:48.000 kernel: Pthread support ABORTS when sync kernel primitives misused
    08/03/2013 09:36:48.000 kernel: com.apple.AppleFSCompressionTypeZlib kmod start
    08/03/2013 09:36:48.000 kernel: com.apple.AppleFSCompressionTypeDataless kmod start
    08/03/2013 09:36:48.000 kernel: 2.4.4 Little Snitch: starting
    08/03/2013 09:36:48.000 kernel: com.apple.AppleFSCompressionTypeZlib load succeeded
    08/03/2013 09:36:48.000 kernel: com.apple.AppleFSCompressionTypeDataless load succeeded
    08/03/2013 09:36:48.000 kernel: AppleIntelCPUPowerManagementClient: ready
    08/03/2013 09:36:48.000 kernel: BTCOEXIST off 
    08/03/2013 09:36:48.000 kernel: wl0: Broadcom BCM4328 802.11 Wireless Controller
    08/03/2013 09:36:48.000 kernel: 5.10.131.36
    08/03/2013 09:36:48.000 kernel: FireWire (OHCI) TI ID 823f built-in now active, GUID 001e52fffe6367c0; max speed s800.
    08/03/2013 09:36:48.000 kernel: CoreStorage: fsck has finished successfully for lvg "2D693030-48AC-479D-9BC3-47E939BF97E0"
    08/03/2013 09:36:48.000 kernel: USBMSC Identifier (non-unique): 0000002CE09310500758 0x4e8 0x5f06 0x0
    08/03/2013 09:36:48.000 kernel: USBMSC Identifier (non-unique): F730A46222B5 0x40d 0x6208 0x0
    08/03/2013 09:36:48.000 kernel: rooting via boot-uuid from /chosen: 819EC3E1-C0E6-335D-99AE-12C8D4DF6B97
    08/03/2013 09:36:48.000 kernel: Waiting on <dict ID="0"><key>IOProviderClass</key><string ID="1">IOResources</string><key>IOResourceMatch</key><string ID="2">boot-uuid-media</string></dict>
    08/03/2013 09:36:48.000 kernel: Got boot device = IOService:/AppleACPIPlatformExpert/PCI0@0/AppleACPIPCI/SATA@1F,2/AppleAHCI/PRT0@0/IOAHCIDevice@0/AppleAHCIDiskDriver/IOAHCIBlockStorageDevice/IOBlockStorageDriver/OCZ-VERTEX2 Media/IOGUIDPartitionScheme/Mac OS@2
    08/03/2013 09:36:48.000 kernel: BSD root: disk5s2, major 14, minor 23
    08/03/2013 09:36:48.000 kernel: Kernel is LP64
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
    08/03/2013 09:36:48.000 kernel: USBMSC Identifier (non-unique): 0x5e3 0x723 0x9451
    08/03/2013 09:36:48.000 kernel: USBMSC Identifier (non-unique): 000A27001C97E223 0x5ac 0x1302 0x1
    08/03/2013 09:36:48.000 kernel: USBMSC Identifier (non-unique): AJJ1REPK1DARYDHZ4NYX 0x5dc 0xa793 0x1100
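    For anyone wanting to check how often a given kernel message repeats in a log like the one above, here is a rough Python sketch. It assumes every line follows the "date time process: message" layout shown in this post; the function name tally_messages is made up for illustration and is not part of any Apple tool.

```python
import re
from collections import Counter

# Matches lines such as:
# 08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000
LINE_RE = re.compile(
    r"^\s*(\d{2}/\d{2}/\d{4}) (\d{2}:\d{2}:\d{2}\.\d{3}) (\S+?): (.*)$"
)

def tally_messages(lines, process="kernel"):
    """Count how often each message logged by `process` appears."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        # Skip lines that do not fit the assumed layout or come from another process.
        if match and match.group(3) == process:
            counts[match.group(4).strip()] += 1
    return counts

if __name__ == "__main__":
    sample = [
        "08/03/2013 09:36:48.000 kernel: Kernel is LP64",
        "08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000",
        "08/03/2013 09:36:48.000 kernel: No interval found for . Using 8000000",
    ]
    # Print only the messages that occur more than once, most frequent first.
    for message, count in tally_messages(sample).most_common():
        if count > 1:
            print(count, message)
```

    Pasting the full log into a text file and passing its lines to tally_messages would show at a glance which messages are the repeat offenders.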
     
  2. justperry macrumors 604

    justperry

    Joined:
    Aug 10, 2007
    Location:
    In the core of a black hole.
    #2
    If you have a USB stick lying around (minimum 8 GB), install OS X on it and see if the problem arises there as well.

    As for the "kernel: hfs: Removed 0 orphaned / unlinked files and 13 directories", I have seen that and think it's fairly normal.

    Did you try running Apple Hardware Test?
     
  3. 666sheep macrumors 68040

    666sheep

    Joined:
    Dec 7, 2009
    Location:
    Poland
    #3
    Looks like the PSU is failing, especially if there's no trace of a power management kext crash in the log. It happens to the 3,1 more often than other models. No OS-related problem would cause the computer to switch off suddenly; it would rather KP or freeze.
     
  4. haravikk thread starter macrumors 65816

    Joined:
    May 1, 2005
    #4
    Will the hardware test provide any kind of confirmation for a PSU failure? And does it make sense for it to be so intermittent and seemingly fixed by an SMC reset or zapping the PRAM? Last time the shutdown occurred I tried just restarting but my computer just shut down again after a while, tried a couple of times with the same results, but when I zapped PRAM as part of another restart it was then running (seemingly) happily again. Of course I can't confirm that it was the result of what I'd done as I haven't done it enough to really say for sure, but it seemed to do it.

    I'll try it as soon as I can find it ;)
    I thought I'd kept all my system disks together for all my machines but my Mac Pro hardware test seems to have wandered off. It'll turn up somewhere.

    I'll prepare a USB stick like you suggest, and maybe load it with some of the apps that were running when the system shut down (a couple of games), but given the apparent randomness of when this happens it's a bit difficult to be sure whether a USB stick will be enough of a test.
     
  5. 666sheep macrumors 68040

    666sheep

    Joined:
    Dec 7, 2009
    Location:
    Poland
    #5
    An SMC reset could cure such an issue (errors cleared and defaults loaded); PRAM does not have any impact on power management. But if the issue is back after some time, it's a sign that it isn't caused by the SMC itself. I'd bet on the PSU then.
    AHT does not test the PSU (voltages etc.), only its sensors.
     
  6. haravikk thread starter macrumors 65816

    Joined:
    May 1, 2005
    #6
    I think I may have discovered the cause of the problem; I had an external hard disk dock (just slot any internal SATA drive into it) which OS X started complaining was drawing too much power over USB (and that the port had been disabled), this happened literally minutes before the dock just failed and took the drive with it. Fortunately the disk in it was just a low capacity spare that I had lying around and used for high-activity stuff like torrent downloads to keep that off my main system volume.

    Anyway, I'm wondering if power draw from the dock could have been the cause of my problem? I haven't had any hard-crashes since (*fingers crossed*), I'm just wondering if it's coincidence or not?

    The dock shouldn't have been drawing any power at all over USB other than a normal data signal, as (like practically everything) it had its own external adaptor.
     
  7. haravikk thread starter macrumors 65816

    Joined:
    May 1, 2005
    #7
    Seems that was too optimistic of me as I just had the same thing happen again, no USB power issues that I can see.

    I'm still getting the "No interval found for . Using 8000000" messages, and I have no idea what their origin is.

    I'm also seeing some SMC errors as follows:
    Code:
    25/03/2013 11:20:09.000 kernel: SMC::smcReadKeyAction ERROR: smcReadData8 failed for key $Num (kSMCKeyNotFound)
    25/03/2013 11:20:09.000 kernel: SMC::smcReadKeyAction ERROR $Num kSMCKeyNotFound(0x84) fKeyHashTable=0x0
    25/03/2013 11:20:09.000 kernel: SMC::smcInitHelper ERROR: MMIO regMap == NULL - fall back to old SMC mode
    25/03/2013 11:20:09.000 kernel: SMC::smcReadKeyAction ERROR: smcReadData8 failed for key BEMB (kSMCKeyNotFound)
    I used to get similar errors when I used iStat Menus' temperature display, but I disabled that a while ago and immediately stopped seeing the errors. Is there any way I can find out what the errors refer to, for example what do the "$Num" and "BEMB" keys refer to?
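    To see which SMC keys are failing and how often, the error lines can be filtered with a short Python sketch. It assumes the "smcReadData8 failed for key NAME (REASON)" wording shown above; failed_smc_keys is a hypothetical helper name, and the sketch only lists the keys, it does not explain what they mean.

```python
import re
from collections import Counter

# Captures the key name and the reason from lines such as:
# ... ERROR: smcReadData8 failed for key BEMB (kSMCKeyNotFound)
KEY_RE = re.compile(r"smcReadData8 failed for key (\S+) \((\w+)\)")

def failed_smc_keys(lines):
    """Count (key, reason) pairs for failed SMC key reads."""
    counts = Counter()
    for line in lines:
        match = KEY_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts

if __name__ == "__main__":
    sample = [
        "25/03/2013 11:20:09.000 kernel: SMC::smcReadKeyAction ERROR: "
        "smcReadData8 failed for key $Num (kSMCKeyNotFound)",
        "25/03/2013 11:20:09.000 kernel: SMC::smcReadKeyAction ERROR: "
        "smcReadData8 failed for key BEMB (kSMCKeyNotFound)",
    ]
    for (key, reason), count in failed_smc_keys(sample).items():
        print(key, reason, count)
```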


    The crash occurred while playing a game, not long after startup, which has been the case many times before, though that may be pure coincidence (as it's an MMO that I usually log into first thing in the morning). My purely subjective assessment is that it feels as if the computer only resets when it hasn't had a chance to warm up; not that it's starting especially cold, just that I haven't encountered the problem if I start up my computer and then leave it for a while or just do something light such as listening to music.
     
  8. justperry macrumors 604

    justperry

    Joined:
    Aug 10, 2007
    Location:
    In the core of a black hole.
    #8
    Did you disable iStat or uninstall it? If you only disabled it there might still be two processes running; look inside LaunchAgents/LaunchDaemons, I think that's where they are in the last version. I had to manually remove them myself (after stopping them in Activity Monitor).
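    A quick way to look for leftover launchd jobs is to list the .plist files in the usual launchd folders. This is a minimal Python 3 sketch (assuming Python 3 is installed); the three folders are the standard system and per-user launchd locations, but the "istat" keyword is only a guess at how the leftover files are named, and find_launchd_jobs is a made-up helper name.

```python
from pathlib import Path

# Standard launchd job folders on OS X (system-wide and per-user).
DEFAULT_DIRS = [
    Path("/Library/LaunchAgents"),
    Path("/Library/LaunchDaemons"),
    Path.home() / "Library" / "LaunchAgents",
]

def find_launchd_jobs(keyword, directories=DEFAULT_DIRS):
    """Return .plist files whose file name contains `keyword` (case-insensitive)."""
    keyword = keyword.lower()
    matches = []
    for directory in directories:
        # Skip folders that do not exist on this machine.
        if not directory.is_dir():
            continue
        for plist in sorted(directory.glob("*.plist")):
            if keyword in plist.name.lower():
                matches.append(plist)
    return matches

if __name__ == "__main__":
    # "istat" is an assumed keyword; adjust it to the vendor's actual naming.
    for path in find_launchd_jobs("istat"):
        print(path)
```

    The sketch only lists files; it deliberately does not delete or unload anything, so any removal is still a manual step.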
     
  9. TheEasterBunny macrumors 6502

    Joined:
    Jan 22, 2013
    Location:
    Delaware
    #9
    I was thinking the same, PSU
     
  10. Tesselator macrumors 601

    Tesselator

    Joined:
    Jan 9, 2008
    Location:
    Japan
    #10
    It could be PSU I suppose. It could be a lot of things tho really.

    This was happening to me for about a year and I just put up with it by saving often. Then one day I upgraded to a higher powered GPU, did a little OS cleaning, and it all just stopped happening. Now I never get crashes any more - like ever. I've since sussed why but I think my example shows that there are lots of reasons an MP can behave like this.

    I think justperry is on the right track for debugging this.

    First step needs to be a clean install on an alternate device (extra HDD, USB thumb drive, whatever).
    Then if that doesn't cure it start trying to isolate various components one by one starting with the easiest to check.

    USB hubs,
    Cabling (internal + external),
    Memory (pull in pairs and test),
    GPU (got an extra or cheap 7300 to use?),
    PSU (config draw, ),
    and so on...
     
  11. haravikk thread starter macrumors 65816

    Joined:
    May 1, 2005
    #11
    I'm going to try different hardware combos soon to see if I encounter the same problems; I'm currently waiting for a NAS to arrive which I'm hoping to set up first so I have a third copy of my system (two points of backup), as I've been wanting to update to Mountain Lion on my main machine for ages but hate the idea of only having one backup during the process. I figure starting fresh will make it easier to track down this kind of thing.

    My current suspicion is that my superdrive might be misbehaving, as it seems to be really struggling to open its drawer lately, and after these crashes it seems to make a lot more noise than normal; it's usually the first thing I hear on startup, but it seems to be trying six or more times to read a disk even though there isn't one in the drive.

    I dunno, I can foresee a long and annoying process to find out if it really is the PSU or not ;)


    If PSU failure is common enough in the 3,1 model, then I don't suppose Apple offers any kind of extended cover for that part? I remember when the G5's were having liquid cooling problems they would still replace the cooling part (and anything damaged by it) well after the machine's warranty expired.
     
  12. haravikk thread starter macrumors 65816

    Joined:
    May 1, 2005
    #12
    Hmm, I thought I'd already updated this thread. Okay, so in the end the problem seemed to be that the power cable wasn't making proper contact with the power supply's socket, which means that my temperature theory may have been partly correct; my suspicion is that when the cable heated up it caused what little contact there was to be lost. Getting the cable in just right solved the issue for several months, but now the issue is back with a vengeance.

    The contact with the power supply definitely feels loose now, and connecting the power cable at a slight angle doesn't seem to work anymore. I'm thinking some or all of the pins on the power supply may be loose; is that something I can fix myself before I go the route of replacing the whole power supply? I had a quick look on eBay and they go for £200-300 for a used one; I dunno if repair shops will get them much cheaper, though of course a brand new power supply would be preferable to one that might just be about to develop the same problem as mine!

    After the hard resets though I'm getting a crash report as if a kernel panic had occurred, reporting an error of "CPU 5 has no HPET assigned" or something similar. Any idea what that could be? I did a search for the error but all the results seem to be for people building hackintoshes, I couldn't find a single person reporting the error on a genuine Mac.

    *sighs* I seem to have terrible luck with my Macs; loads of people happily use theirs for nearly a decade but I'm lucky to get more than 4 years out of mine, I doubt I'll be able to afford to switch for a new Pro, and I'm uncertain if repairing it's going to be worth the cost, like trying to row a sinking ship. Might be a Mac Mini with a big external hard drive in my future :(
     
