nMP - GPU failure on resume from sleep?

damezumari

macrumors member
Original poster
Dec 6, 2014
56
10
Problem: Three times in the last week and a half, my Mac Pro's GPU has failed to resume from sleep. When that happens, the system log fills with messages like these (this excerpt is from Mavericks, but the Yosemite entries are very similar):

Code:
Oct 14 00:31:47 poro kernel[0]: [6:0:0] GPU HangState 0x0000000e, HangFlags 0x00000005: IndividualEngineHang 1, NonEngineBlockHang 0, FenceNotRetired 1, PerEngineReset 1, FullAsicReset 0
Oct 14 00:31:48 poro kernel[0]: ** GPU ASIC Log Start **
...
Oct 14 00:31:49 poro kernel[0]: : ** GPU Debug Info End **
...
Oct 14 00:31:49 poro kernel[0]: Trying restart GPU ...
Oct 14 00:31:49 poro com.apple.launchd[1] (com.apple.DumpGPURestart): Throttling respawn: Will start in 5 seconds
Oct 14 00:31:55 poro com.apple.launchd[1] (com.apple.DumpGPURestart): Throttling respawn: Will start in 9 seconds
Oct 14 00:31:58 poro kernel[0]: [6:0:0] GPU HangState 0x0000000e, HangFlags 0x00000005: IndividualEngineHang 1, NonEngineBlockHang 0, FenceNotRetired 1, PerEngineReset 1, FullAsicReset 0
Oct 14 00:32:00 poro kernel[0]: ** GPU ASIC Log Start **
...
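For anyone who wants to compare these entries across crashes, here is a minimal Python sketch that pulls the values out of one "GPU HangState" line. The regular expression and the function name parse_hang_line are assumptions for illustration, written against the excerpt above only; lines from other drivers or OS versions may be laid out differently.

```python
import re

# Pattern written against the "GPU HangState" lines quoted above (an assumption;
# other OS versions may format the line differently).
HANG_RE = re.compile(
    r"GPU HangState (?P<state>0x[0-9a-fA-F]+), "
    r"HangFlags (?P<flags>0x[0-9a-fA-F]+): (?P<fields>.*)$"
)


def parse_hang_line(line):
    """Return a dict of the values in one GPU HangState line, or None."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    result = {
        "HangState": int(match.group("state"), 16),
        "HangFlags": int(match.group("flags"), 16),
    }
    # The tail is a comma-separated list of "Name value" pairs.
    for pair in match.group("fields").split(","):
        name, _, value = pair.strip().rpartition(" ")
        if name and value.isdigit():
            result[name] = int(value)
    return result


sample = (
    "Oct 14 00:31:47 poro kernel[0]: [6:0:0] GPU HangState 0x0000000e, "
    "HangFlags 0x00000005: IndividualEngineHang 1, NonEngineBlockHang 0, "
    "FenceNotRetired 1, PerEngineReset 1, FullAsicReset 0"
)
print(parse_hang_line(sample))
```

The sketch only extracts the named fields as logged; it does not try to interpret what the individual HangState or HangFlags bits mean.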
Also, in System Diagnostic Reports, 'Kernel' has entries like this when the problem occurs:

Code:
Tue Dec 30 06:05:39 2014

Event:               GPU Reset
Data/Time:           Tue Dec 30 06:05:39 2014
Application:         
Path:                
OS Version:          Mac OS X Version 10.10.1 (Build 14B25)
Graphics Hardware:   AMD FirePro D500
Signature:           e

Report Data:

GPURestartReportStart
------------------------
[00] AccelChannel: GFX
Currently pending command from UnknownCtx
PendingCommandTimestamp: 0x04a5f468, TotalDWords: 0x00000193, GART Offset=0x00000000c012fd00, stamp_idx=0, estamp=0x04a5f468
PendingCommandStart:
PendingCommandEnd
------------------------
[00] GFXHWChannel: Enabled: Not Idle
IndirectCommandSize: 0x00000040, LastReadTimestamp: 0x04a5f467, NextSubmitTimestamp: 0x04a5f46a
------------------------
[00] HWRing: Enabled
RingSizeInDwords: 0x4000,  FreeSpace: 0x3fff, Head: 0x00003800, LastSubmitPosition: 0x00003800, Tail: 0x00003800
RB[0]_RPTR: 0x00003800, RB[0]_WPTR: 0x00003800
In an earlier thread ( https://forums.macrumors.com/threads/1826367/ ) I said I hadn't seen any of this on Yosemite; however, my Mac Pro wanted to prove me wrong :p. When this happens, the main display is garbled (or blank) and the secondary display is blank. The machine is still reachable over the network, but there seems to be no way to resurrect the GPU. Running shutdown -r now 'fixes' the problem.

I have not changed anything recently (except turning transparency off at some point in December), although the machine also sat unused for most of December. Before that, I do not think this occurred at all in November. It smells like a hardware problem to me, but as I am loath to be without my main Mac, are there any other options or opinions?

The machine is a 6-core nMP with D500 GPUs. It has two 4K monitors and a Full HD projector attached. I bought AppleCare for it at some point, as it _did_ seem to have issues from the start, and I figure that if I get 3 years of useful life out of it I can consider myself a winner in the transaction :).
 

edanuff

macrumors 6502
Oct 30, 2008
325
81
I was having GPU reset issues like this with my nMP for a while. My assumption was that it was a software issue since many people have posted similar problems on the various forums. I finally took it to the Genius Bar and they ended up replacing the GPUs and the motherboard. I haven't had a single GPU problem since. I have a sneaking suspicion there is a problem affecting some nMP GPUs, perhaps similar to the 2011 MBP GPU problem, that crops up intermittently when the GPUs start to get hot.
 

damezumari

macrumors member
Original poster
Dec 6, 2014
56
10
I had been hoping the assorted GPU issues I have had were software problems, but I guess not. The weird part is that this one happens only on resume, and possibly also on restart; once the machine is up, it is rock solid. I guess I will contact Apple and see about getting it fixed anyway, since the reboots are annoying and the problem seems to be getting more frequent.
 

damezumari

macrumors member
Original poster
Dec 6, 2014
56
10
edanuff said: "All I can say for sure is that I was having similar GPU issues that disappeared after the GPU replacement."

Unfortunately, my Mac Pro spent 3 weeks in the shop, got the 'could not reproduce' treatment, and is back home again. Since then it has crashed on resume 3 nights out of 5 (twice with a kernel panic, once with a continuous GPU restart loop). Somewhere in the middle of that, I installed a blank copy of Yosemite on an external disk. Booted from that and otherwise left unused, the machine did not crash when I let it sleep overnight (3 attempts so far).

So I can see 3 options:

- problem is related to some software I have installed

- problem is related to actually having some state at resume (I did not bother to do anything on the external disk before putting the machine to sleep for the night)

- problem is related to something lingering in the filesystem (this install has been upgraded in place from 10.6 or 10.7 onwards)

Anyone else with this experience and/or hints on how to make it reproducible for repair guys?
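One way to give the repair shop something concrete is a dated list of every hang episode pulled from the system log. The following is a rough Python sketch under a few assumptions: the function names are made up for illustration, the 5-minute gap used to merge repeated hangs into one episode is an arbitrary choice, and because syslog timestamps carry no year, the caller has to supply one.

```python
from datetime import datetime, timedelta


def hang_times(lines, year):
    """Yield a datetime for every 'GPU HangState' line.

    Syslog timestamps have no year, so the caller supplies one (an assumption).
    """
    for line in lines:
        if "GPU HangState" not in line:
            continue
        stamp = " ".join(line.split()[:3])  # e.g. "Oct 14 00:31:47"
        yield datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def group_episodes(times, gap=timedelta(minutes=5)):
    """Group timestamps into episodes; a pause longer than `gap` starts a new one."""
    episodes = []
    for t in sorted(times):
        if episodes and t - episodes[-1][-1] <= gap:
            episodes[-1].append(t)
        else:
            episodes.append([t])
    return episodes


def summarize(lines, year):
    """Return (start, end, hang_count) for each episode found in the log lines."""
    return [(ep[0], ep[-1], len(ep))
            for ep in group_episodes(hang_times(lines, year))]


# The two hang lines from the excerpt in the first post, shortened.
sample_lines = [
    "Oct 14 00:31:47 poro kernel[0]: [6:0:0] GPU HangState 0x0000000e, HangFlags 0x00000005",
    "Oct 14 00:31:49 poro kernel[0]: Trying restart GPU ...",
    "Oct 14 00:31:58 poro kernel[0]: [6:0:0] GPU HangState 0x0000000e, HangFlags 0x00000005",
]
for start, end, count in summarize(sample_lines, 2014):  # year is assumed
    print(f"{start} to {end}: {count} hangs")
```

To run it against a real log, pass the lines of the log file (for example, an opened copy of the system log) instead of sample_lines. Matching each episode against the time the machine was woken would show whether the hangs really do cluster at resume.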
 