Mac Pro (Late 2013) GPU (Driver) Issues

Discussion in 'Mac Pro' started by bax2003, Mar 28, 2015.

  1. bax2003, Mar 28, 2015
    Last edited: Mar 28, 2015

    bax2003 macrumors 6502a

    bax2003

    Joined:
    Dec 25, 2011
    Location:
    Belgrade, Serbia
    #1
    I want to share my experience with you.

    My MP arrived a few days ago with OS X Yosemite 10.10 on it. Just after setting it up and connecting it to the network, I installed the 10.10.2 update from the App Store, and after that the Mac Pro locked up a few times. It was completely random: the GUI just stops responding, and the only thing you can do is move the cursor around. Sometimes the cursor changes into a beach ball, other times not, but either way the GUI does not respond and you must force a shutdown. This happened every few hours :( while idling or converting video.

    And so the Googling started, along with searching through Console. I found a few similar MacPro6,1 problems reported with Mavericks 10.9.2, which were fixed in 10.9.3/10.9.4.

    Every time the MP hangs, an entry like this appears in Console:
    Kernel_2015-xx-xx-xxxxxx_Mac-Pro.gpuRestart...so it is the GPU driver or the GPU itself.
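
    For anyone who wants to line these reports up against the times of their lockups, here is a minimal Python sketch. It assumes the report names always follow the pattern shown above (the full example name is the one quoted in post #10), and the function names are made up for illustration. Feed it a listing of whichever folder Console shows the reports in.

```python
import re
from datetime import datetime

# Pattern inferred from the file name quoted in this thread, e.g.
# Kernel_2015-03-30-032655_Mac-Pro.gpuRestart (an assumption, not a spec).
REPORT_NAME = re.compile(
    r"^Kernel_(\d{4}-\d{2}-\d{2}-\d{6})_(.+)\.gpuRestart$"
)


def parse_report_name(name):
    """Return (timestamp, host) for a gpuRestart report name, or None."""
    match = REPORT_NAME.match(name)
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%Y-%m-%d-%H%M%S")
    return stamp, match.group(2)


def restart_times(names):
    """Sorted timestamps of every name that looks like a gpuRestart report."""
    parsed = (parse_report_name(n) for n in names)
    return sorted(p[0] for p in parsed if p is not None)


if __name__ == "__main__":
    sample = ["Kernel_2015-03-30-032655_Mac-Pro.gpuRestart", "notes.txt"]
    for when in restart_times(sample):
        print(when.isoformat())
```

    Comparing the printed times with when the GUI actually froze should show whether every lockup comes with a GPU restart report or only some of them.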

    To cut a long story short: with Mavericks 10.9.5 / Yosemite 10.10.0 installed, no problems for 3+ days.

    So if you have a Mac Pro (Late 2013), be wary of 10.10.2, and of course share if you have or had similar problems.
     
  2. VirtualRain macrumors 603

    VirtualRain

    Joined:
    Aug 1, 2008
    Location:
    Vancouver, BC
    #2
    I had unexplained restarts after putting my nMP to sleep with 10.10.2, so I tried the 10.10.3 beta, which seems much better.
     
  3. bits macrumors member

    Joined:
    Mar 18, 2015
    #3
    I haven't had these problems. I put it to sleep many times a day (D300, 10.10.2). The GPU has never hung either.
     
  4. VirtualRain macrumors 603

    VirtualRain

    Joined:
    Aug 1, 2008
    Location:
    Vancouver, BC
    #4

    It's hard to say what might have been the cause. I'm not implying it's widespread, but sometimes an update can be a cure for situations like this, and maybe it would benefit the guy above.
     
  5. Infrared macrumors 68000

    Infrared

    Joined:
    Mar 28, 2007
    #5
    @bax

    I have had what may be similar symptoms and am also on 10.10.2. I have also had some GPU reset messages, but I don't think I am getting them at the time of each lockup. I think I will have another good look at the logs to make sure.

    I am actually very worried that this is a hardware fault, but I've not been able to pin down the cause so I cannot determine that it is.

    Unfortunately, I can't reproduce the problem at will, as it only happens once every so often. It is, therefore, a bit of a nightmare to demonstrate to any tech support people.

    When the machine is apparently frozen, it's not completely unresponsive because the mouse pointer does move in response to mouse movements. And I can ssh in and do things from the command line, including rebooting the machine.

    Edit: if you can ssh in, too, you might try killing the window server, if you don't mind any programs associated with your current login session being terminated.
     
  6. bax2003, Mar 29, 2015
    Last edited: Mar 29, 2015

    bax2003 thread starter macrumors 6502a

    bax2003

    Joined:
    Dec 25, 2011
    Location:
    Belgrade, Serbia
    #6
    I suspected the GPU or its driver because the symptoms point to a graphics issue and it always locks up the same way.

    If your lockups are like this (GUI not responding) and you have 10.10.2....it is 100% OS X's fault.

    It is now the 4th day on 10.10.0 and everything is fine (video rendering, browsing, idling...). Everything is fine on 10.9.5 as well.

    I have even tested the Mac Pro with tools that can use both GPUs through OpenCL (DaVinci Resolve, Adobe Media Encoder) and with developer tools (OpenGL Driver Monitor) to check what is happening in the driver (AMDRadeonX4000GLDriver). Both GPUs work just fine after hours and hours of OpenCL computing.

    If I had not upgraded to 10.10.2 when the Mac Pro arrived, I would not know about this issue at all. I will not upgrade OS X to 10.10.1, but I will clone 10.10.0 to an external drive, upgrade that to 10.10.3 when it comes out, and test it.
     
  7. edanuff macrumors 6502

    Joined:
    Oct 30, 2008
    #7
    Based on my experiences with the nMP and corroborated by some other experiences people have posted on these forums and the Apple Support forums, I would suggest that owners of the nMP be very careful when attributing GPU restarts and other display issues to driver issues or Yosemite bugs. I think that hardware issues may be responsible for more than a few of these as evidenced by the fact that many users never experience them and other users, including myself, were plagued by them until having the GPU cards replaced or machines exchanged. It's for this reason that I also recommend that nMP users invest in AppleCare.
     
  8. bax2003 thread starter macrumors 6502a

    bax2003

    Joined:
    Dec 25, 2011
    Location:
    Belgrade, Serbia
    #8
    I do not know how else to interpret the fact that I only have issues in one OS X version. I compared the FirePro extensions by content and they are quite different (10.10.0 vs 10.10.2). Granted, there are cases with faulty GPU(s), but I cannot assume that and be without a workstation for a week or more. I would rather investigate this on my own, software-wise of course, and in the process get some work done.
     
  9. IowaLynn macrumors 6502a

    IowaLynn

    Joined:
    Feb 22, 2015
    #9
    Some run their system under Windows, where they can stress the hardware even further, with no throttling, and where both GPUs are used and used differently.

    In some cases Apple appears to be using what sound like newer GPUs, but they should be able to find out from the hardware identifiers which ones are in your model, and whether yours is one of them.

    Maybe a recall of some GPUs is in order??
     
  10. bax2003 thread starter macrumors 6502a

    bax2003

    Joined:
    Dec 25, 2011
    Location:
    Belgrade, Serbia
    #10
    Well, this seems to be a hardware issue after all, because it happened in 10.10.0 as well....just a few minutes ago in VirtualBox.

    Kernel_2015-03-30-032655_Mac-Pro.gpuRestart

    Mon Mar 30 03:26:55 2015

    Event: GPU Reset
    Data/Time: Mon Mar 30 03:26:55 2015
    Application:
    Path:
    OS Version: Mac OS X Version 10.10 (Build 14A389)
    Graphics Hardware: AMD FirePro D300...
    Signature: 0

    Report Data:

    GPURestartReportStart
    ------------------------
    [00] AccelChannel: GFX
    Currently pending command from UnknownCtx
    PendingCommandTimestamp: 0x01a65fc9, TotalDWords: 0x000001a7, GART Offset=0x00000000800b3800, stamp_idx=0, estamp=0x01a65fc9
    PendingCommandStart:
    PendingCommandEnd
    ------------------------
    [00] GFXHWChannel: Enabled: Idle
    IndirectCommandSize: 0x00000040, LastReadTimestamp: 0x01a65fc8, NextSubmitTimestamp: 0x01a66028
    ------------------------
    [00] HWRing: Enabled
    RingSizeInDwords: 0x4000, FreeSpace: 0x283f, Head: 0x00003300, LastSubmitPosition: 0x00000ac0, Tail: 0x00000ac0
    RB[0]_RPTR: 0x00003300, RB[0]_WPTR: 0x00003300
    HWRingDumpStart:
    0x0000a2a4 0x00000016 0x00002010 0x00020000 0xc0034300 0x8ec00000 0xffffffff 0x00000000
    0x00000010 0xc0053c00 0x00000003 0x0000217f 0x00000000 0x00000000 0x80000000 0x0000000a
    0x80000000 0x80000000 0x80000000 0x80000000 0x80000000 0x80000000 0x80000000 0x80000000
    0x80000000 0x80000000 0x80000000 0x80000000 0x80000000 0x80000000 0x80000000 0xc0023200
    0x801df800 0x00000000 0x000001a0 0x80000000 0x80000000 0x80000000 0x80000000 0x80000000
    0x80000000 0x80000000 0x80000000 0x80000000 0x80000000 0x80000000 0x80000000 0xc0044700
    0x00000514 0x80000000 0x22000000 0x01a65ffd 0x00000000 0xc0013900 0x800434a0 0xc0100000
    0x80000000 0x80000000 0x80000000 0x80000000 0x80000000 0x80000000 0x80000000 0x80000000
    0x0000a2a4 0x00000016 0x00002010 0x00020000 0xc0034300 0x8ec00000 0xffffffff 0x00000000
    0x00000010 0xc0053c00 0x00000003 0x0000217f 0x00000000 0x00000000 0x80000000 0x0000000a
    0x80000000 0x80000000 0x80000000 0x80000000 0x80000000 0.......and so on....
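
    To compare several of these reports side by side, the header fields and the hex values can be pulled out with a short script. This is only a sketch that assumes the layout shown in the report above. The function names are made up, and it does not try to interpret what the ring registers mean.

```python
import re

# "Key: value" header lines such as "Event: GPU Reset". Keys made only of
# letters, spaces and "/" are accepted, which skips the bare timestamp line.
HEADER_LINE = re.compile(r"^([A-Za-z/ ]+):\s*(.*)$")
# "Name: 0x..." pairs such as "Head: 0x00003300" or "RB[0]_RPTR: 0x00003300".
HEX_FIELD = re.compile(r"([A-Za-z_\[\]0-9]+): (0x[0-9a-fA-F]+)")


def parse_header(report):
    """Collect header fields that appear before the 'Report Data:' marker."""
    fields = {}
    for raw in report.splitlines():
        line = raw.strip()
        if line.startswith("Report Data"):
            break
        match = HEADER_LINE.match(line)
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


def parse_hex_fields(line):
    """Turn 'Name: 0x...' pairs on one line into a dict of integers."""
    return {name: int(value, 16) for name, value in HEX_FIELD.findall(line)}


if __name__ == "__main__":
    sample = (
        "Event: GPU Reset\n"
        "OS Version: Mac OS X Version 10.10 (Build 14A389)\n"
        "Graphics Hardware: AMD FirePro D300...\n"
        "Report Data:\n"
    )
    print(parse_header(sample))
    ring = ("RingSizeInDwords: 0x4000, FreeSpace: 0x283f, Head: 0x00003300, "
            "LastSubmitPosition: 0x00000ac0, Tail: 0x00000ac0")
    print(parse_hex_fields(ring))
```

    Having the OS version and graphics hardware of each report in one table makes it easier to see whether the resets really cluster on one OS X build.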
     
  11. Infrared, Mar 30, 2015
    Last edited: Mar 30, 2015

    Infrared macrumors 68000

    Infrared

    Joined:
    Mar 28, 2007
    #11
    I would be cautious about drawing conclusions as to whether this is hardware related or not. Someone from Apple *might* be able to infer something from that log entry, so it might be worth forwarding it to them.

    By the way: do you sleep your Mac at all?
     
  12. Infrared, Mar 30, 2015
    Last edited: Apr 2, 2015

    Infrared macrumors 68000

    Infrared

    Joined:
    Mar 28, 2007
    #12
    I have been running some GPU memory tests using two programs which I compiled from source:

    ocl_memtest from CUDA GPU Memtest:

    http://sourceforge.net/projects/cudagpumemtest/

    memtestCL:

    https://folding.stanford.edu/home/download-utilities/

    I am attaching a zip archive with the command line executables in it:

    BUGGY PROGRAM FILE REMOVED

    Feel free to use those if you wish. Or if you prefer to compile it yourself, you can obtain the source code from the pages linked to above.

    When I ran ocl_memtest, it found a lot of errors for both cards. I am wondering if that is actually a bug in the memory testing program, which would be unfortunate. But if it so happens that other people do not find errors running that program, it is more likely to be a genuine GPU h/w fault.

    Running the second program, memtestCL, I have found no errors so far. To run that program at the command line, you can type the following:

    ./memtestCL 2048

    "2048" is the amount of memory to test, in MiB. When the program starts up, it will offer you a choice of device to test. You can test either GPU or the CPU. The amount of memory indicated above is chosen to match the amount of memory that each D300 has. If you have another card with a different amount of memory, or you wish to test your normal non-GPU RAM, please enter a different number.

    I would be interested in any results people have from running these tests. In particular, I would like to know if the ocl_memtest program is indeed buggy and is reporting false positives (a positive being a positive diagnosis of an error - which some might say isn't exactly the most positive happy news one could receive!).

    Cheers.

    EDIT: Oh dear. I have now found some errors running memtestCL. I don't know what to make of this.
     

  13. bax2003 thread starter macrumors 6502a

    bax2003

    Joined:
    Dec 25, 2011
    Location:
    Belgrade, Serbia
    #13
    @Infrared

    Second test results
    Test iteration 1 on 2048 MiB of memory on device 0 (AMD Radeon HD - FirePro D300 Compute Engine): 0 errors so far
    Moving Inversions (ones and zeros): 0 errors (454 ms)
    Moving Inversions (random): 0 errors (681 ms)
    Memtest86 Walking 8-bit: 0 errors (5109 ms)
    True Walking zeros (8-bit): 0 errors (4639 ms)
    True Walking ones (8-bit): 0 errors (4474 ms)
    Memtest86 Walking zeros (32-bit): 0 errors (18281 ms)
    Memtest86 Walking ones (32-bit): 0 errors (18586 ms)
    Random blocks: 5898 errors (866 ms)
    Memtest86 Modulo-20: 0 errors (15414 ms)
    Logic (one iteration): 0 errors (595 ms)
    Logic (4 iterations): 0 errors (594 ms)
    Logic (local memory, one iteration): 0 errors (602 ms)
    Logic (local memory, 4 iterations): 0 errors (641 ms)

    Every time I run it, there are errors on the "Random blocks" test. No errors on the second FirePro D300 or the CPU.
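
    If anyone wants to compare results across several runs, the per-test lines can be tallied with a few lines of Python. This is a rough sketch that assumes the output format pasted above ("name: N errors (T ms)"). The function names are made up for illustration.

```python
import re

# Matches per-test lines such as "Random blocks: 5898 errors (866 ms)".
# The "Test iteration ... 0 errors so far" summary line has no "(N ms)"
# suffix, so it is skipped.
RESULT_LINE = re.compile(r"^\s*(.+?):\s+(\d+) errors \((\d+) ms\)\s*$")


def parse_results(output):
    """Map each test name to a tuple of (error count, milliseconds)."""
    results = {}
    for line in output.splitlines():
        match = RESULT_LINE.match(line)
        if match:
            results[match.group(1)] = (int(match.group(2)), int(match.group(3)))
    return results


def failing_tests(results):
    """Names of the tests that reported at least one error."""
    return [name for name, (errors, _ms) in results.items() if errors > 0]


if __name__ == "__main__":
    sample = (
        "Moving Inversions (random): 0 errors (681 ms)\n"
        "Random blocks: 5898 errors (866 ms)\n"
        "Logic (local memory, one iteration): 0 errors (602 ms)\n"
    )
    print(failing_tests(parse_results(sample)))
```

    Running this over saved output from each GPU would show at a glance whether the errors are always confined to the same test.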
     
  14. Infrared macrumors 68000

    Infrared

    Joined:
    Mar 28, 2007
    #14
    @bax

    Thanks for trying that. That is the same random blocks test error I found. Note: the program found errors for both GPUs, but I found that not every test iteration reported errors. On the first run I didn't see any errors.

    There are a number of possibilities here. It could be a bug in the program. It could be a bug or quirk in the operating system, including Apple's OpenCL framework. Or it could be a hardware issue.

    If I am getting errors for both gfx cards, maybe it's not an issue with the cards themselves, but with something that connects to them both? It's hard to imagine that both cards could be independently faulty, unless, I guess, there is some batch issue. Maybe I'll look at the card serial numbers next time I boot up the machine.

    I might try using an older OpenCL framework, say from Mavericks, to see if errors are still reported.

    If anyone else would like to try running the GPU memory tests, that could be helpful. In particular, if the test reports no errors for known good systems - ones not susceptible to locking up - then it starts to look like the test results might be meaningful.

    Cheers.
     
  15. Infrared, Mar 30, 2015
    Last edited: Mar 30, 2015

    Infrared macrumors 68000

    Infrared

    Joined:
    Mar 28, 2007
    #15
    From the author of memtestCL:

    " One thing we learned is that the Random Blocks test may not be entirely reliable on ATI (may show false positives). I'm in communication with our folks on the inside, and I'll let you know if I get any solid resolution on it."

    https://foldingforum.org/viewtopic.php?f=51&t=16119&start=30

    Sigh. Does nothing ever actually work properly in this world? :)

    It looks like there's a newer version of the source code here:

    https://github.com/ihaque/memtestCL

    Next time I'm on the desktop I will download and compile it. I didn't realize the one on f@h was out of date. Sorry about that.
     
  16. bax2003 thread starter macrumors 6502a

    bax2003

    Joined:
    Dec 25, 2011
    Location:
    Belgrade, Serbia
    #16
    Great! I am glad that you replied to this topic. It's good to share experience with someone who is willing to help and share whatever he knows about a common problem. This is much appreciated!
     
  17. Infrared macrumors 68000

    Infrared

    Joined:
    Mar 28, 2007
    #17
    @bax

    Could I ask a couple of questions please?

    1. What type of monitor/connector are you using?

    2. Do you ever put the machine to sleep?

    Thanks.
     
  18. bax2003 thread starter macrumors 6502a

    bax2003

    Joined:
    Dec 25, 2011
    Location:
    Belgrade, Serbia
    #18
    I use two 27" Apple Thunderbolt Displays (listed in my signature), and I do not put the Mac Pro to sleep; only the displays go to sleep. I checked the option in Power Prefs: Prevent computer from sleeping automatically when the display is off.
     
  19. voyager77 macrumors newbie

    Joined:
    Jun 25, 2012
    #19
    I have the same problems.
    Behaviour ranges from a screen freeze, where I can still move the mouse around but nothing else, to a total system freeze and restart.
    On some of these restarts I have seen a red light coming from the bottom of the Mac.
    Sometimes I get a GPU restart report in Console, other times there is nothing. (I assume the system is so frozen it can't write a report anymore.)
    I've been in contact with Apple several times already: reset NVRAM, cleared caches, etc.
    I have reinstalled OS X twice, and the problem is still there.
     
  20. bax2003 thread starter macrumors 6502a

    bax2003

    Joined:
    Dec 25, 2011
    Location:
    Belgrade, Serbia
    #20
    Ok, we have the same problem. I rushed a bit to the hardware problem conclusion (post #10), because all this time, since March 23rd, I have had Mavericks 10.9.5 installed on a USB 3.0 external drive, which I booted a number of times without a single crash. For the last two days I have been using 10.9.5 on the Mac Pro's SSD, and it keeps working fine.

    So, if you have time for diagnostics, back up your personal stuff, or just install 10.9.5 on a partition of the Mac Pro's PCIe SSD and test it.
     
  21. voyager77 macrumors newbie

    Joined:
    Jun 25, 2012
    #21
    I don't have a copy of Mavericks (maybe I can download it from the store, but I have no idea if that's 10.9.5).
    But even if Mavericks would work, it's not an option. Many programs have been upgraded to Yosemite, including their libraries, and there is no way back.
     
  22. bax2003 thread starter macrumors 6502a

    bax2003

    Joined:
    Dec 25, 2011
    Location:
    Belgrade, Serbia
    #22
    I am not sure that the latest Mavericks is in the App Store, but even if it is 10.9.3 / 10.9.4, an update will fix that.

    I must find out whether this is a hardware or a software issue. If the Mac Pro works fine in Mavericks, especially in apps that use both GPUs (DaVinci Resolve, Adobe Media Encoder...)...that is all I need for now. I need a dependable machine.

    Which of the apps you need require Yosemite?
     
  23. voyager77 macrumors newbie

    Joined:
    Jun 25, 2012
    #23
    iMovie and Aperture are not backwards compatible with the Mavericks versions. I found that out when I needed to transfer a project to my laptop.

    And I too need a machine that works perfectly; that's why I plunked down some good cash to get it.

    On my latest call to Apple service, they suggested taking it in for service, which means being without the Mac Pro for a long time.
     
  24. bax2003 thread starter macrumors 6502a

    bax2003

    Joined:
    Dec 25, 2011
    Location:
    Belgrade, Serbia
    #24
    I am sorry to hear (read) that. Let us know what happens at the Apple service center.
     
  25. Infrared, Apr 2, 2015
    Last edited: Apr 2, 2015

    Infrared macrumors 68000

    Infrared

    Joined:
    Mar 28, 2007
    #25
    Ok, at last back at the machine. I have compiled a newer version of memtestCL and here it is:

    View attachment memtestCL.zip

    Again, at the Terminal command line, one would type the following for the D300 equipped machines:

    ./memtestCL 2048

    This version is reporting no errors for my GPUs so far. The code for this newer version was grabbed from here:

    https://github.com/ihaque/memtestCL

    Note: if you test the graphics card used for your monitor, you may notice some interface stuttering.

    Note 2: I don't know if you have tried the Apple Hardware Test. If not, maybe worth a try?

    https://support.apple.com/kb/PH18765?locale=en_US
     
