High performance rendering with 3 x GTX 980 Ti connected to a Classic Mac Pro

Discussion in 'Mac Pro' started by Machines, Feb 23, 2016.

  1. Machines macrumors 6502

    Machines

    Joined:
    Jan 23, 2015
    Location:
    Fox River Valley , Illinois
    #1

    A word of caution before anyone tries this at home: this project is not a retail product, and none of the device connections are hot-swappable. If the devices are not attached properly, or come loose during operation, damage could occur to your Mac, the attached devices or both. Also, booster power to the GPU array must be active before the host computer is started up. Try this project at your own risk and enjoy the results when you do.


    About a year ago, my flagship annual project involved installing and internally powering dual GTX 970 Maxwell graphics cards for rendering purposes in a Nehalem Mac Pro. The benchmark results were respectable, especially with CUDA-optimized applications.

    This year's project involves connecting three even higher-end Maxwell GPUs to another Nehalem Mac Pro, but they required external mounting for various reasons. The results are nothing less than stellar, easily surpassing last year's benchmarks and reinvigorating the usefulness of these older Macs. This is especially true since the material cost of the entire project does not exceed the cost of an empty Cubix expansion chassis (before any graphics cards are purchased).



    This project involves connecting three externally mounted non-EFI GTX 980 Ti video cards, via heavily shielded PCIe cables, to the host Mac Pro workstation's PCIe slots. Each card has its own discrete cable and no splitting is used. An external power supply unit with six eight-pin VGA power connectors was necessary to provide booster power to each video card's two eight-pin connectors. An open-air frame / test bench (without a motherboard or any other component installed) was used to secure and properly position the GPUs. No additional fans were added to cool the GPUs at load, although adding them might prove useful. It doesn't take long for the cards' internal fans to spin up under load. Tests so far were brief, lasting less than 10 minutes each.


    Render tests were performed under OS X 10.10.5 Yosemite and 10.11.3 El Capitan. Results were basically identical.


    Results:

    LuxMark v2.1 Sala (OpenCL) = 10,798
    Octane "Trench" Render Target PT (CUDA) = 24 seconds
    Blender BMW Blenchmark scene (CUDA) = 29.36 seconds



    System configuration:

    Mac Pro 4,1 flashed to 5,1 firmware (factory 2009 model)
    Factory internal 980 W PSU
    2.8 GHz six-core X5660 CPU
    16 GB 1333 MHz memory (4 x 4 GB)
    250 GB SSD (Samsung 850 EVO) or 1 TB HDD (WD Blue)
    DVD-RW drive
    OS X 10.10.5 or 10.11.3



    3 x EVGA GTX 980 Ti FTW video cards (@ $630 each)
    3 x 3M Twin Axial PCIe x16 500 mm extension cables (@ $100 each)
    External 1000 W EVGA PSU (@ $160)
    DimasTech Test Bench Frame ($135)

    Total cost of array (not including host computer) is $2,485, which is less than the cheapest brand-new Cubix chassis without any cards.
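
As a quick check of the parts list above, the array total can be tallied in a few lines of Python. This is only an illustrative sketch: the prices are the ones quoted in this post, and the names `parts` and `array_total` are made up for the example.

```python
# Parts list as quoted in the post: (description, quantity, unit price in USD).
parts = [
    ("EVGA GTX 980 Ti FTW video card", 3, 630),
    ("3M Twin Axial PCIe x16 500 mm extension cable", 3, 100),
    ("EVGA 1000 W external PSU", 1, 160),
    ("DimasTech test bench frame", 1, 135),
]


def array_total(items):
    """Sum quantity * unit price over all line items."""
    return sum(qty * price for _, qty, price in items)


total = array_total(parts)
for name, qty, price in parts:
    print(f"{qty} x {name}: ${qty * price:,}")
print(f"Total: ${total:,}")
```

The total excludes the host Mac Pro, matching how the figure is stated in the post.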





    GTX 980 Ti cards were installed in the following slots :


    Slot 1 (8 lanes electrical)


    Slot 2 (16 lanes)


    Slot 3 (4 lanes)


    Optional GT 120 EFI UI card was occasionally installed in Slot 4 (4 lanes) .


    An attempt was made to install a GTX 980 Ti card in each of the Nehalem Mac Pro's four PCIe slots via these cables. This attempt failed, as the Mac refused to complete its POST. It simply looped, repeatedly trying to pass POST, and did not crash, freeze or restart. There appears to be a firmware lock that limits these Macs to three Maxwell graphics cards, which would also explain why certain external expansion chassis have issues recognizing additional installed graphics cards. It's not a chassis limitation, per se. It's a host computer limitation.

    I have not performed a burn-in stress test yet, as I only today found a utility (LuxMark v3) that recognizes multiple GPUs. Valley, Heaven and FurMark cannot stress test an array of cards in Mac OS X. I will provide thermal and stability data as I collect it. It might be necessary to provide additional cooling when the array is under load for an extended period.


    Some additional last-minute notes: rendering cards appear not to require more than four PCIe Rev 2 lanes each (electrical), so the additional lanes provided by some of the slots are not necessary. This might prove useful with slot splitter cards in future configurations.
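
For context on the lane counts, here is a rough sketch of the theoretical per-direction bandwidth of a PCIe 2.0 link at the widths listed for slots 1 to 3. It assumes the standard PCIe 2.0 figures (5 GT/s per lane, 8b/10b encoding); the function name `pcie2_bandwidth_gb_s` is hypothetical, and real-world throughput will be lower because of protocol overhead.

```python
# PCIe 2.0 signals at 5 GT/s per lane and uses 8b/10b encoding,
# so 8 of every 10 transferred bits carry payload.
TRANSFERS_PER_S = 5.0e9
ENCODING_EFFICIENCY = 8 / 10


def pcie2_bandwidth_gb_s(lanes):
    """Theoretical one-direction bandwidth in GB/s for a PCIe 2.0 link."""
    payload_bits_per_s = TRANSFERS_PER_S * ENCODING_EFFICIENCY * lanes
    return payload_bits_per_s / 8 / 1e9


# Electrical lane counts as listed in the post for the three GPU slots.
slot_lanes = {"Slot 1": 8, "Slot 2": 16, "Slot 3": 4}

for slot, lanes in slot_lanes.items():
    print(f"{slot}: x{lanes} -> {pcie2_bandwidth_gb_s(lanes):.1f} GB/s")
```

If a renderer uploads the scene once and then computes on the card, the narrower link mostly affects load time rather than render time, which would be consistent with the observation above. That explanation is an assumption, not something measured in this thread.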


    Questions are welcome .


    I am Creation Machines .


    Attached images: P2224505.JPG, P2224506.JPG, sala score JPEG.jpg, octane trench score JPEG.jpg, blender bmw score JPEG.jpg, system profile GPU JPEG.jpg

     
  2. AidenShaw, Feb 23, 2016
    Last edited: Feb 23, 2016

    AidenShaw macrumors P6

    AidenShaw

    Joined:
    Feb 8, 2003
    Location:
    The Peninsula
    #2
    Doesn't it make you a bit sad that you could do all that inside a Z-series using the factory power supply - and have more RAM and much faster processors?

    Meanwhile, I'm awaiting delivery of three systems each with 72 cores/144 threads, 1 TiB of RAM (up to 6 TiB max if I need more) and five Titan-X per system. Each has five 1.6 TB NVMe SSD drives (8 TB of NVMe per system) and 5 TB of SAS drives.

    Apple just doesn't care anymore.
     
  3. Machines thread starter macrumors 6502

    Machines

    Joined:
    Jan 23, 2015
    Location:
    Fox River Valley , Illinois
    #3
    I have an HP Z800 in service as my personal Windows 7 gaming machine. It's an awesome rig, and I might connect this GPU array to it and run a few tests.

    If you need some advice on how to rebuild this series of workstations, shoot me a message. Mine desperately needed its IOH chipset re-thermal-pasted when it first arrived here, or it would have died by now, the way I push my gear.

    But, anyways, commercial technicians are not permitted to load OS X onto a PC, the last time I checked. So, it is a non-issue.

    Tests so far pushed 800 W peak, system-wide, with the Mac Pro (factory internal + auxiliary external PSUs).
     
  4. DearthnVader macrumors regular

    DearthnVader

    Joined:
    Dec 17, 2015
    Location:
    Red Springs, NC
    #4
    That's a lot of GFX power.

    Too bad there's no SLI in OS X.
     
  5. Synchro3 macrumors 65816

    Synchro3

    Joined:
    Jan 12, 2014
    #5
    Where did you buy the heavily shielded PCIe cables?
     
  6. Machines thread starter macrumors 6502

    Machines

    Joined:
    Jan 23, 2015
    Location:
    Fox River Valley , Illinois
    #6
    Digi-Key.

    http://www.digikey.com/product-search/en?mpart=8KC3-0726-0500&v=19

    You might not need the x16 version. An x4 cable may actually work as well as the x16 for this project. You definitely need the longest cables available, as they are a bit too short as it is...

    These extension cables are stiff and hard to work with, and you'll think you are breaking them during installation. They will get a little roughed up from rubbing against the Mac's PCIe slot area case shielding (thin, sharp metal edges). But so far, they have held up for me.

    The shielded cables are used to prevent signal cross-talk and resist EMI. This improves performance.
     
  7. Synchro3, Feb 24, 2016
    Last edited: Feb 24, 2016

    Synchro3 macrumors 65816

    Synchro3

    Joined:
    Jan 12, 2014
    #7
  8. AidenShaw macrumors P6

    AidenShaw

    Joined:
    Feb 8, 2003
    Location:
    The Peninsula
    #9
  9. Machines, Feb 25, 2016
    Last edited: Feb 25, 2016

    Machines thread starter macrumors 6502

    Machines

    Joined:
    Jan 23, 2015
    Location:
    Fox River Valley , Illinois
    #10
    The highest-end shipping BTO option for GPU rendering on the V3 HP Z840 models would be 2 x Quadro M6000, for a total of 6144 CUDA cores. Retail system price is 12 grand minimum.

    My GPU array configuration has a total of 8448 CUDA cores. Array cost alone is below 2500 bucks.
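
The core counts above can be reproduced from the per-card figures. A minimal sketch, assuming NVIDIA's published counts of 2816 CUDA cores per GTX 980 Ti and 3072 per Quadro M6000; the names are illustrative, and the cost-per-core figure covers the array only (not the host Mac).

```python
# Published CUDA core counts per card (assumed from NVIDIA's spec sheets).
CORES_PER_CARD = {"GTX 980 Ti": 2816, "Quadro M6000": 3072}

# Array cost quoted earlier in the thread, excluding the host computer.
ARRAY_COST_USD = 2485


def total_cores(card, count):
    """Total CUDA cores for `count` identical cards."""
    return CORES_PER_CARD[card] * count


array_cores = total_cores("GTX 980 Ti", 3)
z840_cores = total_cores("Quadro M6000", 2)

print(f"3 x GTX 980 Ti:   {array_cores} CUDA cores")
print(f"2 x Quadro M6000: {z840_cores} CUDA cores")
print(f"Array cost per CUDA core: ${ARRAY_COST_USD / array_cores:.2f}")
```

Raw core count is only a rough proxy for rendering throughput, since clock speeds and memory differ between the two cards.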

    And yes, I know there is a difference between workstation and consumer nVidia cards.

    But in the Mac community, creatives often use the consumer versions for rendering.

    My array just passed the 19-hour mark in its burn-in, running the LuxMark 3 stress test option. All three GPUs are at load with no issues observed. A nice, stable and powerful array.
     
  10. AidenShaw macrumors P6

    AidenShaw

    Joined:
    Feb 8, 2003
    Location:
    The Peninsula
    #11
    :D Retail price? Who pays that? For HP?

    We use consumer cards as well (machine learning and AI). Don't need FP64, and ECC isn't needed. GTX980Ti is sweet, although I've ordered three systems with five Titan-X in each.
     
  11. Machines, Feb 25, 2016
    Last edited: Feb 25, 2016

    Machines thread starter macrumors 6502

    Machines

    Joined:
    Jan 23, 2015
    Location:
    Fox River Valley , Illinois
    #12
    The array just successfully completed its 24-hour burn-in without incident, using the LuxBall stress test function of LuxMark 3.x. All three cards performed admirably and benchmarked properly immediately after the completion of the stress test (when still hottest). GPU heatsink thermals were 53 C (slot 3), 82 C (slot 2, the center card of the array) and 58 C (slot 1). I decided to go with active cooling over the array, but the small USB fan didn't have sufficient airflow. System power consumption (internal factory PSU plus external GPU PSU) was 725 W.

    24 hour burn in luxmark 3 stress test 3 x 980 Ti Jpeg.jpg
     
  12. shaunp macrumors 65816

    Joined:
    Nov 5, 2010
    #13
    Interesting project you have going there mate, nice one
     
  13. Machines, Feb 26, 2016
    Last edited: Feb 26, 2016

    Machines thread starter macrumors 6502

    Machines

    Joined:
    Jan 23, 2015
    Location:
    Fox River Valley , Illinois
    #14
    It appears the three GTX 980 Ti array has achieved three record Mac OS X scores, compared to the best Barefeats (BF) test results:

    LuxMark v2.1 Sala OpenCL rendering score (higher is better):

    3 x GTX 980 Ti = 10,798
    BF's Mac Pro Nehalem with dual GTX 980 = 4,960
    BF's Mac Pro Nehalem with dual R9 290x = 5,388
    BF's Mac Pro Cylinder with dual D700 = 3,771



    Blender BMW picture render (CUDA or OpenCL), seconds to completion:

    3 x GTX 980 Ti = 29.36
    BF's Mac Pro Nehalem with dual GTX 980 = 45
    BF's Mac Pro Nehalem with dual AMD 7950 = 124
    BF's Mac Pro Cylinder with dual D700 = 87



    Octane "Trench" RenderTarget PT test (CUDA), seconds to completion:

    3 x GTX 980 Ti = 24
    BF's Mac Pro Nehalem with dual GTX 980 = 51
    BF's Mac Pro Nehalem with 5 x GPU (Cubix) with one GTX 680, dual GTX 580 and dual GTX 770 = 34
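
To put the numbers above in relative terms, the following sketch computes how many times faster the array is than each Barefeats configuration, using only the figures listed in this post. For the LuxMark score, higher is better, so the ratio is array / other; for render times, lower is better, so it is other / array. The `speedup` helper and the data layout are illustrative.

```python
def speedup(array_value, other_value, higher_is_better):
    """Return how many times faster the array is than the other system."""
    if higher_is_better:
        return array_value / other_value
    return other_value / array_value


# (benchmark name, higher_is_better, array result, {comparison system: result})
benchmarks = [
    ("LuxMark v2.1 Sala (score)", True, 10798, {
        "dual GTX 980": 4960,
        "dual R9 290x": 5388,
        "dual D700": 3771,
    }),
    ("Blender BMW (seconds)", False, 29.36, {
        "dual GTX 980": 45,
        "dual AMD 7950": 124,
        "dual D700": 87,
    }),
    ("Octane Trench PT (seconds)", False, 24, {
        "dual GTX 980": 51,
        "5 x GPU (Cubix)": 34,
    }),
]

for name, higher, array_value, others in benchmarks:
    print(name)
    for system, value in others.items():
        print(f"  vs {system}: {speedup(array_value, value, higher):.2f}x")
```

Against the dual GTX 980 configuration, for example, this works out to roughly 2.2x on LuxMark, 1.5x on Blender and 2.1x on Octane.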
     
  14. Earl Urley macrumors regular

    Joined:
    Nov 10, 2014
    #15
    Thought I heard a prolonged scream from the general direction of Hawaii, must be Rob-Art realizing he no longer has the fastest Mac Pro in existence.

    Great job on this mod, and thanks for posting here to tell us.. it can be done!
     
  15. Machines, Feb 26, 2016
    Last edited: Feb 26, 2016

    Machines thread starter macrumors 6502

    Machines

    Joined:
    Jan 23, 2015
    Location:
    Fox River Valley , Illinois
    #16
    I have the utmost respect for Rob-Art of Barefeats and his tireless support for Mac performance reporting. It's not a competition but an attempt to push our hardware to the very limit and to think "outside of the box." It's also about careful documentation, so results can be replicated by a wider audience. And, to a lesser extent, it's a plea to Apple to improve their act with the Mac workstation product line. Apple is losing a lot of business to HP, Dell, Boxx, etc., and creatives are jumping ship to Windows and Linux. Apple forgot its roots, which folk like me still remember like it was yesterday.
     
  16. MacVidCards Suspended

    Joined:
    Nov 17, 2008
    Location:
    Hollywood, CA
  17. AidenShaw macrumors P6

    AidenShaw

    Joined:
    Feb 8, 2003
    Location:
    The Peninsula
    #18
    He certainly was there for a long time (search for "Hawaii" or "Honolulu" at barefeats).
     
  18. Machines thread starter macrumors 6502

    Machines

    Joined:
    Jan 23, 2015
    Location:
    Fox River Valley , Illinois
    #19
    As long as you are here, MVC, do you have any comments on the upper limit on the number of Maxwell GPUs that can be installed concurrently in the classic Mac Pros? I think it's three, but I wonder what it would be if I split one of the host PCIe interfaces. In a moment of wishful thinking I wished for ten GPUs system-wide, silly me.
     
  19. AidenShaw macrumors P6

    AidenShaw

    Joined:
    Feb 8, 2003
    Location:
    The Peninsula
    #20
    If you need ten GPUs, I have a feeling that you should think about whether your needs are beyond what the Apple ecosystem can support.

    If you want advice about Linux or Windows systems that support ten GPUs, just ask. (Although five Titan-X cards per system is as far as I've gone so far.)
     
  20. \-V-/ Suspended

    \-V-/

    Joined:
    May 3, 2012
  21. MH01 macrumors G4

    MH01

    Joined:
    Feb 11, 2008
    #23
    Just wanted to say, that's awesome and thanks for sharing
     
  22. Machines thread starter macrumors 6502

    Machines

    Joined:
    Jan 23, 2015
    Location:
    Fox River Valley , Illinois
    #24
    Thanks. I tried my best to push the project as far as I could, since one of my local clients expressed a desire for even greater rendering performance. I could no longer work my magic entirely internally within the Mac.

    Also, over the years, I was puzzled to hear that Mac Cubix users were reporting severe limits on getting their GPUs recognized, for reasons unknown.

    It now appears to be a host computer firmware-related issue.
     
  23. Chung123 macrumors regular

    Chung123

    Joined:
    Dec 5, 2013
    Location:
    NYC
    #25
