Link Speed impact on CUDA processing

Discussion in 'Mac Pro' started by sunsetsothickly, Sep 4, 2013.

  1. sunsetsothickly macrumors member

    May 21, 2005
    Hi guys,

    I have been thinking about running a 780 headless, purely to speed up processing for a couple of CUDA-enabled filters that I use regularly.

    I am limited as to whom I can purchase hardware from, so buying a flashed 780 from MVC, for example, isn't an option.

    One of my two most frequently used filters only runs in Windows. I run it in Bootcamp using my Mac edition 680 (thus 5 GT/s link speed) as the CUDA device.

    If I were to run a headless 780 as a second card (I am aware of the power issues), how much is the 2.5 GT/s link speed going to impact CUDA compute in Bootcamp? Am I right in thinking that, because people run multiple cards in an expansion chassis connected by a single x16 HBA, this throughput is less important in GPU computing situations?

    Thanks for your help guys.
  2. xcodeSyn macrumors 6502a

    Nov 25, 2012
    Running GPGPU work at a 2.5 GT/s link speed would probably be somewhat slower than at 5.0 GT/s, but by how much is hard to tell without actual samples. This post by MacVidCards may provide a little more insight into your question. He ran a CUDA rendering test on the Titan in an x16 (PCIe 2.0) slot in 245 seconds, and in an x4 (PCIe 1.1) slot, which is equivalent to x2 (PCIe 2.0), in 285 seconds. So going from x16 to x2, the render took about 16% longer (285 / 245 ≈ 1.16). Since x16 (PCIe 1.1) is equivalent to x8 (PCIe 2.0), a linear approximation gives a slowdown of about 9%: 16% * (16 - 8) / (16 - 2). This may not be close to reality at all, but it gives a rough idea of the possible performance drop in a given CUDA test. I am not even sure the performance drop would be linear. Maybe you can ask MVC to run some tests relevant to your specific questions when he comes back from his time-out.
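
    For anyone who wants to redo the estimate above with their own timings, here is a rough Python sketch. It assumes the usual per-lane figures after 8b/10b encoding (about 250 MB/s for PCIe 1.x at 2.5 GT/s and 500 MB/s for PCIe 2.0 at 5 GT/s) and, like the estimate itself, assumes the slowdown scales linearly with lane count, which may well not hold in practice. The function names are made up for illustration; only the 245 s and 285 s timings come from the MacVidCards test quoted above.

```python
# Approximate usable bandwidth per lane in MB/s (after 8b/10b encoding).
# PCIe 1.x runs at 2.5 GT/s per lane, PCIe 2.0 at 5.0 GT/s per lane.
LANE_MB_S = {"1.1": 250, "2.0": 500}


def equivalent_gen2_lanes(gen, lanes):
    """Express a link as the number of PCIe 2.0 lanes with the same bandwidth."""
    return LANE_MB_S[gen] * lanes / LANE_MB_S["2.0"]


def linear_slowdown(target_lanes, fast_lanes, fast_secs, slow_lanes, slow_secs):
    """Interpolate the slowdown at target_lanes between two measured points.

    ASSUMPTION: slowdown is linear in lane count. This is only a guess.
    """
    measured = (slow_secs - fast_secs) / fast_secs  # slowdown over the full span
    return measured * (fast_lanes - target_lanes) / (fast_lanes - slow_lanes)


# Measured points from the quoted test: x16 PCIe 2.0 and x4 PCIe 1.1.
slow_lanes = equivalent_gen2_lanes("1.1", 4)     # same bandwidth as x2 PCIe 2.0
target_lanes = equivalent_gen2_lanes("1.1", 16)  # same bandwidth as x8 PCIe 2.0

estimate = linear_slowdown(target_lanes, 16, 245, slow_lanes, 285)
print(f"x16 at 2.5 GT/s is roughly x{target_lanes:.0f} at 5 GT/s")
print(f"Estimated slowdown: {estimate:.1%}")
```

    Swapping in timings from your own filter would give a more relevant number than the render test, since how much a job depends on the bus varies a lot with how often it moves data to and from the card.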
  3. sunsetsothickly thread starter macrumors member

    May 21, 2005
    Thanks xcodeSyn - if it stands up to testing, that 9% performance drop is something I could live with. I'd just want to be sure that, after everything is said and done, a 780 running at 2.5 GT/s would give me as much performance in CUDA filters as my 'Mac edition' 680 running at 5 GT/s, so that I am not losing out in that half of the workflow. If MVC is watching and has the answer, give us a sign!
