Link Speed impact on CUDA processing

sunsetsothickly · Sep 4, 2013

Hi guys,

I have been thinking about a 780 running headless, purely for speeding up processing of a couple of CUDA enabled filters that I use regularly.

I am limited as to whom I can purchase hardware from, so buying a flashed 780 from MVC, for example, isn't an option.

One of my two most frequently used filters only runs in Windows. I run it in Bootcamp using my Mac version 680 (thus 5GT/s Link Speed) as the CUDA device.

If I were to run a headless 780 as a second card (I am aware of the power issues), how much is the 2.5GT/s Link Speed going to impact CUDA computes in Bootcamp? Am I right in thinking that, because people run multiple cards in an expansion chassis connected by a single x16 HBA, this throughput is less important in GPU computing situations?

Thanks for your help guys.

xcodeSyn · Sep 4, 2013

sunsetsothickly said:
If I were to run a headless 780 as a second card (I am aware of the power issues), how much is the 2.5GT/s Link Speed going to impact CUDA computes in Bootcamp? Am I right in thinking that, because people run multiple cards in an expansion chassis connected by a single x16 HBA, this throughput is less important in GPU computing situations?

Running GPGPU work at a 2.5GT/s link speed would probably be a bit slower than at 5.0GT/s, but by how much is hard to tell without actual samples. This post by MacVidCards may provide a little more insight into your question. He ran the CUDA rendering test on the Titan in an x16 (PCIe 2.0) slot in 245 seconds, and in an x4 (PCIe 1.1) slot, which is equivalent to x2 (PCIe 2.0), in 285 seconds. So from x16 to x2, the render took around 16% longer. Since x16 (PCIe 1.1) is equivalent to x8 (PCIe 2.0), a linear approximation gives about a 9% performance drop: 16% * (16-8) / (16-2). This may not be close to reality at all, but it gives you a rough idea of the possible performance drop in a given CUDA test. I am not even sure the performance drop would be linear. Maybe you can ask MVC to run some tests relevant to your specific questions when he comes back from his time-out.
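
For anyone who wants to redo the arithmetic with their own timings, here is a rough Python sketch of the same estimate. The function names are made up for illustration, the two timings are the ones quoted above, and the straight-line interpolation between x16 and x2 is purely an assumption, not something the test demonstrates.

```python
# Timings quoted above from the Titan CUDA rendering test (seconds).
T_X16_GEN2 = 245.0  # x16 slot at PCIe 2.0 (5.0 GT/s)
T_X2_GEN2 = 285.0   # x4 slot at PCIe 1.1, treated as x2 at PCIe 2.0


def equivalent_gen2_lanes(lanes, gts):
    """Express a link as PCIe 2.0 (5.0 GT/s) lanes.

    PCIe 1.1 and 2.0 both use 8b/10b encoding, so bandwidth scales
    directly with the per-lane transfer rate.
    """
    return lanes * gts / 5.0


def measured_slowdown(t_fast, t_slow):
    """Fractional increase in run time going from the fast to the slow link."""
    return (t_slow - t_fast) / t_fast


def interpolated_slowdown(lanes, slowdown_at_low, full=16.0, low=2.0):
    """ASSUMPTION: slowdown grows linearly as lanes fall from `full` to `low`."""
    return slowdown_at_low * (full - lanes) / (full - low)


lanes = equivalent_gen2_lanes(16, 2.5)            # x16 at 2.5 GT/s
slowdown_x2 = measured_slowdown(T_X16_GEN2, T_X2_GEN2)
estimate = interpolated_slowdown(lanes, slowdown_x2)

print(f"x16 @ 2.5GT/s is like x{lanes:g} @ 5.0GT/s")
print(f"measured slowdown x16 -> x2: {slowdown_x2:.1%}")
print(f"estimated slowdown x16 -> x{lanes:g}: {estimate:.1%}")
```

With the quoted numbers this lands at roughly 16% measured and roughly 9% estimated. A filter that shuffles a lot of data between host and GPU could well behave differently from this one render test.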

sunsetsothickly · Sep 4, 2013

xcodeSyn said:
Since x16 (PCIe 1.1) is equivalent to x8 (PCIe 2.0), a linear approximation gives about a 9% performance drop: 16% * (16-8) / (16-2). This may not be close to reality at all, but it gives you a rough idea of the possible performance drop in a given CUDA test. I am not even sure the performance drop would be linear. Maybe you can ask MVC to run some tests relevant to your specific questions when he comes back from his time-out.

Thanks xcodeSyn - if it stands up to testing, that 9% performance drop is something I could live with. I'd just want to be sure that after everything is said and done, a 780 running at 2.5GT/s would give me as much CUDA compute filter performance as a 680 running at 5GT/s (with the 'mac edition'), so that I am not losing out in that half of the workflow. If MVC is watching and has the answer, give us a sign!
