MP 7,1 New Barefeats multi GPU findings

chfilm

macrumors 68030
Original poster
Nov 15, 2012
2,834
1,606
Berlin
Hey,

don't know if you guys saw THIS yet, some very interesting results here.

It appears to me that the infinity fabric basically brings (almost) nothing to the table, if we look at the results with dual GPUs.
Personally I would love to add a 5700 to my VEGA II (mostly because of the newer video encoder and display compression), either when Apple comes out with it at a reasonable price point, or to just buy a third party one. So far I was wondering if a second VEGA II from Apple would be much better because of the Infinity fabric over a third party VII, but it appears the Vega II DUO is even a bit slower than the combo with a third party card. So strange.
Adding a third party VII would be a nice option if the fan/sleep issue weren't a problem...
 

Macinsquatch

macrumors member
Mar 28, 2015
76
21
That is an interesting article. It's hard to tell if this is the result of 3rd party developers not taking advantage of the Metal 2 API for creating Peer Groups introduced in 2017 to use multiple GPUs for compute or if there is something in the OS preventing use of the display GPU to also participate in Compute workloads. It would be nice to see Metal adopt the Heterogeneous Explicit Multiple GPU functionality similar to DirectX 12 and Vulkan 1.1. Given the push for external GPUs in the apple ecosystem this would be very helpful for mixing and matching (although primarily for games).
 

repoman27

macrumors 6502
May 13, 2011
435
77
Hey,

don‘t know if you guys saw THIS yet, some very interesting results here.

It appears to me that the infinity fabric basically brings (almost) nothing to the table, if we look at the results with dual GPUs.
Personally I would love to add a 5700 to my VEGA II (mostly because of the newer video encoder and display compression), either when Apple comes out with it at a reasonable price point, or to just buy a third party one. So far I was wondering if a second VEGA II from Apple would be much better because of the Infinity fabric over a third party VII, but it appears the Vega II DUO is even a bit slower than the combo with a third party card. So strange.
Adding a third party VII would be a nice option if the fan/sleep issue wasn‘t one...
Don't forget that the Vega II Duo only has a single PCIe Gen3 x16 link to the CPU which is being shared by the two GPUs via an additional on-card PCIe switch (so half the PCIe bandwidth plus additional latency). The Vega 20 GPUs in the Vega II Duo MPX module also have a slightly lower boost clock than the Radeon VII (1725 MHz vs. 1750 MHz), and lower TDP (>500 W for both GPUs, 48-lane PCIe switch, and 2x Thunderbolt 3 controllers, vs. 300 W TBP for single GPU Radeon VII).

So it makes sense that a pair of Radeon VIIs outperforms a Radeon Pro Vega II Duo MPX module on GPU benchmarks and trounces it on cost, but if you look at power and noise, the picture is very different. Apple couldn't just put four Radeon VIIs in the Mac Pro and have it be nearly silent while staying under 1400 W (which is essentially as much as you can draw from a NEMA 5-15R power outlet in North America).
 

chfilm

macrumors 68030
Original poster
Nov 15, 2012
2,834
1,606
Berlin
That is an interesting article. It's hard to tell if this is the result of 3rd party developers not taking advantage of the Metal 2 API for creating Peer Groups introduced in 2017 to use multiple GPUs for compute or if there is something in the OS preventing use of the display GPU to also participate in Compute workloads. It would be nice to see Metal adopt the Heterogeneous Explicit Multiple GPU functionality similar to DirectX 12 and Vulkan 1.1. Given the push for external GPUs in the apple ecosystem this would be very helpful for mixing and matching (although primarily for games).
Yeah, it's so interesting. I wish he had a second Vega II to really test the Infinity Fabric, but I guess a single Vega II Duo should perform about the same.

I doubt that DaVinci Resolve isn't taking full advantage of whatever macOS gives them to play with. Blackmagic seems to be very closely involved with Apple, since they were also the first to support Afterburner in Resolve.

I guess he'll also have to revisit these tests when the official 5700 MPX module version comes out with proper drivers.
- - Post merged: - -

Don't forget that the Vega II Duo only has a single PCIe Gen3 x16 link to the CPU which is being shared by the two GPUs via an additional on-card PCIe switch (so half the PCIe bandwidth plus additional latency). The Vega 20 GPUs in the Vega II Duo MPX module also have a slightly lower boost clock than the Radeon VII (1725 MHz vs. 1750 MHz), and lower TDP (>500 W for both GPUs, 48-lane PCIe switch, and 2x Thunderbolt 3 controllers, vs. 300 W TBP for single GPU Radeon VII).

So it makes sense that a pair of Radeon VIIs outperforms a Radeon Pro Vega II Duo MPX module on GPU benchmarks and trounces it on cost, but if you look at power and noise, the picture is very different. Apple couldn't just put four Radeon VIIs in the Mac Pro and have it be nearly silent while staying under 1400 W (which is essentially as much as you can draw from a NEMA 5-15R power outlet in North America).
I see, you're right!
Now if only that stupid sleep issue wasn't present with the VII...

Do you guys think the newer architecture and video encoder in the 5700 will give it some sort of edge over the VII? Or is the older card just more bang for the buck?

I'm actually not sure if my Mac ever even enters sleep mode to be honest. I think I disabled it because of monitor sleep wake issues and some problems with my PSU. How do I determine if the whole mac is asleep or just the screens go black?
 

Macinsquatch

macrumors member
Mar 28, 2015
76
21
It reminds me of the odd way my Trashcan handles compute workloads in MacOS. Only the second D700 is available, while the primary is stuck on display duties. Rebooting into Windows allows me to enable Crossfire and harness both of them for workloads. Not a lot of software to test this but Photoscan Pro is available in windows and MacOS and was consistently faster in Windows. This may have changed with the advent of Metal 2, but in 2016 when I last tested it back to back I could only use one GPU in MacOS, while both were available for Compute workloads in Windows.

I haven't kept up with driver availability in Windows for these new AMD cards but I would be interested in seeing how compute workloads compare from MacOS to Windows.
 

jasonmvp

macrumors demi-god
Jun 15, 2015
310
249
Northern VA
Do you guys think the newer architecture and video encoder in the 5700 will give it some sort of edge over the VII? Or is the older card just more bang for the buck?
I can only speculate based on the specs Apple has published on the cards. According to Apple, in raw compute, the W5700X card should be ~66% of the Vega II single GPU card, but it has half the VRAM, and that memory is a bit slower as well. That would suggest that in rendering, the W5700X may perform slightly slower. In encoding, however, it'll likely slap the Vega II like a red-headed step-child.
 

repoman27

macrumors 6502
May 13, 2011
435
77
It reminds me of the odd way my Trashcan handles compute workloads in MacOS. Only the second D700 is available, while the primary is stuck on display duties. Rebooting into Windows allows me to enable Crossfire and harness both of them for workloads. Not a lot of software to test this but Photoscan Pro is available in windows and MacOS and was consistently faster in Windows. This may have changed with the advent of Metal 2, but in 2016 when I last tested it back to back I could only use one GPU in MacOS, while both were available for Compute workloads in Windows.

I haven't kept up with driver availability in Windows for these new AMD cards but I would be interested in seeing how compute workloads compare from MacOS to Windows.
I'm not sure you're interpreting those benchmark results correctly. The scaling isn't linear, but it's close:

Test 1:
1 GPU = 9.8
2 GPUs = 5.4
3 GPUs = 4.0
4 GPUs = 3.0

Test 2:
1 GPU = 3.1
2 GPUs = 1.8
3 GPUs = 1.4
4 GPUs = 1.2

The "Vega II" in this test is actually a Radeon Pro Vega II Duo MPX module with only 1 GPU active:
GRAPH LEGEND

Vega II Duo = AMD Radeon Pro Vega II Duo GPU module (32GB of HBM2 memory each)
VII 'Duo' = two AMD Radeon VII GPUs (16GB of HBM2 memory each)
5700 XT 'Duo' = two AMD Radeon RX 5700 XT GPUs (8GB of GDDR6 memory each)
Vega II = AMD Radeon Pro Vega II GPU Duo GPU module but only one Vega II active (32GB of HBM2 memory)
VII = AMD Radeon VII GPU (16GB of HBM2 memory)
5700 XT = AMD Radeon RX 5700 XT GPUs (8GB of GDDR6 memory)
In fact, the Vega II Duo wins every time in these benchmarks, except for where the Vega II + VII 'Duo' edges out the Vega II Duo + VII by 6 seconds.
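For anyone who wants to check the "not linear, but close" claim against the Test 1 and Test 2 numbers listed earlier in this post, here is a rough sketch. It assumes each figure is an elapsed time (lower is better), which the thread does not state outright, and the `scaling` function name is made up for this example.

```bash
# Speedup and parallel efficiency relative to the 1-GPU result.
# Assumption: every number is an elapsed time, so lower is better.
scaling() {
  # $1 = 1-GPU time, remaining arguments = times for 2, 3, 4... GPUs
  base=$1; shift
  n=2
  for t in "$@"; do
    awk -v b="$base" -v t="$t" -v n="$n" 'BEGIN {
      s = b / t                       # speedup over one GPU
      printf "%d GPUs: %.2fx speedup, %.0f%% efficiency\n", n, s, 100 * s / n
    }'
    n=$((n + 1))
  done
}

echo "Test 1"; scaling 9.8 5.4 4.0 3.0
echo "Test 2"; scaling 3.1 1.8 1.4 1.2
```

Under that assumption, Test 1 keeps roughly 82% efficiency at four GPUs, while Test 2 falls to roughly 65%, so how "close to linear" it is depends on the workload.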
- - Post merged: - -

I'm actually not sure if my Mac ever even enters sleep mode to be honest. I think I disabled it because of monitor sleep wake issues and some problems with my PSU. How do I determine if the whole mac is asleep or just the screens go black?
You can try the pmset command in terminal:

Bash:
pmset -g log|grep -e " Sleep " -e " Wake " -e " DarkWake "
 

chfilm

macrumors 68030
Original poster
Nov 15, 2012
2,834
1,606
Berlin
I'm not sure you're interpreting those benchmark results correctly. The scaling isn't linear, but it's close:

Test 1:
1 GPU = 9.8
2 GPUs = 5.4
3 GPUs = 4.0
4 GPUs = 3.0

Test 2:
1 GPU = 3.1
2 GPUs = 1.8
3 GPUs = 1.4
4 GPUs = 1.2

The "Vega II" in this test is actually a Radeon Pro Vega II Duo MPX module with only 1 GPU active:

In fact, the Vega II Duo wins every time in these benchmarks, except for where the Vega II + VII 'Duo' edges out the Vega II Duo + VII by 6 seconds.
- - Post merged: - -


You can try the pmset command in terminal:

Bash:
pmset -g log|grep -e " Sleep " -e " Wake " -e " DarkWake "
Thx, what does this command do?
- - Post merged: - -

In encoding, however, it'll likely slap the Vega II like a red-headed step-child.
:D
But that will only be in single pass encoding, like it used to be with quicksync, right? So the only scenario in which that card will be superior is gonna be in quick playouts for clients in between... not sure if I have to have the edge here over the overall weaker compute power of a second Vega II or VII...

Still, even in light of all these tests, it remains a bit of a mystery to me what actual advantage the Infinity Fabric brings to the table if you have two APPLE Vega II over any other combo.
 

repoman27

macrumors 6502
May 13, 2011
435
77
Thx, what does this command do?
From the man page:
pmset manages power management settings such as idle sleep timing, wake on administrative access, automatic restart on power loss, etc.
...
-g log displays a history of sleeps, wakes, and other power management events. This log is for admin & debugging purposes.
The rest of the command pipes the output to grep which trims it down to just entries corresponding to sleep and wake events.
 

jasonmvp

macrumors demi-god
Jun 15, 2015
310
249
Northern VA
But that will only be in single pass encoding, like it used to be with quicksync, right? So the only scenario in which that card will be superior is gonna be in quick playouts for clients in between...
That's correct: single pass encoding. And while that may not help you as much, for those that are primarily producing YouTube output it'll be a help. Crank the bit rate up, and go to town. The hardware encoded output will be perfect for that venue.
 

chfilm

macrumors 68030
Original poster
Nov 15, 2012
2,834
1,606
Berlin
From the man page:

The rest of the command pipes the output to grep which trims it down to just entries corresponding to sleep and wake events.
uuuhhhh sorry, I was looking into what the script put out to me and I understand absolutely nothing. What should I type in to see a log of what the system did? I tried pmset -g but it just gives me the list of options.. :/ apologies for my lack of comprehension.
- - Post merged: - -

That's correct: single pass encoding. And while that may not help you as much, for those that are primarily producing YouTube output it'll be a help. Crank the bit rate up, and go to town. The hardware encoded output will be perfect for that venue.
Right - personally I'm always debating in such scenarios (my previews are usually MAXIMUM 12 minutes long, much more frequently in the 45-90 sec range) whether the lost speed during upload outweighs the gains during render performance. I guess we have to wait for benchmarks. Hardware encoding on the Vega II in single pass is already a huge gain over what the trashcan was able to do..
 

repoman27

macrumors 6502
May 13, 2011
435
77
uuuhhhh sorry, I was looking into what the script put out to me and I understand absolutely nothing. What should I type in to see a log of what the system did? I tried pmset -g but it just gives me the list of options.. :/ apologies for my lack of comprehension.
- - Post merged: - -


Right - personally I'm always debating in such scenarios (my previews are usually MAXIMUM 12 minutes long, much more frequently in the 45-90 sec range) whether the lost speed during upload outweighs the gains during render performance. I guess we have to wait for benchmarks. Hardware encoding on the Vega II in single pass is already a huge gain over what the trashcan was able to do..
And this is what I get for copy and pasting from StackExchange without testing the command myself first. Try this instead:
Bash:
pmset -g log|grep -e '0 Sleep  ' -e '0 Wake  ' -e '0 DarkWake  '
That lists just the sleep and wake events with timestamps and reasons (for me at least).

The basic command is "pmset -g log" though. You need the log after -g.
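If you'd rather not retype that every time, something like the sketch below might help. It wraps the same three grep patterns in a small function (the name `power_events` is made up here) and only shows the 20 most recent matches. It assumes the log lines look the way they do on my machine, with the event name right after the timestamp, so treat it as a starting point rather than a guaranteed recipe.

```bash
# Keep only the Sleep, Wake, and DarkWake entries from the power log.
# The patterns are the ones from the corrected command above; the leading
# "0" is assumed to be the last digit of the timestamp's UTC offset.
power_events() {
  grep -e '0 Sleep  ' -e '0 Wake  ' -e '0 DarkWake  '
}

# Only query the log where pmset exists (macOS); show the 20 newest matches.
if command -v pmset >/dev/null 2>&1; then
  pmset -g log | power_events | tail -n 20
fi
```

If Sleep entries show up with recent timestamps, the whole machine went to sleep and not just the displays.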
 

Pressure

macrumors 601
May 30, 2006
4,076
308
Denmark
Apple couldn't just put four Radeon VIIs in the Mac Pro and have it be nearly silent while staying under 1400 W (which is essentially as much as you can draw from a NEMA 5-15R power outlet in North America).
To make matters worse the PSU can only output 1280W at peak on 110V.
 

goMac

macrumors 604
Apr 15, 2004
7,145
1,166
That is an interesting article. It's hard to tell if this is the result of 3rd party developers not taking advantage of the Metal 2 API for creating Peer Groups introduced in 2017 to use multiple GPUs for compute or if there is something in the OS preventing use of the display GPU to also participate in Compute workloads. It would be nice to see Metal adopt the Heterogeneous Explicit Multiple GPU functionality similar to DirectX 12 and Vulkan 1.1. Given the push for external GPUs in the apple ecosystem this would be very helpful for mixing and matching (although primarily for games).
Right. Software has to specifically be written for Infinity Fabric.

If it's not, it'll act exactly the same. Haven't heard anything about Resolve doing a Mac Pro specific update like Pixelmator did.
 

chfilm

macrumors 68030
Original poster
Nov 15, 2012
2,834
1,606
Berlin
Right. Software has to specifically be written for Infinity Fabric.

If it's not, it'll act exactly the same. Haven't heard anything about Resolve doing a Mac Pro specific update like Pixelmator did.
Hm ok I wasn’t aware of that- so maybe we should give it some time and wait and see how big of an impact it’s gonna have. FCP seems to be the only software taking advantage of this setup so far..
 

repoman27

macrumors 6502
May 13, 2011
435
77
To make matters worse the PSU can only output 1280W at peak on 110V.
Right, Apple designed it to output as much power as possible given a typical 15 A circuit.

The NEMA 5-15R receptacle that this machine can be plugged into is only rated for 15 A @ 125 V. That makes the maximum nameplate rating for connected devices 15 x 125 = 1875 W, which is what you might find on a high-power hair dryer. However, in the US, the nominal mains voltage is 120 V with an allowable range of +5% to -10% at a branch outlet, or between 108 and 125 V. Continuous power draw (3 hours or more is considered continuous in this case) should not exceed 12 A (80% of the maximum), otherwise the circuit breaker will trip as the wires slowly heat up due to their inherent impedance. Making a PC that draws more than 12 A would therefore be pretty poor form. However, the PSU isn't 100% efficient either. Due to conversion losses, it can only output around 92% of what it draws from the wall, and the voltage measured at the outlet may well be lower than 125 V. So basically the PSU will follow a curve to stay at or under 12 A and prevent your circuit from tripping.
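To put rough numbers on that, here is a back-of-the-envelope sketch of the 12 A budget at a few outlet voltages. The 12 A limit and the ~92% efficiency are the figures from the paragraph above; treating efficiency as a single constant is a simplification, and the `budget` function name is just for illustration.

```bash
# Rough power budget for a 15 A branch circuit at a given outlet voltage.
# Assumptions: 12 A continuous limit (80% of 15 A) and a flat 92% PSU
# efficiency, both taken from the explanation above.
budget() {
  # $1 = outlet voltage in volts
  awk -v v="$1" 'BEGIN {
    amps = 12                 # continuous current limit in amps
    wall = amps * v           # watts drawn from the wall
    out  = wall * 0.92        # watts left after conversion losses
    printf "%d V: %.0f W from the wall, about %.0f W out of the PSU\n", v, wall, out
  }'
}

for v in 108 120 125; do budget "$v"; done
```

At the nominal 120 V this works out to 1440 W at the wall and roughly 1325 W delivered, and it only gets tighter as the outlet voltage sags toward 108 V.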

Bear in mind that most domestic space heaters sold in North America only draw 1000 W. And of course all of this is irrelevant in regions with mains voltages higher than 125 V, but is even more pertinent in Japan, which only has 100 V mains.
 

bsbeamer

macrumors 68040
Sep 19, 2012
3,654
1,930
More from Bare Feats:


Very Interesting.

Lou

Actually found this part the most interesting:

"The really big news is that the non-Apple supplied GPUs are a fast and affordable alternative to the Apple GPUs like the Pro Vega II. But there are a few downsides. First, they don't provide an Apple Startup or Boot screen. That's not an issue if you are just adding them to the Apple factory GPU in your 2019 Mac Pro."

This counters some very early reports in this forum.

The RX 5700 XT is a great GPU and will be better when Apple (and AMD?) fixes the drivers in 10.15.3+. Many issues with METAL that are repeatable, even with 5500M in MBP16,1.
 