Thanks for the detailed answer. So when do you think mobile graphics chips will get a die shrink, or at least one that will make it to the iMac? Are Nvidia mobile chips comparable to these AMD chips in terms of speed, nm, etc.? I keep hearing about this 990M...

Could you speculate on what kind of gains we could see in the new mobile GPUs? And why do people keep saying Nvidia is ahead of AMD in GPUs? I googled the 990M but couldn't find much info; is it even out? In other words, how does the 395X compare to Nvidia's latest available offering?

Just found this statement via a Google search

Nvidia can't support the 5K resolution yet, hence going AMD. Plus they probably gave Apple a good deal.
 
I already have the M295X, so I will wait for next year, when I hope Apple and Nvidia will manage to put the 990M into the 27" one as a BTO option, with a cooling system that can provide at least 90% of its power under full load.
 
Could you speculate on what kind of gains we could see in the new mobile GPUs?
This is a complicated question. Warning: rough numbers and lots of assumptions start here.

In logic systems, performing work creates heat. You can think of TDP (Thermal Design Power) as a budget for how much heat a particular chassis can dissipate, which in turn places a cap on the amount of computational work that can be performed in that chassis. Both lithography and architecture changes can shift this fundamental limit: as the machine grows more efficient you can do more work while producing less heat, so the chassis can do more work inside the same TDP.

Looking back at the last GPU shrink (40nm to 28nm) might allow us to infer some figures. Nvidia went from the GTX 580 to the GTX 680 and AMD went from the HD 6970 to the HD 7970. These are the easiest cards to get data on because they are flagship cards of the era.

On average, performance increased by 15% while power consumption decreased by 15%. You can't attribute everything here to the shrink: AMD released GCN at the same time, and the 580 was a particularly inefficient card. Mobile GPUs gain more from shrinks than desktop GPUs. In a desktop you can usually dissipate the additional heat you create, so it doesn't slow you down; in a mobile chassis the additional heat just causes more throttling.
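To put a number on what that 15%/15% implies for efficiency, a quick sketch (these are the rough averages above, not figures for any specific card):

```python
# Rough performance-per-watt gain implied by the last shrink (40nm -> 28nm),
# using the ~15% faster / ~15% lower power averages quoted above.
perf = 1.15    # relative performance, new / old
power = 0.85   # relative power draw, new / old

perf_per_watt = perf / power
print(f"~{perf_per_watt:.2f}x performance per watt "
      f"(~{(perf_per_watt - 1) * 100:.0f}% better)")
# -> ~1.35x, i.e. about 35% more work inside the same heat budget
```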

We can make an educated guess about the iMac's TDP by consulting this article. At a 240W maximum the iMac is dancing around its thermal limits under a gaming load. Assume 50W of baseline draw, not counting the CPU or GPU. Add 50W for CPU load (i7, not full load) and 125W for GPU load (M395X, full load), plus another 30W for interconnects, wireless, drive access, memory, audio, and so forth. That's 255W, beyond the 240W limit by a not insignificant margin, so the iMac will be throttling to limit its heat output and power consumption.
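The same budget written out, using the estimates above rather than measured values:

```python
# Back-of-the-envelope power budget for the 27" iMac under a gaming load,
# using the estimates from the post above (not measured values).
TDP_MAX = 240   # W, estimated whole-machine thermal limit

draw = {
    "baseline (everything but CPU/GPU)": 50,
    "CPU, i7 at partial load": 50,
    "GPU, M395X at full load": 125,
    "interconnects, wireless, storage, memory, audio": 30,
}

total = sum(draw.values())
print(f"Estimated draw: {total} W vs {TDP_MAX} W budget ({total - TDP_MAX:+} W)")
# -> 255 W, about 15 W over budget, so the machine throttles to get back under it
```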

What if your GPU were 10% faster and drew 10% less power? Your roughly 110W GPU would (hopefully) no longer trigger throttling, so you would get its full 10% baseline improvement. Add roughly another 12%, because the old GPU had to run about 15W below its rating and do less work while throttled. Call the new GPU about 20% faster, all else being equal.
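Here's that arithmetic written out as a quick sketch (the throttle number assumes performance scales roughly with the power the GPU is allowed to draw, which is an assumption, not a measurement):

```python
# Rough speedup estimate for a die-shrunk iMac GPU, mirroring the post's reasoning.
old_gpu_w = 125      # W, M395X rating (estimate)
overshoot_w = 15     # W, how far the whole machine exceeds its budget today
shrink_gain = 0.10   # new GPU assumed ~10% faster at ~10% lower power

# Assumption: performance lost to throttling scales with the power the GPU gives up.
throttle_loss = overshoot_w / old_gpu_w        # ~0.12

estimate = shrink_gain + throttle_loss         # additive, as in the post above
print(f"Estimated gain: ~{estimate:.0%}")      # -> ~22%; call it 20%
```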

That's my incredibly rough speculation and I'm sticking to it. Next year's die shrink iMac GPU will be at least 15-20% faster in the same enclosure. 50fps today is 60fps tomorrow.

Note that this illustrates why PC towers are so much faster than iMacs for gaming. My PC tower has a 250W GPU; that is one card with roughly the same TDP as an entire iMac.


Nvidia can't support the 5K resolution yet...
Do you have a reference for this information? From my understanding Nvidia supports DP1.2 and MST.
 
Your instinct is correct. Here's the problem:...GPUs however still remain stuck on TSMC 28nm....Normally your best bet is to wait on shrinks to upgrade...the iMac is not a traditional desktop. As soon as you begin to put it under load, it throttles. This is why we don't see much improvement between the high end models in successive generations....if you can afford to wait, then waiting is a good idea. Don't believe what people tell you about waiting forever. There are points of inflection in computer progress where buying is of much greater value. Node shrinks are generally one of those times.

The problem is we don't know when the 20nm or smaller nodes will be available in quantity for high-power (not mobile) GPUs, nor what the real-world performance benefit will be. There are ominous fabrication problems, different from past generations:

"Waiting on 20nm graphics cards? Don't Bother": http://techsoda.com/no-20nm-graphics-amd-nvidia/

"450nm Wafers and Death of Moore's Law": http://www.extremetech.com/computin...y-450mm-wafers-halted-and-no-path-beyond-14nm

Re the new iMac throttling "As soon as you begin to put it under load", the initial tests show this doesn't happen as it previously did:
 
Any old Mac Pro can drive a 5K screen starting with Kepler-based GPUs; the only requirement is dual DP ports (e.g. GTX 760). And yes, this works in OS X.
 
The problem is we don't know when the 20nm or smaller nodes will be available in quantity for high-power (not mobile) GPUs, nor what the real-world performance benefit will be....Re the new iMac throttling "As soon as you begin to put it under load", the initial tests show this doesn't happen as it previously did:
I agree with everything you are saying here about the lithography, except that all the articles you link to are old. TSMC 16nm and Samsung/GloFo 14nm did pose fabrication problems and were considerably delayed. They also don't deliver a true shrink as advertised: different parts of the chip are shrunk by different amounts, some less and some more.

However, several years on, we are now at the point where both of those processes have entered volume production. We know that Nvidia at least has taped out Pascal on 16nm, and I would be surprised if AMD hadn't done the same. These chips are coming, and the performance improvements will be non-trivial.

Regarding the video: I did watch it, and I must say it was quite well produced. Unfortunately, I don't think what it presents makes total sense. Apple is cramming just as much power as before into the same chassis, so something doesn't add up. I need to know what tools he is using to capture the data. Generally speaking you have to perform these tests under Windows, because OS X doesn't give you a good enough look at the hardware metrics.

DP1.2 only does 4K, not the 5K of the 27" iMac
Correct. DP1.2+MST is exactly what Tonga (M295X) had in the 2014 5K model and somehow they made this work. All evidence at this time points to them breaking the eDP standard and overclocking a 1.2 stream as well as adding some more pins. There is absolutely no reason they cannot do this with Nvidia.

M395X is still Tonga, so it doesn't support eDP1.3 as far as we know. This means Apple is once again performing some kind of hack on their AMD GPUs to get the 5K iMac to work. No reason they can't do this with Nvidia if they want to.
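For a rough sense of why a single DP1.2 link isn't enough for the 5K panel, a quick bandwidth check (assuming the usual 8 bits per channel at 60Hz and ignoring blanking overhead):

```python
# Why 5K at 60Hz doesn't fit in a single DisplayPort 1.2 link.
width, height, hz, bpp = 5120, 2880, 60, 24     # 8 bits per RGB channel, blanking ignored

needed_gbps = width * height * hz * bpp / 1e9   # raw pixel data rate
dp12_gbps = 4 * 5.4 * 8 / 10                    # 4 lanes x 5.4 Gbit/s HBR2, 8b/10b coding

print(f"5K@60 needs ~{needed_gbps:.1f} Gbit/s; DP1.2 carries ~{dp12_gbps:.2f} Gbit/s")
# -> ~21.2 vs ~17.28 Gbit/s, hence two tiled streams (MST) or a non-standard faster link
```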
 
Cobbled together from barefeats tests done previously, my own tests (I did run furmark on my m290x), and so on. Apparently, no one cared about the m290 machine, which is understandable. The Trex offscreen is a little strange, but I don't think it makes much difference. Note that some of these tests involve the CPU.

View attachment 596619
This is a complicated question. Warning: rough numbers and lots of assumptions start here....Next year's die shrink iMac GPU will be at least 15-20% faster in the same enclosure....Do you have a reference for this information? From my understanding Nvidia supports DP1.2 and MST.

You forget that Nvidia already has a 28nm GTX 980M that's about 50% faster than the M395X in gaming and uses about the same 100-125 watt TDP. So by going to 16nm and adding the speed gains on top, we stand to get a huge boost in performance of around 60+%, which is much better than the crumbs we've been getting for 3 years. Of course, AMD may swing back and make a much faster HBM-equipped 16nm "m495x" next year too!
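Roughly how those two steps stack up; the 50% figure is from above, while the 16nm uplift values are just guesses to bracket the range:

```python
# Stacking "980M ~50% faster than the M395X" with a hypothetical 16nm uplift.
# The uplift values below are guesses to bracket the range, not known specs.
m395x_to_980m = 1.50
for uplift in (0.10, 0.20, 0.30):
    total = m395x_to_980m * (1 + uplift)
    print(f"+{uplift:.0%} over the 980M -> ~{total - 1:.0%} over the M395X")
# -> roughly +65% to +95%, consistent with the "60+%" ballpark above
```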
 
You're welcome. Thanks for the sacrifice so we could get the benchmarks while you had it!
I still think the 395X is overrated though; look at @jerwin's table from barefeats' benchmarks. The 395 is the sweet spot.
It's what I'd get if I were getting an riMac (going to try to wait it out for Thunderbolt 3).
But if you have the money, of course go for it!
Thanks, I chose the M395X for the 4GB of VRAM. That's the main reason. I realized the 5K display is using 1GB just for the display. Hopefully the iMac will last 5 years or so.
 
Do you mind if I add my result from my iMac which just arrived this morning?

i5/M395x/512 SSD

CineBench: 99.88
 
Thanks, I chose the M395X for the 4GB of VRAM. That's the main reason. I realized the 5K display is using 1GB just for the display. Hopefully the iMac will last 5 years or so.

Hi Huffy, that is indeed interesting and relevant info; could you paste a screenshot with that info please? Thanks.

Also, if possible, a screenshot of FCPX or another similar app with a 1080p file on the timeline, so that we can check how much VRAM is being used by the screen plus a video editing app.
 
Apple has a custom timing controller in the iMac that lets it fuse two DisplayPort 1.2 streams. If it had wanted to, it probably could have implemented an Nvidia 965GTXWSi in the iMac.
 
Hi Huffy, that is indeed interesting and relevant info; could you paste a screenshot with that info please? Thanks.

Also, if possible, a screenshot of FCPX or another similar app with a 1080p file on the timeline, so that we can check how much VRAM is being used by the screen plus a video editing app.

I ran Final Cut Pro X with a 1080p video playing while another 4K video was running in QuickTime at the same time.


 
I ran Final Cut Pro X with a 1080p video playing while another 4K video was running in QuickTime at the same time.



That is great info :). Do you remember how much was used without the 4K video playback? If possible, a screenshot (hope I'm not asking too much).

Useful for me and for those deciding between the 395 and the 395X :) Thanks
 
Looks like FCPX doesn't use that much VRAM, at least at 1080p, which is what I use, so it looks good...
It's not merely playing video in FCP X that uses VRAM; it is applying and rendering effects.

A much better GPU test within FCP X is the BruceX benchmark. It is extremely simple to run: import the XML, double-click in the event browser to put it on the timeline, and hit the share button to export a 5K video file. Share > Master File > Settings:
Format: Video & Audio
Codec: ProRes 422
Resolution: 5120x2700

Instructions and test file are here:

http://blog.alex4d.com/2013/10/30/brucex-a-new-fcpx-benchmark/
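If you time a few runs with a stopwatch, a trivial helper like this keeps the numbers comparable between machines (the values in the list are placeholders, not anyone's results):

```python
# Average a few hand-timed BruceX exports so runs on different machines are comparable.
from statistics import mean, stdev

run_times_s = [32.0, 31.0, 33.0]   # placeholder values; substitute your own stopwatch times

print(f"{len(run_times_s)} runs, mean {mean(run_times_s):.1f} s, "
      f"spread +/- {stdev(run_times_s):.1f} s")
```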
 
It's not merely playing video in FCP X that uses VRAM; it is applying and rendering effects....A much better GPU test within FCP X is the BruceX benchmark.


I did the test 3 times in a row and I got an average of 31 seconds with the ProRes 422. Doesn't look too bad to be honest.
 
It's not merely playing video in FCP X that uses VRAM; it is applying and rendering effects....A much better GPU test within FCP X is the BruceX benchmark.

Hi Joema, thanks for the input; I'm aware of the BruceX benchmark. What I was trying to figure out is the actual VRAM usage on the 5K display with an app like FCPX, and Huffy was a great help.

Although BruceX is a good representation of how the GPU will handle demanding tasks, my concern at the moment is my real workflow, and BruceX does not represent what I do in real life.

Just to give some perspective: I work with a triple-display setup (3x 1080p) and I'll be upgrading to an iMac as soon as I have all my questions answered, which at this moment is just the GPU :) It looks like the 395 is the way to go in terms of price/performance; the concern is driving 2x 1080p displays besides the 5K. (I would have no problem paying 300€ more for the 395X if there were a huge difference from the 395, but so far I haven't seen anything that justifies the price; I may be wrong.)

After Huffy's tests I calculate that 5K with FCPX uses roughly 800MB of VRAM, and adding the 2 extra displays will take it up to about 1GB, so it looks promising. That is roughly what I'm running at the moment: the Premiere timeline on the main display, a full 1080p preview on the left monitor, and just some random apps (Mail, etc.) on the right.
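For a rough sense of what the raw framebuffers alone cost in that setup (real VRAM usage is much higher, since app surfaces, HiDPI backing stores and render caches sit on top of this):

```python
# Raw framebuffer cost of a 5K + 2x 1080p setup at 4 bytes per pixel (8-bit RGBA).
# Real VRAM usage is far higher: app surfaces, HiDPI backing stores and caches dominate.
displays = [(5120, 2880), (1920, 1080), (1920, 1080)]
bytes_per_px = 4
buffers = 2          # assume double buffering per display

total_mb = sum(w * h * bytes_per_px * buffers for w, h in displays) / 2**20
print(f"Raw framebuffers alone: ~{total_mb:.0f} MB")   # ~144 MB; the rest is content surfaces
```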

Now BruceX could be interesting if someone with the 395 and someone with the 395X could run it and compare the results (which probably someone already did :) and I didn't catch it).
 