I'd love to learn a bit more about it if you have any references! I've always been interested in graphics programming but have spent most of my time on the CPU side of things
This is indeed a fascinating topic! I learned most of what I know from this very well-written blog series:
Rys looks at our PowerVR graphics architecture and describes how Tile-Based Rendering (TBR) works in practice. (www.imaginationtech.com)
The deferred rendering technique in PowerVR GPUs takes the information generated by the tiler to defer the rendering of subsequently generated pixels. (www.imaginationtech.com)
It is worth noting that Apple likely adapted their TBDR hardware from Imagination directly (Apple's shading backend is entirely different though).
@mr_roboto already summed it up in a very neat way. I would only add that tile-based rendering as most modern GPUs do it is a caching optimization (by processing triangles in close proximity to each other, it's more likely that you will fetch texture data that is closely laid out in memory), but triangles are still rasterized and shaded immediately. The big difference with TBDR GPUs (currently only Apple and IMG) is that they will first rasterize every single triangle in the tile and only then do shading. This means that every visible pixel is shaded exactly once. While the result is the same, the underlying implications are actually very significant: TBDR GPUs can dispatch shading work in regular grids of 32x32 pixels (always doing 1024 pixels per shader invocation), while immediate renderers have to dispatch shaders for each triangle individually (they usually shade 8x4 groups of pixels at once, which creates inefficiencies at triangle edges). Also, the TBDR model offers guarantees that the immediate model simply cannot. For example (there's a small sketch after this list):
- with TBDR you know that you "own" the pixel, as no other shader invocation will be computing the value of that pixel at the same time (in immediate rendering there can be data races, e.g. two overlapping triangles may be shaded simultaneously). This allows you to do things like race-free framebuffer reads, which are key to programmable blending and other advanced techniques
- with TBDR you know that all other triangles in the same tile have already been rasterized, which allows you to deterministically track the state of multiple pixels at once. This again allowed Apple to implement some really cool stuff like a persistent cache between shader invocations (multiple shaders can work on the data in sequence, something that no other GPU supports), which is the key to performing some advanced effects without ever leaving the GPU cache
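To make the "every visible pixel is shaded exactly once" point concrete, here is a minimal CPU-side sketch in plain C++. Everything in it (the Triangle struct, the 4x4 "tile", the shade counters) is a made-up toy model, not any vendor's actual pipeline; it just contrasts immediate shading (shade every fragment that passes the depth test as triangles arrive) with per-tile deferred shading (resolve visibility for the whole tile first, then shade each surviving pixel once).

```cpp
#include <array>
#include <cstdio>
#include <vector>

// Toy model of one small tile. Each "triangle" is just a list of covered
// pixel indices plus a depth value; shading is simulated by a counter.
struct Triangle {
    std::vector<int> covered;  // pixel indices inside the tile
    float depth;               // smaller = closer to the camera
};

constexpr int kTilePixels = 16;

// Immediate-mode renderer (IMR): shade as soon as a fragment passes the depth test.
int shadeImmediate(const std::vector<Triangle>& tris) {
    std::array<float, kTilePixels> depthBuf;
    depthBuf.fill(1.0f);
    int shaderInvocations = 0;
    for (const Triangle& t : tris) {
        for (int px : t.covered) {
            if (t.depth < depthBuf[px]) {
                depthBuf[px] = t.depth;
                ++shaderInvocations;  // fragment shaded now, maybe overwritten later
            }
        }
    }
    return shaderInvocations;
}

// TBDR-style renderer: first resolve visibility for the whole tile,
// then shade each visible pixel exactly once.
int shadeDeferred(const std::vector<Triangle>& tris) {
    std::array<int, kTilePixels> winner;   // front-most triangle per pixel
    winner.fill(-1);
    std::array<float, kTilePixels> depthBuf;
    depthBuf.fill(1.0f);
    for (int i = 0; i < static_cast<int>(tris.size()); ++i) {
        for (int px : tris[i].covered) {
            if (tris[i].depth < depthBuf[px]) {
                depthBuf[px] = tris[i].depth;
                winner[px] = i;  // remember who owns the pixel, don't shade yet
            }
        }
    }
    int shaderInvocations = 0;
    for (int px = 0; px < kTilePixels; ++px) {
        if (winner[px] != -1) ++shaderInvocations;  // one shade per visible pixel
    }
    return shaderInvocations;
}

int main() {
    // Two opaque triangles covering the same 8 pixels, far one submitted first.
    std::vector<Triangle> tris = {
        {{0, 1, 2, 3, 4, 5, 6, 7}, 0.8f},
        {{0, 1, 2, 3, 4, 5, 6, 7}, 0.2f},
    };
    std::printf("immediate: %d shades, deferred: %d shades\n",
                shadeImmediate(tris), shadeDeferred(tris));  // prints 16 vs 8
    return 0;
}
```

A real GPU does this per 32x32 tile in fixed-function hardware with on-chip tile memory, of course; the sketch only shows where the "exactly once" guarantee comes from, and why the occluded fragments never cost any shading work or texture bandwidth.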
I'd say it's really about both. A lot of the memory bandwidth savings occur because TBDR never shades fully occluded pixels, meaning it doesn't have to fetch texture data for that pixel. But... it also doesn't have to run the shader program against that pixel, so it's saving both at the same time.
You are right of course. I was stressing the bandwidth issue since Apple GPUs are much more limited in this area. TBDR allows them to fetch only what is needed, saving precious bandwidth.
Of course, the flip side of that is that TBDR is weak when rendering transparency effects, especially if application devs don't take care to optimize their rendering pipeline to handle transparency well on TBDR GPUs.
I don't think it's weak per se, it just won't be more efficient (it will have to shade transparent triangles immediately, just like an IMR would). There is still some potential for efficiency wins (e.g. if you have multiple non-overlapping transparent triangles in a tile). I suspect that the GPU will flush the tile when a newly rasterized transparent pixel hits a not-yet-shaded transparent pixel. So basically, if you have a lot of transparent pixels with plenty of overdraw, TBDR performance will tank due to constant flushing. But so will IMR performance.
Transparency is a big problem anyway, and if I remember correctly the optimization tips for TBDR and IMR are identical: draw transparent objects last (you have to draw them last and sorted anyway if you want correct results). And then there is order-independent transparency, which again can be done more efficiently on Apple GPUs.
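As a concrete illustration of that "opaque first, then transparent sorted back-to-front" advice (the same ordering helps both TBDR and IMR), here is a minimal sketch. The Draw record and buildSubmissionOrder function are invented for the example; the point is just the split and the sort keys.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical draw record: distance from the camera plus an opacity flag.
struct Draw {
    float viewDepth;   // larger = farther from the camera
    bool transparent;
    // ... mesh, pipeline state, etc. would live here
};

// Returns the submission order: all opaque draws first, then transparent
// draws sorted back-to-front so "over" blending composites correctly.
std::vector<Draw> buildSubmissionOrder(std::vector<Draw> draws) {
    std::vector<Draw> opaque, transparent;
    for (const Draw& d : draws) {
        if (d.transparent) transparent.push_back(d);
        else opaque.push_back(d);
    }

    // Opaque: front-to-back maximizes early depth rejection (and, on TBDR,
    // lets hidden-surface removal discard occluded fragments before shading).
    std::sort(opaque.begin(), opaque.end(),
              [](const Draw& a, const Draw& b) { return a.viewDepth < b.viewDepth; });

    // Transparent: back-to-front is required for correct blending results.
    std::sort(transparent.begin(), transparent.end(),
              [](const Draw& a, const Draw& b) { return a.viewDepth > b.viewDepth; });

    opaque.insert(opaque.end(), transparent.begin(), transparent.end());
    return opaque;
}
```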
Nvidia's GPUs since Maxwell do use tile-based rendering. As of now, pretty much every architecture out there seems to be using tile-based rasterization. ARM Mali, Qualcomm Adreno, AMD, Intel, and Nvidia all use a tile-based renderer lol.
Not sure about others, but Nvidia's approach has been close to TBDR since Maxwell.
It is not close, since it's not deferred. Saying that they are close because they both use tiles is like saying that a hybrid is basically the same as a full electric car
Using tiles makes sure that a modern GPU will on average fetch texture data from similar locations, which massively improves caching and is key to the large performance and efficiency increases of Maxwell and Navi. But true TBDR is much more difficult to achieve because there are just so many edge cases... Imagination was basically the only company to ever do it, and Apple bought the tech from them.
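To see why binning alone already helps the caches, here is a toy CPU simulation, again plain C++ with entirely made-up numbers (a 128-line LRU "texture cache", 16 texels per cache line, triangles modeled as 4x4 quads sampling the texel under each pixel). It only illustrates the locality argument, not any real GPU: processing the same quads tile by tile touches cache lines in bursts instead of scattering accesses across the whole texture, so the miss count drops noticeably.

```cpp
#include <algorithm>
#include <cstdio>
#include <list>
#include <random>
#include <unordered_set>
#include <utility>
#include <vector>

constexpr int kScreen = 256, kQuad = 4, kTile = 32;
constexpr int kLineTexels = 16, kCacheLines = 128;

// Tiny LRU cache of texture cache lines; counts misses.
struct LruCache {
    std::list<int> order;                 // most recently used at the front
    std::unordered_set<int> present;
    long misses = 0;
    void access(int line) {
        if (present.count(line)) {
            order.remove(line);           // O(n), fine for a toy model
        } else {
            ++misses;
            present.insert(line);
            if (static_cast<int>(present.size()) > kCacheLines) {
                present.erase(order.back());
                order.pop_back();
            }
        }
        order.push_front(line);
    }
};

// Each quad samples the row-major texture 1:1 under its 4x4 pixel footprint.
long countMisses(const std::vector<std::pair<int, int>>& quads) {
    LruCache cache;
    for (auto [qx, qy] : quads)
        for (int y = qy; y < qy + kQuad; ++y)
            for (int x = qx; x < qx + kQuad; ++x)
                cache.access((y * kScreen + x) / kLineTexels);
    return cache.misses;
}

int main() {
    // 4096 quads at random screen positions, i.e. submission order is scattered.
    std::mt19937 rng(42);
    std::uniform_int_distribution<int> pos(0, kScreen - kQuad);
    std::vector<std::pair<int, int>> quads(4096);
    for (auto& q : quads) q = {pos(rng), pos(rng)};

    long scattered = countMisses(quads);

    // Bin the same quads by 32x32 tile and process them tile by tile.
    std::vector<std::pair<int, int>> binned = quads;
    std::sort(binned.begin(), binned.end(), [](auto a, auto b) {
        return std::make_pair(a.second / kTile, a.first / kTile) <
               std::make_pair(b.second / kTile, b.first / kTile);
    });
    long tiled = countMisses(binned);

    std::printf("texture cache misses, submission order: %ld, tiled order: %ld\n",
                scattered, tiled);
    return 0;
}
```

With these toy parameters the tiled order produces several times fewer misses than the scattered submission order, which is the caching win; note that nothing here is deferred, every quad is still "shaded" immediately, which is exactly the distinction being made above.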
And regarding most mobile GPUs (Mali, Adreno), these are simply atrocious. They cut a lot of corners in order to achieve higher efficiency (e.g. shader splitting) which results in inconsistent programming models. They don't scale with complex geometries since they don't use deferred shading. But it's good enough for lower quality games on mobile that don't try to do anything ambitious.