A $1200 Lenovo Legion Slim 7 with a 3060 at 70W is 2.6x faster on the BMW render than a $3K+ MBP M1 Max. A desktop RTX is even faster at <10s.

16.39s - 3060 70W mobile (OptiX, Blender 3.0)
20.57s - reference 6900 XT (HIP, Blender 3.0)
29s - 2070 Super (OptiX)
31s - 3060 70W mobile (OptiX, Blender 2.93)
42.79s - M1 Max 32-core GPU (Metal, Blender 3.1 alpha)
48s - M1 Max 24-core GPU (Metal, Blender 3.1 alpha + patch)
51s - 2070 Super (CUDA)
1:18.34m - M1 Pro 16-core GPU (Metal, Blender 3.1 alpha + patch)
2:04m - Mac mini M1 (Metal, Blender 3.1 alpha + patch)
2:48.03m - MBA M1 (Metal, Blender 3.1 alpha)
3:55.81m - AMD 5800H, base clock, no boost/no PBO (CPU, Blender 3.0)
5:51.06m - MBA M1 (CPU, Blender 3.0)
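For anyone trying to reproduce runs like these, here's a minimal sketch of switching the Cycles backend from a script, assuming Blender 3.x's Python API (the 'METAL' value requires the Metal-enabled 3.1 alpha builds mentioned above):

```python
import bpy

# Select the Cycles backend used for the GPU rows above:
# 'OPTIX'/'CUDA' (Nvidia), 'HIP' (AMD, 3.0+), 'METAL' (Apple, 3.1 alpha+).
prefs = bpy.context.preferences.addons["cycles"].preferences
prefs.compute_device_type = "METAL"
prefs.get_devices()          # refresh the detected-device list
for dev in prefs.devices:
    dev.use = True           # enable every detected device

scene = bpy.context.scene
scene.render.engine = "CYCLES"
scene.cycles.device = "GPU"  # "CPU" reproduces the CPU rows instead
bpy.ops.render.render(write_still=True)
```

With the bmw27 demo file open (or passed to blender -b along with --python), this should reproduce the corresponding row above.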
What’s that 2070 Super (CUDA): is it also mobile, or desktop class? And are there 3060 mobile CUDA numbers there too?

It’s just to have a non-hardware-ray-traced comparison baseline between the two for now. That said, I have seen some “metalRT” flags being dropped in update comments on the Blender repo files, which might hint at some Metal-enabled ray tracing. And who knows, maybe they go the extra mile with Core ML for image denoising via the Neural Engine cores. That might help these numbers.

It would also be great to have comparisons on heavier scenes and typical everyday workflows.

What I have noticed in some videos is that the M1 Max might take 30s to render an image versus an RTX taking 20s, yet it will open Blender, load the .blend file, compile some kernels (I don’t really know what that is; it appears as a progress bar at the bottom), prepare some data before the render starts, and finish those 30s while the other machine is still stuck on one of the earlier steps, before it even gets to its 20s render time.
I think that’s something important to take into consideration… unless it’s used purely as a render farm, that is; no beating that use case yet.
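That total “time-to-image” is easy to measure alongside the pure render time: wrap a headless render and compare the wall clock against the time Blender itself reports. A rough sketch, assuming Blender 3.x in background mode (the binary path and scene file are placeholders):

```python
import re, subprocess, time

BLENDER = "/Applications/Blender.app/Contents/MacOS/Blender"  # placeholder path
SCENE = "bmw27_gpu.blend"                                     # placeholder scene

start = time.perf_counter()
out = subprocess.run([BLENDER, "-b", SCENE, "-f", "1"],
                     capture_output=True, text=True).stdout
total = time.perf_counter() - start

# Background renders end with a summary line like "Time: 00:42.79 (Saving: ...)";
# launch, file load, kernel compile, and scene prep only show up in the
# wall-clock figure.
m = re.search(r"Time: (\d+):(\d+\.\d+)", out)
render_only = int(m[1]) * 60 + float(m[2]) if m else float("nan")
print(f"render only: {render_only:.2f}s, total time-to-image: {total:.2f}s")
```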
 
While I agree with that in principle, I tried UE4 with one of the templates it ships with on my partner’s iMac (base model, 16GB), and it was painful to work with.
The M1 Max will surely run it better, but I have my doubts that it will be enough for the development of a bigger project, especially considering that for large parts of development you’re running unoptimized code and rendering unoptimized assets.

I’m looking forward to a future where it’s possible to develop that sort of thing on AS, though. Yes, please :D
At least the UE people have said that macOS is still being worked on and will be fully supported.
 
Keep in mind too that Nvidia has a dedicated rendering backend for Blender, OptiX, which is much faster than CUDA rendering. The early Metal implementation is about as fast as CUDA rendering (so sayeth the Blender users, anyway) [EDIT: nope, it's still much slower], which would put it on par with a desktop 3060 or a laptop 3080 [more like a desktop 1080?]. I wouldn't render a movie with it, but it's certainly reasonably fast [for modeling, since viewport performance is good, but not for rendering].

I have a PC with a 3080 Ti here and will try the alpha Metal renderer on the M1 Max to see... if it's even in the same hemisphere. Not bad though!

EDIT: Compared to a desktop 3080 Ti (I know, not a fair comparison at ALL)

Classroom demo scene: M1 Max Metal/GPU Compute with Cycles alpha: 1:45 (mm:ss); 3080 Ti desktop, OptiX: 0:19; CUDA: 0:23

Barbershop demo scene: M1 Max Metal/GPU Compute with Cycles alpha: 9:41 (mm:ss); 3080 Ti desktop, OptiX: 1:08 (I didn't test CUDA)

CPU-only scores on the M1 Max are 2-3x slower than these.

The 3080 Ti in the PC is like 8-9x faster. If you extrapolate down, the M1 Max on GPU compute is rather slow compared to even a 3080 laptop chip.
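For what it's worth, here are the ratios implied by the numbers above (the 8-9x figure comes from Barbershop; Classroom with OptiX is closer to 5.5x):

```python
# Speedups from the quoted times, converted from mm:ss to seconds.
def secs(t):
    m, s = t.split(":")
    return int(m) * 60 + int(s)

print(f"Classroom, OptiX:  {secs('1:45') / secs('0:19'):.1f}x")  # ~5.5x
print(f"Classroom, CUDA:   {secs('1:45') / secs('0:23'):.1f}x")  # ~4.6x
print(f"Barbershop, OptiX: {secs('9:41') / secs('1:08'):.1f}x")  # ~8.5x
```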
 
What’s that 2070 Super (CUDA): is it also mobile, or desktop class? And are there 3060 mobile CUDA numbers there too?

It’s just to have a non-hardware-ray-traced comparison baseline between the two for now. That said, I have seen some “metalRT” flags being dropped in update comments on the Blender repo files, which might hint at some Metal-enabled ray tracing. And who knows, maybe they go the extra mile with Core ML for image denoising via the Neural Engine cores. That might help these numbers.

It would also be great to have comparisons on heavier scenes and typical everyday workflows.

What I have noticed in some videos is that the M1 Max might take 30s to render an image versus an RTX taking 20s, yet it will open Blender, load the .blend file, compile some kernels (I don’t really know what that is; it appears as a progress bar at the bottom), prepare some data before the render starts, and finish those 30s while the other machine is still stuck on one of the earlier steps, before it even gets to its 20s render time.
I think that’s something important to take into consideration… unless it’s used purely as a render farm, that is; no beating that use case yet.

I didn’t want to pipe up before now; however, the M1 Max result that mi7chy is sharing came from me originally.
I posted this result originally (and mistakenly in the Stockfish chess engine thread) to illustrate that Blender was now working, in a very, very early state, on Monterey M1 MacBooks.

One thing to note: I ran this on a very, very early build of the Metal-patched Blender BMW render, and (curiously omitted in the result set shared) the 16" M1 Max machine (32-core GPU, 64GB, 4TB) did not have the GPU or CPU running anywhere near 100% at the time. My score did not scale linearly with GPU core count as one would expect; in fact, it was much, much closer to the 16-core and 24-core GPUs on a 16" MBP.

I’d expect that there is a lot more optimization room left in the tank; again, this is an early first stake in the ground, and a great first step at that!

Based on other results shared, I’d advise folks who are seeing results shared with claims of 3x faster to go have a read of this thread...
As one of the other posters noted, OptiX has very different qualitative output relative to a software-only CPU renderer; this is very important to consider.
 
I didn’t want to pipe up before now; however, the M1 Max result that mi7chy is sharing came from me originally.
I posted this result originally (and mistakenly in the Stockfish chess engine thread) to illustrate that Blender was now working, in a very, very early state, on Monterey M1 MacBooks.

One thing to note: I ran this on a very, very early build of the Metal-patched Blender BMW render, and (curiously omitted in the result set shared) the 16" M1 Max machine (32-core GPU, 64GB, 4TB) did not have the GPU or CPU running anywhere near 100% at the time. My score did not scale linearly with GPU core count as one would expect; in fact, it was much, much closer to the 16-core and 24-core GPUs on a 16" MBP.

I’d expect that there is a lot more optimization room left in the tank; again, this is an early first stake in the ground, and a great first step at that!

Based on other results shared, I’d advise folks who are seeing results shared with claims of 3x faster to go have a read of this thread...
As one of the other posters noted, OptiX has very different qualitative output relative to a software-only CPU renderer; this is very important to consider.

My test had the GPU and CPU pinned on my M1 Max. The chip is nowhere near as fast as an Nvidia 30-series for Blender rendering; even if they 2x the figures, it won't be close. This is all very preliminary though ... and I am no Blender expert. I just loaded the demo files, made sure the Metal renderer and GPU Compute were selected, and hit render frame. The CPU and GPU were pinned the whole time.

My desktop 3080 Ti was 8-9x faster. Modeling and viewport rendering on the Max is decent, but for compute it's only about the same as a Vega 64. It's in raster workloads where it's much faster... stuff like video editing/coloring/motion graphics. This chip is not an RT render beast by any stretch.

It's still early days, but these first tests look bad.
 
What’s that 2070 Super (CUDA): is it also mobile, or desktop class? And are there 3060 mobile CUDA numbers there too?

I was pooling people's submissions into one place, but I'm pretty sure that 2070 Super is the desktop version; no way a mobile 2070 is beating a mobile 3060. In a nutshell: OptiX > CUDA > OpenCL > Metal > CPU, and Blender 3.0 > 2.93. The 3060 70W is mine, so I reran it with CUDA (see below).

16.39s - 3060 70W mobile (OptiX, Blender 3.0)
31s - 3060 70W mobile (OptiX, Blender 2.93)
33.47s - 3060 70W mobile (CUDA, Blender 3.0)

You can also browse through Blender's benchmark database but they don't have 3.0 results yet.

https://opendata.blender.org/benchm...0 Laptop GPU&device_type=CUDA&benchmark=bmw27
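If anyone wants to repeat that backend rerun on their own scene, Blender 3.0+ also accepts a Cycles device override on the command line (after the "--" separator), so nothing in the .blend needs to change. A rough sketch with placeholder paths:

```python
import subprocess

BLENDER = "blender"        # assumes the binary is on PATH
SCENE = "bmw27_gpu.blend"  # placeholder scene file

# "--cycles-device <TYPE>" (Blender 3.0+) overrides the render device per run.
for device in ("OPTIX", "CUDA", "CPU"):
    subprocess.run([BLENDER, "-b", SCENE, "-f", "1",
                    "--", "--cycles-device", device], check=True)
```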
 
Apple and Epic aren’t exactly on good terms right now. Unity and Godot are both AS native now, though!
Godot's AS native but still running OpenGL. They've spent a couple of years redoing the entire rendering pipeline for (a) updated algorithms and (b) Vulkan, but the 4.0 alpha unveiling it is still months away, and I haven't been able to get the pre-alpha to render any actual frames on Mac, PC, or Linux yet.
 
I mean, deprecated is the opposite of supported. There are no updates, patches, or fixes for OpenGL on the Mac, and the fact that it's deprecated means it's slated for removal at Apple's whim. It's not a matter of if, but when.

For Blender to rewrite its entire UI/viewport as a Metal-native version would be a much greater task than adding Metal support to Cycles...
Maybe Apple lends a helping hand; it would be mutually beneficial.
 
Godot's AS native but still running OpenGL. They've spent a couple of years redoing the entire rendering pipeline for (a) updated algorithms and (b) Vulkan, but the 4.0 alpha unveiling it is still months away, and I haven't been able to get the pre-alpha to render any actual frames on Mac, PC, or Linux yet.
Good point. Compared to UE4/UE5/Unity, Godot has the most responsive experience by far, and that’s thanks to its simple, 100% compatible, CPU-bound editor code. Here’s hoping their Vulkan/MoltenVK implementation goes well.
 
Looks like Blender is coming along nicely on macOS / Apple silicon...!


3.1.0 Beta & 3.2.0 Alpha available for download...!
 
Looks like Blender is coming along nicely on macOS / Apple silicon...!


3.1.0 Beta & 3.2.0 Alpha available for download...!

That's still pretty poor performance though.

He's rendering that scene in 20 minutes on the Apple Silicon GPU, but that same scene takes only 1 minute 51 seconds on the 3080 in my PC.

So it's still 10x slower than a new Nvidia card.

EDIT: His viewport performance is also quite bad in comparison to an Nvidia card running OptiX denoising...
 
That's still pretty poor performance though.

He's rendering that scene in 20 minutes on the Apple Silicon GPU, but that same scene takes only 1 minute 51 seconds on the 3080 in my PC.

So it's still 10x slower than a new Nvidia card.

EDIT: His viewport performance is also quite bad in comparison to an Nvidia card running OptiX denoising...

Well, we are on a Mac forum, discussing Blender on macOS/Apple silicon, so I would say the improvements shown are pretty good in the context of Blender on ASi/Metal...?

One can only wonder what levels of performance we might see (if the M1 Ultra rumors are true) from ASi SoCs intended for the desktop...?

My speculation towards ASi desktop SoCs:
  • M1 Ultra - 12-core CPU (12P/0E) / 40-core GPU / 16-core NPU / 256GB LPDDR5X RAM / 500GB/s UMA
  • Dual M1 Ultra - 24-core CPU (24P/0E) / 80-core GPU / 32-core NPU / 512GB LPDDR5X RAM / 1TB/s UMA
  • Quad M1 Ultra - 48-core CPU (48P/0E) / 160-core GPU / 64-core NPU / 1TB LPDDR5X RAM / 2TB/s UMA
 
That's still pretty poor performance though.

He's rendering that scene in 20 minutes on the Apple Silicon GPU, but that same scene takes only 1 minute 51 seconds on the 3080 in my PC.

FFS man, you’re rendering on a PC with the CPU and GPU drawing 500 watts.

That’s like someone at NASA LOLing at you because your desktop computer can’t calculate flight paths at the same speed as their server.
 
Well, we are on a Mac forum, discussing Blender on macOS/Apple silicon, so I would say the improvements shown are pretty good in the context of Blender on ASi/Metal...?
FFS man, you’re rendering on a PC with the CPU and GPU drawing 500 watts.

That’s like someone at NASA LOLing at you because your desktop computer can’t calculate flight paths at the same speed as their server.

The point is that for professional 3D work, rendering speed and performance is everything. Right now the performance still isn't good enough. Someday I'd like it to be, so I can run a Mac again, but I'm getting impatient.

Plus, if the solution is for Apple to run tons of AS cores to make up the difference, then the power/efficiency advantage isn't going to matter as much.
 
FFS man, you’re rendering on a PC with the CPU and GPU drawing 500 watts.

That’s like someone at NASA LOLing at you because your desktop computer can’t calculate flight paths at the same speed as their server.

It's a GPU workload, so the CPU would be idle. Even a mobile 3060 configured for a 70W TGP is ~5x faster at ~4m10s on Lone Monk, which shows power consumption at <70W for the GPU and ~8W for the CPU. He should've let the render complete, since the projected time isn't always the final render time; the final is usually a bit less, but close.

[attached screenshot: LoneMonk306070W.png]
 
The point is that for professional 3D work, rendering speed and performance is everything.

From a macOS-using hobbyist's viewpoint, Blender on an M1 Max Mac mini is pretty compelling...

Plus, if the solution is for Apple to run tons of AS cores to make up the difference, then the power/efficiency advantage isn't going to matter as much.

A hypothetical dual or quad SoC configuration would likely use less power than an RTX 3090, and there is the added benefit of the Unified Memory Architecture providing more available RAM for the render process than the 3090...?
 
I’m just hoping they bring this implementation to Intel Macs. I’d like to compare Blender on Windows vs macOS using the same hardware for each.
 
The point is that for professional 3D work, rendering speed and performance is everything. Right now the performance still isn't good enough. Someday I'd like it to be, so I can run a Mac again, but I'm getting impatient.

Plus, if the solution is for Apple to run tons of AS cores to make up the difference, then the power/efficiency advantage isn't going to matter as much.

From a macOS-using hobbyist's viewpoint, Blender on an M1 Max Mac mini is pretty compelling...



A hypothetical dual or quad SoC configuration would likely use less power than an RTX 3090, and there is the added benefit of the Unified Memory Architecture providing more available RAM for the render process than the 3090...?
Personally, the fact that software is being optimized for Apple silicon and Metal is wonderful news. As far as I know, apart from software from the Nemetschek group, there’s not much out there.
 
The point is that for professional 3D work, rendering speed and performance is everything. Right now the performance still isn't good enough. Someday I'd like it to be, so I can run a Mac again, but I'm getting impatient.
This is always such a silly argument. The Mac performance here is the same as you would have gotten from a top-of-the-line PC GPU just a few years ago, and unless you absolutely need the newest tech at the fastest speed to do your job, it is irrelevant.

"Isn't good enough" is very subjective for so many reasons, and absolutely laughable to me as someone who's been successfully doing professional 3D work on Macs primarily and PCs secondarily for a couple of decades now.

FWIW, when I talk to Mac people who complain about render speed but have never even tried a PC, I tell them they're missing out and should look into it. But my PC is also ready to go when I occasionally need a render speed boost, and even then I rarely use it.
 
This is always such a silly argument. The Mac performance here is the same as you would have gotten from a top-of-the-line PC GPU just a few years ago, and unless you absolutely need the newest tech at the fastest speed to do your job, it is irrelevant.

"Isn't good enough" is very subjective for so many reasons, and absolutely laughable to me as someone who's been successfully doing professional 3D work on Macs primarily and PCs secondarily for a couple of decades now.

FWIW, when I talk to Mac people who complain about render speed but have never even tried a PC, I tell them they're missing out and should look into it. But my PC is also ready to go when I occasionally need a render speed boost, and even then I rarely use it.

If you render a 2-minute 3D animation at 1 minute per frame on a PC, that’s:
PC: 60-hour render
Mac: 600-hour render

It’s the difference between being able to render locally and being forced onto a render farm. In that respect, it is definitely not fast enough.
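Spelled out, the arithmetic behind those figures (the frame rate is my assumption; 30fps makes the numbers land exactly):

```python
frames = 2 * 60 * 30         # 2 minutes at an assumed 30fps = 3,600 frames
pc_hours = frames * 1 / 60   # 1 minute per frame          -> 60 hours
mac_hours = pc_hours * 10    # ~10x slower, per the above  -> 600 hours
print(pc_hours, mac_hours)   # 60.0 600.0
```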
 
If you render a 2-minute 3D animation at 1 minute per frame on a PC, that’s:
PC: 60-hour render
Mac: 600-hour render

It’s the difference between being able to render locally and being forced onto a render farm. In that respect, it is definitely not fast enough.
Do we know the reasons behind the big gap in rendering performance between the M1 Max and the Nvidia GPUs in its weight class? It seems crazy to me that we'd see such a big difference in GPU rendering performance even though other GPU tasks put the M1 Max in mobile-3070 territory.

Seems like the big gap could be due to a couple of different things:

1. Hardware ray tracing: if the rendering engine makes use of this technology, it can drastically reduce rendering times. IMO it was a big miss for Apple not to include ray-tracing hardware in a pro-level chip. If this is the case, the gap may persist until the M2 or perhaps M3 generation.

2. Rendering engines are just more optimized for Nvidia chipsets. If this is the case, there is some hope that the rendering performance gap will narrow as software creators optimize performance on Apple Silicon.
 
I’m just hoping they bring this implementation to Intel Macs. I’d like to compare Blender on Windows vs macOS using the same hardware for each.
The Blender 3.1 release notes say "The implementation is in an early state. Performance optimizations and support for AMD and Intel GPUs are under development."
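A quick way to see which backends a given build actually exposes, sketched against the 3.x Python API (assigning an enum value the build wasn't compiled with raises an error, which is caught here):

```python
import bpy

prefs = bpy.context.preferences.addons["cycles"].preferences
for backend in ("METAL", "CUDA", "OPTIX", "HIP"):
    try:
        prefs.compute_device_type = backend  # rejected if not compiled in
    except TypeError:
        print(f"{backend}: not available in this build")
        continue
    prefs.get_devices()  # refresh the device list for this backend
    found = [d.name for d in prefs.devices if d.type == backend]
    print(f"{backend}: {found or 'no devices detected'}")
```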
 