Everything currently available on the consumer market is very underpowered. Nvidia is set to launch the first dedicated AI accelerators this year. This will be the Nvidia Spark for the consumer market. The maximum AI computing power will be 1 Petaflop...
Everything else on the consumer market is, for now, just a gimmick. Note also that Nvidia and ATi are putting new graphics card launches on hold for two years. Apple is also standing still in this regard, unable to exceed the performance of the RX 6900XT... But in two or perhaps three years’ time, we’ll see a new generation of AI-capable graphics cards at a price accessible to almost everyone

Ok
 
The RTX Spark is a consumer DGX Spark, and Apple's current lineup already beats it at a lot of workloads, especially because of how the CPU cores perform. I personally don't have much use for FP4, which is where that 'petaflop' comes from. Some of the work I do needs FP16, FP32, or even FP64, which I have to fall back to the CPU for, but thanks to Apple's memory bandwidth it is workable for certain things.

For CUDA-optimized workloads at very low precision it will be okay, but I doubt the Spark will even match the M4 Max's memory bandwidth. The charts I read have it coming in at around 300GB/s, which is not ideal.

Apple Silicon is not going to replace a server rack of Blackwells anytime soon, but at the high consumer end they are very competitive for AI work that goes beyond 'let me run an openclaw machine or experiment with the CUDA toolchain'. If anything, I think the DGX Spark is a weirdly positioned machine. Outside of the LinkedIn influencer posts about 'I just got it and set up an agent!' (which anyone with an M4 Mac Mini can do), I haven't seen anything groundbreaking from anyone who has one.

I'd take one for free or at a hefty subsidy or if I was locked into the CUDA ecosystem but blissfully I am not, and I have a 5090 that I do CUDA work on when I do need to power through something specific. BF16 is also a lot more useful than (NV)FP4.

And for an apples-to-apples comparison: the 5090 does 3.35 PFLOPS at FP4 with sparsity, the same precision and sparsity Nvidia uses to get the 'one petaflop supercomputer!' claim for the DGX Spark and RTX Spark in their somewhat misleading marketing. Yes, you're getting a unified memory machine that can run larger models if you shell out for it (the DGX Spark went up in price to $4799 now), but I'd be hard-pressed to choose that strange ecosystem over Apple's at current prices, especially since the M5 architecture improved the matrix math so much, which helps TTFT a lot and was a bottleneck on previous generations.

Caveat being of course, Apple's current prices for new models are going to go up quite a bit, so who knows where things will land then.
 

Spark has been available for some time now; it's essentially an RTX 5070. The 1000 TFLOPS quoted is peak FP4 throughput with sparsity, and you won't get anywhere close to these figures in practice. The M5 Max should be quite competitive with the RTX 5070 for most machine learning applications unless we are looking at models that are highly optimized for Nvidia's tensor cores.
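To put rough numbers on how far that headline figure shrinks, here is a back-of-the-envelope sketch. It assumes the 1000 TFLOPS figure includes a 2x factor from 2:4 structured sparsity and that dense throughput halves each time the operand width doubles. Both are common rules of thumb rather than vendor specs, and `derated_tflops` is a made-up helper name.

```python
def derated_tflops(sparse_fp4_tflops: float, bits: int) -> float:
    """Estimate dense peak TFLOPS at a given precision from a sparse FP4 headline.

    Assumptions (rules of thumb, not vendor specs):
      * the headline includes a 2x factor from 2:4 structured sparsity
      * dense throughput halves each time the operand width doubles
    """
    dense_fp4 = sparse_fp4_tflops / 2.0  # undo the assumed sparsity factor
    return dense_fp4 * 4.0 / bits        # scale from 4-bit to the wider format


if __name__ == "__main__":
    # 1000 TFLOPS is the headline figure quoted in this thread.
    for bits in (4, 8, 16):
        print(f"FP{bits} dense: ~{derated_tflops(1000.0, bits):.0f} TFLOPS")
```

Even these derated values are theoretical peaks, so real workloads will land lower still.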
 
Bandwidth of the DGX Spark?...
Yeah, I thought so...
Amateurs talk FLOPs, professionals talk bandwidth.
 
Professionals for what?

The true answer is to roofline your code to see whether your algorithm is compute-bound or memory-bound, and go from there.
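For anyone who hasn't done a roofline check before, the basic model is just a `min()` of two ceilings. A minimal sketch follows, using the 1000 TFLOPS and roughly 300GB/s figures quoted in this thread. The arithmetic intensity of 4 FLOPs per byte is my own assumption for batch-1 decoding with 4-bit weights (2 FLOPs per weight, half a byte per weight), not a measured value.

```python
def roofline(peak_flops: float, bandwidth_bytes: float, intensity: float):
    """Return (attainable FLOP/s, regime) under the simple roofline model.

    intensity is arithmetic intensity: FLOPs performed per byte moved
    from memory.
    """
    memory_ceiling = bandwidth_bytes * intensity
    if memory_ceiling < peak_flops:
        return memory_ceiling, "memory-bound"
    return peak_flops, "compute-bound"


if __name__ == "__main__":
    PEAK = 1000e12      # headline FP4 figure from this thread, FLOP/s
    BANDWIDTH = 300e9   # approximate bandwidth from this thread, bytes/s
    INTENSITY = 4.0     # assumed: 2 FLOPs per 4-bit weight, batch size 1

    attainable, regime = roofline(PEAK, BANDWIDTH, INTENSITY)
    print(f"{regime}: ~{attainable / 1e12:.1f} TFLOPS attainable")
    print(f"ridge point: {PEAK / BANDWIDTH:.0f} FLOPs/byte")
```

Under those assumptions the kernel sits far to the left of the ridge point, so the bandwidth ceiling, not the FLOPs figure, sets the speed.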

LLMs are pretty much always memory bound. For example, an 8B parameter model requires at least 4-5GB even with high weight compression.
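The arithmetic behind that, as a rough sketch: weight size is parameters times bits per weight, and during decoding every generated token has to stream all the weights once, so bandwidth divided by weight size gives an upper bound on tokens per second. This ignores the KV cache, activations, and any overhead, so treat it as a ceiling and not a prediction. The 300GB/s value is the approximate figure mentioned earlier in the thread, and the function names are made up for illustration.

```python
def weight_size_gb(params: float, bits_per_weight: float) -> float:
    """Size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


def max_decode_tokens_per_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on batch-1 decode speed: each token reads all weights once."""
    return bandwidth_gb_s / weights_gb


if __name__ == "__main__":
    for bits in (4, 5):
        size = weight_size_gb(8e9, bits)
        bound = max_decode_tokens_per_s(300.0, size)
        print(f"8B model at {bits} bits: {size:.1f} GB, at most ~{bound:.0f} tokens/s")
```

Doubling the FLOPs changes neither number, while doubling the bandwidth doubles the bound.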
 
Try to understand the differences. Nvidia Spark achieves AI performance of 1,000 TFlops, whilst the fastest Mac computer manages 70 TFlops. There’s no comparison here. This isn’t an RX5070, let alone an M5 Max. In practice, projects that would normally take hours to generate will be completed in minutes 😃
 
Do you understand what FP4 is? Because if not, you should probably go do a few months of reading if this is an area that interests you. Genuine suggestion.

Multiple posts cleanly refuted your assertion here with facts. If you're trolling us, that's frowned upon here especially in this area of the forum.
 
Try to understand the differences.

Before you can point out differences to others, it helps to understand the fundamentals yourself.

This isn’t an RX5070, let alone an M5 Max.

This is literally an RTX 5070.

table 6, https://images.nvidia.com/aem-dam/S...ell/nvidia-rtx-blackwell-gpu-architecture.pdf


In practice, projects that would normally take hours to generate will be completed in minutes 😃

You are a bit out of your depth here, aren’t you?
 
I’m more interested in specific applications. Does anyone know how long it takes to upscale SD video to 4K in Topaz AI on an M5 Max Mac?
I’m talking about clips that are an hour long or longer.
 
Has anyone been using LM Studio or anything similar for local inference? I want to put my 48GB to use for some of it. I've mainly used Claude Code and Codex so far...
 