
hovscorpion12

macrumors 68040
Original poster
This is the most impressive feat I have ever seen, especially on a mobile device.

Do we really need a $20,000+ M5 Ultra with 512GB of RAM?

My only question is how hot the 17 Pro got.

If the A19 Pro can handle a 400B AI model, is there truly a need for 128GB to 512GB of RAM? Or does it come down to how the model was written?

Either way, love or hate AI, this is just the start.

 
If the A19 Pro can handle a 400B AI model, is there truly a need for 128GB to 512GB of RAM? Or does it come down to how the model was written?
The reason we still need an M5 Ultra with 512GB of RAM is speed. The iPhone is hitting 0.6 tokens per second (basically reading speed for a snail), whereas a high-RAM Mac can keep the entire model in memory for near-instant responses. Still, seeing a 400B model "fit" in a pocket is really cool.
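To put 0.6 tokens per second in perspective, here is a rough back-of-the-envelope sketch in Python. The 500-token reply length and the 30 tokens/s comparison rate are illustrative assumptions, not measurements from the demo or from any particular Mac.

```python
def seconds_to_generate(num_tokens: int, tokens_per_second: float) -> float:
    """Time needed to emit num_tokens at a steady generation rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


REPLY_TOKENS = 500        # assumed length of a medium-sized answer
IPHONE_RATE = 0.6         # tokens/s figure quoted in this thread
ASSUMED_FAST_RATE = 30.0  # hypothetical rate for a model held fully in RAM

slow = seconds_to_generate(REPLY_TOKENS, IPHONE_RATE)
fast = seconds_to_generate(REPLY_TOKENS, ASSUMED_FAST_RATE)

print(f"At {IPHONE_RATE} tok/s: {slow / 60:.1f} minutes")
print(f"At {ASSUMED_FAST_RATE} tok/s: {fast:.1f} seconds")
```

Under those assumptions, a single answer takes roughly a quarter of an hour on the phone versus well under a minute on the faster machine.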
 
Yes, obviously, no one wants a slow model. Having the 512GB RAM model (or even the 128GB model) provide blazing-fast speeds is ideal. A 400B model taking 30+ minutes to load is not. It is, however, cool to see, and it hints at what could be possible on a smartphone in your pocket.
 
Crazy but cool.

I use local models on both Mac and Linux. The biggest model I run is an 80B parameter coding model, on a machine with a 24 GB 4090, 64 GB of RAM, and a 16-core CPU. It's a bit slow, but I do other things while it thinks for a few minutes. Models that fit completely in the 4090 are very fast. I'll try one of the mixture-of-experts models soon, which may do better.

On my base MacBook Pro M5 with 32 GB, I usually use 30B parameter models. Those run at about the same speed as the 80B model on my Linux box.

Macs aren't a panacea for LLMs, but their big advantage is unified memory. By comparison, my Linux box has to split work between the GPU and, once its memory runs out, the much less efficient CPU. I can see myself investing in sufficient hardware to run full-sized models one day, but I suspect that won't be on a Mac.

If you have an M-series Mac, download Ollama (ollama.com) and whatever model(s) fit in your memory, and have fun!
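For a rough idea of what fits in memory, here is a minimal Python sketch. It only counts the model weights (parameters times bits per weight) and ignores the KV cache and runtime overhead. The 4-bit quantization and the 80% usable-memory headroom are assumptions for illustration, not figures from the posts above.

```python
def weight_footprint_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    # 1e9 params * (bits / 8) bytes each, divided by 1e9 bytes per GB
    return params_billions * bits_per_weight / 8


def fits_in_memory(params_billions: float, bits_per_weight: float,
                   memory_gb: float, headroom: float = 0.8) -> bool:
    """True if the weights fit in an assumed usable fraction of memory."""
    return weight_footprint_gb(params_billions, bits_per_weight) <= memory_gb * headroom


BITS = 4  # assumed quantization level; real models vary

# (model size in billions of parameters, memory pool in GB)
cases = [
    (30, 32),    # 30B model, 32 GB unified memory
    (80, 24),    # 80B model, 24 GB of GPU memory
    (400, 128),  # 400B model, 128 GB
    (400, 512),  # 400B model, 512 GB
]

for params, mem in cases:
    size = weight_footprint_gb(params, BITS)
    verdict = "fits" if fits_in_memory(params, BITS, mem) else "does not fit"
    print(f"{params}B @ {BITS}-bit is about {size:.0f} GB: {verdict} in {mem} GB")
```

Under these assumptions, an 80B model at 4 bits needs about 40 GB for weights, so it cannot sit entirely in a 24 GB GPU and part of the work spills to the CPU, while a 30B model at about 15 GB fits comfortably in 32 GB of unified memory.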