iPhone 17 Pro handles 400B AI model

hovscorpion12 · Mar 24, 2026

This is the most impressive feat I have ever seen, especially on a mobile device.

Do we really need a $20,000+ M5 Ultra with 512GB of RAM?

My only question is how hot did the 17 Pro get?

If the A19 Pro can handle a 400B AI model, is there truly a need for 128GB - 512GB of RAM? Or does it come down to how the model was written?

Either way, love AI or hate it, this is just the start.

iPhone 17 Pro Successfully Demonstrated Running A 400B Large Language Model, A Feat That Requires Minimum Of 200GB Memory Even When Compressed

One individual has displayed to the world that it’s possible to run a 400B LLM on an iPhone 17 Pro, but only if some smart tweaks are involved

wccftech.com

johannnn · Mar 24, 2026

hovscorpion12 said:
If the A19 Pro can handle a 400B AI model, is there truly a need for 128GB - 512GB of RAM? Or does it come down to how the model was written?

The reason we still need an M5 Ultra with 512GB of RAM is speed. The iPhone is hitting 0.6 tokens per second (basically reading speed for a snail), whereas a high-RAM Mac can keep the entire model in memory for near-instant responses. Still, seeing a 400B model "fit" in a pocket is really cool.
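To put rough numbers on the memory and speed figures mentioned in this thread, here is a minimal Python sketch. The 4 bits per weight is an assumption (it is the compression level at which 400B parameters come out to the 200GB quoted in the article), and the 300-token reply length is purely illustrative. The function names are made up for this example.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in decimal gigabytes.

    Ignores the KV cache, activations, and runtime overhead, so the real
    requirement is somewhat higher.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def generation_minutes(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a steady decoding rate, in minutes."""
    return num_tokens / tokens_per_second / 60


if __name__ == "__main__":
    # Assumption: 4-bit quantization, which matches the ~200GB figure.
    print(f"400B @ 4-bit: {weight_footprint_gb(400, 4):.0f} GB of weights")
    # Assumption: a 300-token reply, at the 0.6 tokens/s quoted above.
    print(f"300 tokens @ 0.6 tok/s: {generation_minutes(300, 0.6):.1f} minutes")
```

Under those assumptions the weights alone are far larger than anything a phone holds in RAM, which is consistent with the article's note that "smart tweaks" were needed, and a short reply takes several minutes rather than seconds.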

hovscorpion12 · Mar 24, 2026

Yes, obviously, no one wants a slow model. Having the 512GB RAM model (or even the 128GB model) providing blazing fast speeds is ideal. A 400B model taking 30+ minutes to load is not ideal. It is, however, cool to see, and it hints at what "could be possible" on a smartphone in your pocket.

Zondar · Apr 3, 2026

Crazy but cool.

I use local models on both Mac and Linux. The biggest model I run is an 80B parameter coding model, on a 24 GB 4090, 64 GB RAM and a 16 core CPU. It's a bit slow, but I do other things while it thinks for a few minutes. Models that fit completely in the 4090 are very fast. I'll try one of the mixture of experts models soon, which may do better.

On my base MacBook Pro M5 with 32 GB, I usually use 30B parameter models. Those run at about the same speed as the 80B model on my Linux box.

Macs aren't a panacea for LLMs, but their big advantage is the unified memory. By comparison, my Linux box has to split work between the GPU and, once its memory runs out, the much less efficient CPU. I can see myself investing in sufficient hardware to run full-sized models one day, but I suspect that won't be on a Mac.
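As a rough illustration of that GPU/CPU split, the sketch below estimates how much of a model's weights fit in GPU memory and how much spills over to system RAM. The 4-bit quantization and the 2 GB held back for cache and overhead are assumptions for the sake of the arithmetic, not measured values, and the function name is hypothetical.

```python
def split_across_gpu(
    params_billion: float,
    bits_per_weight: float,
    vram_gb: float,
    reserve_gb: float = 2.0,  # assumed headroom for KV cache and overhead
) -> tuple[float, float]:
    """Return (GB of weights on the GPU, GB spilled to system RAM)."""
    weights_gb = params_billion * bits_per_weight / 8
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    on_gpu = min(weights_gb, usable_gb)
    return on_gpu, weights_gb - on_gpu


if __name__ == "__main__":
    # 80B model on a 24 GB card, assuming 4-bit weights.
    gpu, spilled = split_across_gpu(80, 4, vram_gb=24)
    print(f"80B: {gpu:.0f} GB on GPU, {spilled:.0f} GB on CPU/RAM")

    # 30B model with 32 GB of unified memory, same assumptions.
    gpu, spilled = split_across_gpu(30, 4, vram_gb=32)
    print(f"30B: {gpu:.0f} GB in unified memory, {spilled:.0f} GB spilled")
```

With these assumed numbers, nearly half of the 80B model ends up outside the 4090, while the 30B model fits entirely in 32 GB of unified memory.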

If you have an M-series Mac, download Ollama (ollama.com) and whatever model(s) fit in your memory, and have fun!
