
pers0n

macrumors regular
Original poster
I'm trying to decide between the M5 Pro and M5 Max. I'm thinking I need 36GB, 48GB, or maybe 64GB (any more than that is too expensive).

I know the AI stuff eats up RAM, but the models eat up disk space too. The 1TB is fine for now, but I have a feeling that in a few years 2TB will barely be enough to store multiple models, unless you keep them on an external drive, which slows things down.

I can't decide whether to get something great now or something that will still be great 5 years from now. I plan to keep this computer for a few years. If it's a Max I'll definitely have to keep it longer, since the cost is very high, but it should also stay relevant longer, although the M7 Pro might outdo the M5 Max, and even the regular M8/M9 might.

Just trying to balance budget with needs, and I'm not sure where AI is going. I'm still learning how to use local models. I don't do any training, but who knows; as I learn more I don't want to be handicapped. It's just that the Max is at an uncomfortable price, though doable if it will pay off.

What do you think?
 
Perspective: I don't do a ton of LLM work. I dabble in it. I do a significant amount with images.

Drive space: Definitely an issue. 1 TB can work, but you have to be proactive about removing models you no longer use. If you can stretch your budget to 2 TB, I'd encourage it. Right now I have about 1.4 TB of actively used models between my LLM and image models. I could probably prune that by about half, maybe a little more, by offloading the ones I use less often to an external drive and only copying them back on-disk when I really need them. That's still 3/4 of a TB, though. But you might find that you gravitate toward only a handful of models, in which case it's less of a concern.

Memory: Each application you use for AI handles memory differently. Some do an excellent job of keeping only what's needed in memory at any given time, but it's still best if you can keep the whole model in memory and avoid swap. I have an MBA with 24 GB of memory, and I can use it effectively for AI, but there's a noticeable speed difference between a model that fits entirely in memory and one that only fits partially. My 128 GB MBP fits pretty much anything I want to throw at it, including when I'm doing local training. Not having enough memory will slow you down, but it's not a showstopper. I'd suggest looking at the models you're interested in running to see what the best fit will be.
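As a rough sketch of that fits-fully vs. fits-partially trade-off (the headroom fraction is my own guess, not an Apple-documented number):

```python
# Rough memory-fit check for a local model (illustrative sketch, not a benchmark).
# Assumes you leave headroom for the OS and other apps; the 0.75 usable
# fraction is an assumption, not a measured figure.

def fits_in_memory(model_size_gb: float, total_ram_gb: float,
                   usable_fraction: float = 0.75) -> str:
    """Classify whether a model of the given size should fit comfortably."""
    usable = total_ram_gb * usable_fraction
    if model_size_gb <= usable:
        return "fits"    # whole model resident, no swapping expected
    elif model_size_gb <= total_ram_gb:
        return "tight"   # fits only if you close everything else
    else:
        return "swaps"   # will page to disk; expect a big slowdown

# Example: a ~20 GB model on a 24 GB MBA like the one mentioned above
print(fits_in_memory(20, 24))   # tight
print(fits_in_memory(20, 64))   # fits
```

The exact thresholds depend on the app and on what else is running, but the three buckets are the useful distinction when you're picking a RAM configuration.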

Pro vs Max: The big difference here is going to be your graphics core count. For my own usage, this is key above all others, because the application I use for image generation (my primary AI usage) scales almost perfectly linearly with graphics cores: 2x the graphics cores gives you almost 2x the generation speed. It uses every graphics core available, and on my Max I'm always at 98% utilization on my graphics cores. For this one, you're going to want to look at the applications you want to use for local AI and see how they utilize the graphics cores. Some AI applications won't engage all the graphics cores, so having more (getting a Max) won't really help you.

I'd recommend looking at how the applications you want to use utilize resources, particularly memory and graphics. I think that will make your choice between Pro and Max obvious. Off-hand, based on what you've mentioned and taking the M5 enhancements to AI processing into account, I would probably go with the M5 Pro with 64 GB of RAM, and a 2 TB drive if I could swing it. However, if you find that your desired applications DO engage all graphics cores, I'd see if there's any way to go with the Max with as much memory as I could reasonably afford.
 
You need to consider which models you intend to use, paying attention to how many billions of parameters they have. The size also depends on the quantization, e.g. are those 4-bit or 8-bit parameters?

For example, "qwen3-coder-next" has 80 billion parameters, but the size depends on the quantization, so it ranges from 52 to 85 GB. As a rule of thumb, figure that you need at least as many GB of memory as the model has billions of parameters.
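The arithmetic behind that rule of thumb is simple (a rough estimate; real model files add overhead for embeddings, metadata, and mixed-precision layers, which is why the quoted range runs a bit above and below it):

```python
# Rough size estimate for an LLM from its parameter count and quantization.
# Real model files vary; the 5% overhead factor is an assumption for
# illustration, not a property of any specific format.

def model_size_gb(params_billions: float, bits_per_param: float,
                  overhead: float = 1.05) -> float:
    """Estimate model size in GB: parameters x bytes-per-parameter x overhead."""
    bytes_per_param = bits_per_param / 8
    return params_billions * bytes_per_param * overhead

# An 80B-parameter model, like the one mentioned above:
print(round(model_size_gb(80, 4), 1))   # 42.0 GB at 4-bit
print(round(model_size_gb(80, 8), 1))   # 84.0 GB at 8-bit
```

At 8 bits per parameter this lands right on "1 GB per billion parameters," which is where the rule of thumb comes from; 4-bit quantization roughly halves it at some cost in quality.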

Small models, e.g. 8 GB in size, work OK on a MacBook Air (if a little slowly), but be aware that smaller models are definitely worse and make more mistakes. I never quite trust them, but they are still handy for simpler questions, etc. In contrast, it's difficult to run the largest models on any consumer hardware. The full DeepSeek R1, for example, has 671 billion parameters.

Next you need to consider the GPU. Your needs here are harder to quantify, but the stronger the better as always, and RAM still counts. With a GPU like a 5090, whatever VRAM it has is all you can realistically use and still have a responsive LLM. Macs fare better in the RAM department because of the unified memory architecture. (A 5090 will do far better than any Mac as long as the model fits in its memory.)

On my M4 Air with 16 GB, an 8 GB model runs well enough to use (I do something else while it thinks for a few minutes). It gets hot, though. On Linux with a 4090, models are much faster: a 30 GB model can write code almost faster than I can watch it scroll. It's mind-blowing the first time you see it. The machines really are coming for us!

Finally, storage isn't really an issue for AI any more than for other large programs like Blender, etc.

All that said, the preferred system for local LLMs on a Mac is a Studio with as much RAM as you can afford. Even a well-equipped Mini would beat most laptops, simply because of thermals.

I use Ollama (open source) for local LLMs. They (ollama.com) host lots of models, too.
 