lol what
PyTorch runs fine on Apple Silicon (via the MPS backend).
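For what it's worth, the usual sanity check looks something like this (a rough sketch of mine, not from anyone's benchmark; the sizes are just placeholders):

```python
# Quick check that PyTorch picks up the Metal (MPS) backend on Apple Silicon.
import torch

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
x = torch.randn(4096, 4096, device=device)
y = x @ x  # the matmul runs on the GPU when device is "mps"
print(device, y.shape)
```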
Local models work fairly well despite the M4's drawbacks (mostly lower inference speed).
M5 should improve this substantially: per-GPU-core neural accelerators, better matrix math, etc. Inference should be significantly faster, depending on how high they push the memory bandwidth and what they do with the interposer and SoIC / chip stacking, if that happens this generation.
There are Metal-optimized libraries that work pretty well, and there is an Apple-funded CUDA -> MLX porting effort in progress.
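MLX itself is already nice to work with; here's a minimal sketch of the Metal-native array API (illustrative only, not part of that porting effort, and the shapes are arbitrary):

```python
# Minimal MLX example: lazy ops evaluated on the GPU through Metal.
import mlx.core as mx

a = mx.random.normal((2048, 2048))
b = mx.random.normal((2048, 2048))
c = a @ b    # builds the op lazily
mx.eval(c)   # forces evaluation on the device
print(c.shape)
```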
This machine is going to be excellent for personal and professional AI development / fine-tuning, and at ~$12k it costs about what a single Nvidia card with 1/5 the GPU memory does.
If you need really big Nvidia-scale runs you can just connect to the cloud, but for local work nothing that isn't 10x the cost will compare to this for a long time.
I say all this as someone who also owns a 5090 that's going into a Linux workstation next week for local CUDA work, by the way.
Now, if you were talking about Apple Intelligence… fair point! I have low faith there, especially given the new rumors that they are not teaming up with Anthropic, which would be an enormous mistake, if true.
Apple’s small models are doing pretty cool stuff, but they’re going to take a couple more years to be really relevant compared to the best of what’s out there now, and that also depends on Apple retaining talent. They are doing groundbreaking R&D with on-device models, particularly in RAM-constrained settings, but everywhere else they are trailing pretty badly.