Yes, but it is not a concern for some applications (like AI inference) because they don't require a lot of host-GPU communication bandwidth.

This is incorrect. Because the data used to run inference models has to be loaded into the eGPU's VRAM first, using just a USB connection results in significantly slower overall processing, and the same bottleneck applies when sending processed data back to the CPU/SoC. It doesn't matter how much memory bandwidth the GPU has if data can only trickle between it and the CPU. Even a smaller model such as Llama 3.1 8B takes up almost 5GB on disk, which is a lot to transfer over a USB connection, and larger 32B-class models load over 17GB into VRAM, which would mean significantly slower overall performance even with a card like an RTX 5090. The bandwidth is needed to load the model onto the GPU as quickly and efficiently as possible.
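As a rough sketch of the load-time argument, here is the arithmetic for pushing model weights over different links. The link speeds are assumed nominal peak figures (real-world throughput is lower), and the 17GB figure is the weight size mentioned above:

```python
# Back-of-the-envelope load times for moving model weights into eGPU VRAM.
# Link speeds are nominal peak figures (assumptions; real throughput is lower).
LINK_GB_PER_S = {
    "USB 3.2 Gen 2 (10 Gb/s)": 1.25,
    "Thunderbolt 4 / USB4 (40 Gb/s)": 5.0,
    "PCIe 4.0 x16": 32.0,
}

def load_time_seconds(model_gb: float, link_gb_per_s: float) -> float:
    """Time to transfer model_gb gigabytes over a link at link_gb_per_s GB/s."""
    return model_gb / link_gb_per_s

for name, speed in LINK_GB_PER_S.items():
    # ~17 GB of weights, per the post above
    print(f"{name}: {load_time_seconds(17.0, speed):.1f} s")
```

Even under these idealized numbers, the USB link takes roughly an order of magnitude longer than a direct PCIe slot for the same weights.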
 
Yes, but loading the model is a one-time cost, which can be amortized if one runs the job for a long time.

Not saying using USB is ideal, but it can be useful for a limited set of use cases.
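The amortization point above can be sketched numerically. The 14-second load time (17GB at an assumed ~1.25 GB/s USB link) and the one-hour job length are illustrative figures, not measurements:

```python
def load_overhead_fraction(load_s: float, run_s: float) -> float:
    """Fraction of total wall time spent on the one-time model load."""
    return load_s / (load_s + run_s)

# ~14 s USB load (17 GB at ~1.25 GB/s, assumed) amortized over a 1-hour job:
print(f"{load_overhead_fraction(14.0, 3600.0):.1%}")
```

For a long-running job the one-time load shrinks to well under 1% of total time, which is the limited set of use cases being described.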
 
Is there an eGPU card for Thunderbolt 5 that can be used for gaming on a Mac Studio M4 Max?
 

Even if the model were running 24/7, transferring data back and forth between the CPU and GPU over USB would still incur a significant penalty on the rate at which results are returned.
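The steady-state penalty can be bounded with the same kind of arithmetic: the link caps how many requests per second can cross it, regardless of GPU speed. Both the 50 MB-per-request figure and the ~1.25 GB/s USB rate below are hypothetical assumptions for illustration:

```python
def max_requests_per_s(mb_per_request: float, link_gb_per_s: float) -> float:
    """Upper bound on request rate imposed by the host<->GPU link alone."""
    return (link_gb_per_s * 1000.0) / mb_per_request

# Hypothetical workload moving 50 MB per request over a ~1.25 GB/s USB link:
print(f"{max_requests_per_s(50.0, 1.25):.0f} requests/s ceiling")
```

Whether this bound bites depends on how much data the workload actually moves per request; it is a ceiling set by the link, not a measurement.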
 
eGPUs are going the way of SLI: an interesting option, but not feasible in the long run. For gamers, there are definitely cheaper alternatives, be they gaming PCs, consoles, or handhelds like the Steam Deck. For non-gamers who want the processing power, again, there are better tools for the job.
 