
sunny5

macrumors 68000
Original poster
Jun 11, 2021
1,845
1,713
Let's say M1 Max has 64GB unified memory. Theoretically, can M1 Max's GPU have up to 64GB VRAM?
 
There is no VRAM in the unified memory model. All RAM is usable by the GPU.
 
As others have said, it's not a simple yes, because the RAM is a shared pool. But essentially the GPU can, in theory, use it all. The biggest pro is not having to hold duplicate assets in both RAM and VRAM as on a Windows PC, so in theory a well-optimized application will need less RAM overall.

400GB/s is actually a lot of bandwidth: roughly RTX 3070 levels of memory bandwidth. But unlike on a desktop gaming rig, the CPU shares that bandwidth and gets to use that insane speed too. Most desktops get ~30GB/s of memory bandwidth from dual-channel system RAM, so 400GB/s is a massive increase.
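
For anyone who wants to sanity-check those numbers, here is a rough sketch of the usual peak-bandwidth arithmetic (transfer rate times bus width). The hardware figures in the table are commonly quoted spec values that I am assuming, not something stated in this thread, and `peak_bandwidth_gb_s` is just a name made up for the example. These are theoretical peaks, not measured throughput.

```python
def peak_bandwidth_gb_s(mega_transfers_per_sec, bus_width_bits):
    """Theoretical peak bandwidth in GB/s: transfers per second x bytes per transfer."""
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_sec * 1e6 * bytes_per_transfer / 1e9

# ASSUMPTION: commonly quoted spec figures, not taken from this thread.
configs = {
    "Dual-channel DDR4-2133 (2 x 64-bit)": (2133, 128),
    "Dual-channel DDR4-3200 (2 x 64-bit)": (3200, 128),
    "M1 Max LPDDR5-6400 (512-bit)": (6400, 512),
    "RTX 3070 GDDR6 at 14 Gbps per pin (256-bit)": (14000, 256),
}

for name, (mt_s, bits) in configs.items():
    print(f"{name}: {peak_bandwidth_gb_s(mt_s, bits):.1f} GB/s")
```

Under those assumptions the M1 Max lands at about 410 GB/s and the 3070 at 448 GB/s, which fits the "approximately RTX 3070 levels" comparison, while dual-channel DDR4 sits somewhere in the 30-50 GB/s range.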
 
The whole idea of needing VRAM is that the data path to the GPU is slow, so you use VRAM to hold data on the GPU side of the bottleneck. Once the bottleneck is removed, the need for VRAM goes away. Yes, all of the RAM can be used for any purpose, and the data never needs to move across a PCIe bus.
 
It’s not so simple, as RAM would still be needed by the system, but essentially yes.

Theoretically, yes.

As with all technical questions, the answer gets muddy.
With this one I don’t think so. The GPU has access to all 64GB regardless of what the CPU is doing. There is no RAM partitioning like on Intel chips. UMA allows 100% access to both GPU and CPU at the same time.
 
Not quite: it’s true that there isn’t a strict partition, but there does appear to be a limit on how much RAM the GPU can subscribe, in order to ensure that the CPU has some amount of space for things it might need. So there are no duplications, and the CPU and GPU can both see the entire RAM stack, but the GPU has limits on how much it can use solely for itself. How this works under the hood is probably pretty interesting.
 
Where have you seen this? Anything I can read/watch to learn more?
 
A quick game test: Total War: Three Kingdoms sees the graphics card as having 43GB of VRAM...
If I'm not wrong, macOS controls how much RAM each process is allocated as 'VRAM'. While Total War: Three Kingdoms may see 43GB, macOS itself would also need some 'VRAM' for its GUI rasterisation.

In any case, the concept of 'VRAM' is no longer relevant on macOS running on UMA Macs. Processes can ask for as much as macOS allows.
 
Are there any apps that can accurately monitor VRAM usage? I'm just curious to see what happens when I enable unlimited video memory in Total War: Three Kingdoms.
 
All the reported figures put the GPU-accessible share at somewhere between 60% and 68.5% of total RAM.
I hope they allow a higher percentage for the Mac Pros with more than one Apple Silicon SoC.
 
Hmmm, nothing comprehensive unfortunately. It’s just been noted that when applications report how much GPU RAM they can make use of, it is always around 2/3 of the total. In contrast, I’ve not seen any such limit on a normal process. So in practice there is a limit to how much RAM a program can dedicate to the GPU. However, it's unclear exactly how this limit works or what happens if multiple programs are in use. I’ll see if I can dig up something more concrete.

Edit: @Boil’s post above linked to a post from @singhs.apps that’s pretty good, but I don’t think anyone has done a full technical analysis - at least not that I’ve found.
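
As a quick check of the "around 2/3" observation, here is a small sketch of the arithmetic. The only measured data point is the 43GB that Total War reported on a 64GB machine earlier in this thread. The per-configuration figures in the loop are estimates that assume the 2/3 rule of thumb carries over, and `gpu_share` is a hypothetical helper name.

```python
def gpu_share(reported_gb, total_gb):
    """Fraction of total RAM that an app reports as GPU-usable."""
    return reported_gb / total_gb

# The one concrete data point from this thread: 43GB reported on a 64GB machine.
share = gpu_share(43, 64)
print(f"43GB of 64GB is {share:.1%}")  # prints "43GB of 64GB is 67.2%"

# ASSUMPTION: if the ~2/3 rule of thumb held on other configurations,
# these would be the expected figures. They are estimates, not measurements.
for total_gb in (16, 32, 64):
    print(f"{total_gb}GB total -> ~{total_gb * 2 / 3:.1f}GB GPU-usable (estimate)")
```

So the 43GB figure sits just above 2/3 and inside the 60-68.5% range mentioned above.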
 
Hmmm, nothing comprehensive unfortunately. It’s just been noted that when applications report how much GPU RAM they can make use of, it is always around 2/3 of the total. In contrast, no such limit seems to exist for assigning memory to be processed by the CPU. So in practice there is a limit to how much RAM a program can dedicate to the GPU. However, it's unclear exactly how this limit works or what happens if multiple programs are in use. I’ll see if I can dig up something more concrete.
Most likely because CPU processes can make use of virtual memory, while GPU processes should not? I would think GPU processes should only be wired to physical memory space while the CPU can make use of the MMU.
 
Not that I know of, though. I'm not sure whether macOS has APIs that allow querying other processes' memory usage.
 
I dunno 🤷‍♂️ maybe - it’s a good hypothesis. Unfortunately I don’t have one of my own to test on and try to figure out the contours of these kinds of things, so there’s a limit to what I can say beyond reading what others have written. You know … it could also be an interesting question to ask the Asahi Linux people, when they finally get their graphics driver stack finished, what limits, if any, there are on their end.
 
AFAIK there is no direct way to query how much memory a Metal device actually 'has', but the MTLDevice protocol does offer a property called recommendedMaxWorkingSetSize, described as 'An approximation of how much memory, in bytes, this device can use with good performance.' I suspect most programs use this to probe how much memory a Metal device can use. It does not equal the total memory available to the system; it is a practical upper limit that a Metal device can use without a performance penalty.

[Attachment: Screen Shot 2021-11-03 at 10.17.13.png]
 
Interesting … however, this apparent limit seems to exist in OpenCL too, though as has been noted, OpenCL is just a Metal wrapper these days.

Edit: I’m not familiar with Metal, but I know that OpenCL and CUDA have the ability to create different memory pools for device and host even in a “unified memory” programming model … can someone try the same in macOS and fill up the RAM with “GPU memory”?
 
can someone try the same in macOS and fill up the RAM with “GPU memory”?
[Attachment: Screen Shot 2021-11-03 at 10.58.12.png]

A quick and dirty test showed that I can have an allocated size even larger than the physical RAM by creating zero-filled buffers. The max buffer length on M1 is 8GB, so I created three buffers for 24GB in total. At least the playground does not crash on my 16GB M1 Mac, and the MTLDevice did report that 24GB had been allocated. Maybe we could test again with MTLHeap and actually fill it with something other than zeros to see if the result differs.
 
Regarding “maximal available memory”, there is no limit. Metal has a parameter returning “maximal memory you can use with good performance” which is probably the 43GB people have reported, but it’s just a guideline. You can allocate and use much more than that, it’s just that you might run into some swapping along the way.

@Gnattu: allocating is not enough. Buffers are allocated lazily, so you actually have to use them to have them pinned in memory. It’s the same on Intel Macs. You can allocate whatever you want with Intel or AMD GPUs, but if you, say, start a compute kernel that binds buffers whose total size is larger than the GPU can handle, the app will crash. I expect no crashes on Apple Silicon, just swapping. Someone should test it. Still waiting for my Max.
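
To illustrate what "allocated lazily" means, here is a CPU-side analogy in Python. This is not Metal and says nothing about how Metal buffers are actually implemented. It only sketches the general virtual-memory behaviour, assuming a typical OS that hands out address space first and backs pages with physical memory only once they are touched. `reserve` and `touch` are names made up for this example.

```python
import mmap

PAGE = mmap.PAGESIZE
SIZE = 64 * 1024 * 1024  # 64 MiB of address space, kept small on purpose

def reserve(size):
    """Create an anonymous mapping: address space is reserved, pages are not yet in use."""
    return mmap.mmap(-1, size)

def touch(buf, n_pages):
    """Write one byte per page so the OS has to back those pages with real memory."""
    for i in range(n_pages):
        buf[i * PAGE] = 1
    return n_pages * PAGE

buf = reserve(SIZE)
touched = touch(buf, 16)
print(f"reserved {len(buf)} bytes, touched {touched} bytes")
buf.close()
```

The point of the analogy: a large "allocated" number by itself does not show that the memory is resident, which is why a zero-filled buffer test would need to actually read or write the buffers from a kernel to prove anything.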
 