
Flowstates (original poster):
Nuff said. Are there any ML-heads on the forum, and any insights on picking this up instead of a Mac Studio?
 
[Attached image: DGXSpark-fresh.jpg]


Still early days. It's a spiffy little machine.

I had the reservation in for the dual-unit bundle, but release day came and went without a whisper, so I bought mine from MC.

I really haven't had the opportunity to run it through proper loops.

My M2 Max Studio is still extremely performant... I am waiting for an M5/M6 Studio Max/Ultra before I think about replacing it.
 
@splifingate, mind sharing your workflow? I have the invoice drafted for two, but I'm still on the fence after the reviews outlining weak inference performance.

As yet, I don't have a 'workflow': this is just a learning machine for me (to understand the mechanisms and paradigms).

Initially, I had decided on getting two DGXs (rather than investing in an RTX Pro 6000).

Though I am missing out on the higher-end networking features (and, by virtue of that, all the learning available in that space) by going solo, I have concluded that one is enough for me to learn the majority of what I thought I might need to know.

Depending on your previous experience, you may find that the DGX is not going to pull ahead in inference. Think of it as a whole-package learning machine that excels at the baseline: while not particularly exemplary in any one area, it provides a well-rounded, performant environment.

For these reasons (and a few more) I have decided to let my dual-setup reservation lapse.

I can always get another (or more) in the future ;)
 

Thank you for sharing. I feel more at ease letting the reservation lapse reading that you did the same. Saving up for the M5 Studio and getting the DGX next calendar year, then.
 

Yes; already using the DGX (on my desk), I shed absolutely no tears when the "Exciting news—the NVIDIA DGX Spark Bundle you reserved (Reservation #17xxx) is now available for purchase." email arrived.

Part of the excitement was that I would have these items in my hand(s) before general release.

Ultimately, I don't really GAS. The lack of promptness is, however, telling.

You may not really need the DGX, but the CUDA integration might tip that scale ;)

Full-Stack seemed important for my introduction into the Arena. I may have more to share after I find the space to work with it more.
 

I have a few things I want to achieve with VLMs and point clouds; we'll see when I get to it. The to-do list is daunting.

Would love to hear more from your experience.
 

I'm currently teaching FT, working through my MAT, and being the GC for my house reno(s)... this leaves little time for my inner geek to get out ;)

I will share what I can (when I can).
 

No worries, life is a trip. Good luck on whatever endeavors hide behind the acronyms.
 
Nuff said. Are there any ML-heads on the forum, and any insights on picking this up instead of a Mac Studio?
Considering both right now. The difference in RAM vs. the models I'd be able to run is what I am working through now. I hate to say it, but I'm looking at an M3 Ultra from China (scary...).
I am making a bit of a spreadsheet of which (x)-billion-parameter models run within which machine's RAM limit. Then I'm going to decide: “Do I really need to run the 120B vs. the 70B?”
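A rough sketch of the arithmetic behind such a spreadsheet (hedged: the bytes-per-weight table and the ~20% overhead factor are common rules of thumb assumed here, not figures from this thread):

```python
# Rule-of-thumb RAM estimate for running a quantized LLM locally:
# parameter count x bytes per weight, plus overhead for the KV cache
# and runtime buffers. All constants here are assumptions.

BYTES_PER_WEIGHT = {"fp16": 2.0, "q8": 1.0, "q4": 0.5}

def est_ram_gb(params_b: float, quant: str = "q4", overhead: float = 1.2) -> float:
    """Estimated resident memory (GB) for a params_b-billion-parameter model."""
    return params_b * BYTES_PER_WEIGHT[quant] * overhead

for size_b in (70, 120):
    print(f"{size_b}B @ q4: ~{est_ram_gb(size_b):.0f} GB")
# 70B @ q4: ~42 GB (fits in 64 GB); 120B @ q4: ~72 GB (wants 96-128 GB)
```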
 
I’ve been doing a lot of that myself recently. I’ve downloaded a lot of models over the months since I got my Studio, and not all models are the same.

Running the larger-parameter models is more of a “because you can” than a “because they’re better”.

In general terms, the sweet spot seems to be around 30B models. But your mileage may vary depending on your project.
 
This is what I am wondering. The only reason to go M3 Ultra 512GB would be for deepseek-r1:671b, qwen3:235b, or similar, at this time.
 
Another reason for more RAM is if you want to run more than one AI at the same time.

Example:

I use LLMs to help with my story project. One experiment I played around with was to have a Python script send a story scene to an LLM to get it to create an image-generation prompt for the key moment of the scene. My script would then send that prompt to an AI image generator (A1111) to generate the image, which would then be inserted into the scene file - kind of automatically illustrating the story.
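A minimal sketch of what such a script could look like, assuming a local Ollama server for the LLM and an AUTOMATIC1111 (A1111) instance launched with --api; the model name and file paths are placeholders, since the post doesn't specify the actual setup:

```python
# Scene -> image-gen prompt -> image, roughly as described above.
import base64
import requests

def scene_to_image_prompt(scene_text: str) -> str:
    """Ask a local LLM to distill a scene into an image-generation prompt."""
    r = requests.post(
        "http://localhost:11434/api/generate",  # Ollama's default port
        json={
            "model": "llama3.1",  # placeholder model name
            "prompt": "Write a one-line image-generation prompt for the key "
                      f"moment of this scene:\n\n{scene_text}",
            "stream": False,
        },
    )
    return r.json()["response"].strip()

def generate_image(prompt: str, out_path: str) -> None:
    """Send the prompt to A1111's txt2img endpoint and save the first image."""
    r = requests.post(
        "http://localhost:7860/sdapi/v1/txt2img",  # A1111 default port, needs --api
        json={"prompt": prompt, "steps": 25, "width": 768, "height": 512},
    )
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(r.json()["images"][0]))

scene = open("scene_042.txt").read()  # hypothetical scene file
generate_image(scene_to_image_prompt(scene), "scene_042.png")
```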

Depending on model size, you still don’t need a huge amount of memory for this, but the more you have the more you can experiment with larger models.

The largest LLMs I’ve run on their own have been over 100GB. Some really large models are quantised heavily to make this work, so the practicality doesn’t really live up to the capability.

Even if you had 512GB, I suspect one day you’d still be wishing for more.

There’s a lot of work being done to make large capable models work on lesser RAM. Experimentation is key to find out what works best for you. For me, I'm experimenting with training local LLMs. To train larger models, you need lots of RAM.
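For a sense of why training is so much hungrier than inference, a back-of-the-envelope sketch (the ~16 bytes per parameter for full fine-tuning with Adam in mixed precision is a commonly cited rule of thumb, assumed here rather than taken from this thread):

```python
# Full fine-tuning keeps fp16 weights and gradients plus fp32 master weights
# and two Adam moments: roughly 16 bytes per parameter, before activations.

def full_finetune_gb(params_b: float, bytes_per_param: float = 16.0) -> float:
    return params_b * bytes_per_param  # billions of params x bytes = GB

for size_b in (7, 30, 70):
    print(f"{size_b}B full fine-tune: ~{full_finetune_gb(size_b):.0f} GB + activations")
# LoRA/QLoRA train small adapters instead, which is what makes local training feasible.
```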

My recommendation is to not go too far over your budget today, because you’ll probably be wanting to upgrade in three years’ time anyway.
 
This is what I am wondering. The only reason to go M3 Ultra 512GB would be for deepseek-r1:671b, qwen3:235b, or similar, at this time.
One additional thought that may sway you is context length. Many LLMs these days have large context lengths, so they can handle larger documents or longer chats without losing track of things. Context length requires RAM.

Example:

I can load up Grok-2 on my 128GB M4 Max Mac Studio, but it takes up 112GB RAM (IQ3_XS). It has a context length of 128K tokens but, realistically, I can only use about 8K of them (maybe a little more).

Elsewhere, the model qwen2.5-7b-rrp-1m has a 1M context length but, as it only takes up 4.46GB RAM, I can - in theory - use all 1 million tokens.
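To make the RAM cost of context concrete, a rough KV-cache estimate (the layer/head/width numbers below describe a typical GQA transformer and are illustrative only, not the actual Grok-2 or Qwen configurations):

```python
# Each prompt or generated token caches keys and values in every layer, so
# KV-cache memory grows linearly with context length.

def kv_cache_gb(tokens: int, n_layers: int = 32, n_kv_heads: int = 8,
                head_dim: int = 128, bytes_per_elem: int = 2) -> float:
    """Memory (GiB) to cache keys+values for `tokens` tokens at fp16."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem  # K and V
    return tokens * per_token / 1024**3

for ctx in (8_192, 131_072, 1_000_000):
    print(f"{ctx:>9,} tokens -> {kv_cache_gb(ctx):6.1f} GiB of KV cache")
# ~1 GiB at 8K, ~16 GiB at 128K, ~122 GiB at 1M for this config, which is
# why a huge advertised context may not be usable once the weights are loaded.
```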
 
Another reason for more RAM is if you want to run more than one AI at the same time.
Funny, that was my exact thought waking up this morning. I realized that more RAM would give me the headroom to run simultaneous tasks.
I agree not to overbuy, since in three years better will be available. Are you happy with the 128GB? Do you think you'll really be able to put it to use for the next three years without feeling that your work is suffering?
 
Fortunately, my use-case for the Studio is more hobby-oriented. I love to learn and push tech (remembering back to a ZX81 with a 16K RAM pack that I just *had* to push to its limits!). The “restrictions” of 128GB aren’t really making my workflow “suffer” because there are many alternatives using less RAM.

Yes, I’d love more. Example:

[Attached screenshot: Image 27-10-2025 at 4.12 pm.png]

But, right now in 2025, “more” doesn’t mean “better”.

I think we need to see much higher memory bandwidth before we’re going to be satisfied with more RAM. The Max has 546GB/s memory bandwidth, the Ultra has 819GB/s, and both are pretty slow compared to the Nvidia H100 with 3.35TB/s (or so I’ve read - I have zero experience of anything like that).

In a few years’ time, it won’t just be the RAM size that we’ll be chasing, but the speed. Spending £15K on a 512GB M3 Ultra, then filling it up with a massive LLM, only for it to dawdle along at 819GB/s memory bandwidth, wouldn’t be fun, I think.

So we need to restrain ourselves and limit our expectations. Do as much as we can with what we can afford today, but be ready to upgrade before we think we’ll need to.
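A quick way to see that ceiling: decoding a dense model streams roughly the whole weight set once per generated token, so bandwidth divided by resident model size gives an upper bound on tokens per second (the 400GB model size below is hypothetical, chosen to nearly fill a 512GB machine; MoE models touch fewer weights per token and do better):

```python
# Upper bound on decode speed for a dense model: every token reads ~all weights.

def max_tokens_per_sec(model_gb: float, bandwidth_gb_s: float) -> float:
    return bandwidth_gb_s / model_gb

for name, bw in [("M4 Max", 546), ("M3 Ultra", 819), ("H100", 3350)]:
    print(f"{name:>8}: ~{max_tokens_per_sec(400, bw):.1f} tok/s on a 400 GB model")
# ~1.4, ~2.0 and ~8.4 tok/s respectively: more RAM without more bandwidth dawdles.
```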
 

Apparently all is not well in DGX Spark-land... Anyone here with a unit able to corroborate Carmack's claims?
 


I can't speak to the power draw, but I can definitely say that all (three) reboots have been initiated by me :)
 