Nvidia was the first to announce a desktop AI solution, the DGX Spark, but AMD was the first to market with the Strix Halo. I was really excited when I heard about the DGX Spark and its 128GB of usable VRAM; it wasn't until AMD launched the Strix Halo out of nowhere that I realized how disappointing these devices are. The DGX Spark was expected to be about 10-15% faster than the Strix Halo due to its faster RAM, but reality is far from that.
These devices use shared RAM, similar to what Apple does with the Mac. So while the memory bandwidth is far higher than typical CPU-only solutions, it is still considerably slower than modern GPUs. For example, the Strix Halo's memory bandwidth is rated at around 253GB/s while the 5090's is 1,792GB/s. The difference in speed explains why these devices are so much slower than pure GPU VRAM, but having 128GB of VRAM allows you to run far larger models.
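To make that concrete: token generation is usually memory-bandwidth bound, since every generated token has to stream the active weights from memory, so a rough ceiling is bandwidth divided by bytes read per token. Here's a quick back-of-the-envelope sketch in Python; the 60GB model size is a made-up example, not a measured figure.

```python
# Back-of-the-envelope decode speed. Token generation is usually
# memory-bandwidth bound: each new token streams the active weights
# from memory, so tokens/sec is roughly bandwidth / bytes per token.
# Bandwidths are the rated figures from this post; the 60 GB model
# size is a hypothetical example, not a measurement.

def est_tokens_per_sec(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Optimistic ceiling: assume every weight byte is read once per token."""
    return bandwidth_gb_s / weights_gb

for name, bw in [("Strix Halo", 253), ("RTX 5090", 1792)]:
    print(f"{name}: ~{est_tokens_per_sec(bw, 60):.1f} tok/s on a 60 GB model")

# Strix Halo: ~4.2 tok/s, RTX 5090: ~29.9 tok/s (ignoring that 60 GB
# wouldn't fit in the 5090's 32 GB of VRAM). MoE models only read their
# active experts per token, which is why real numbers can come in higher.
```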
After looking at the reviews for the DGX Spark, it's actually laughable how bad it is.
This is the same model I run on my Strix Halo: the Spark gets 94.67 tokens/sec for prompt processing and 11.66 tokens/sec for token generation. My current speeds, without my Nvidia 3090 hooked up, are 793.50 tokens/sec for prompt processing and 45.88 tokens/sec for token generation. That's over 8x the prompt processing speed and nearly 4x the token generation speed. The funny thing is, the Strix Halo is half the price of the Nvidia DGX Spark.
Current speeds with my Strix Halo
Current speeds with my Strix Halo & Nvidia 3090 hooked up via OCuLink
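For anyone who wants to sanity-check those multipliers, the ratios fall straight out of the numbers above:

```python
# Speedup ratios from the benchmark numbers quoted in this post.
spark = {"pp": 94.67, "tg": 11.66}    # DGX Spark, tokens/sec
halo = {"pp": 793.50, "tg": 45.88}    # Strix Halo (no 3090), tokens/sec

for key, label in [("pp", "prompt processing"), ("tg", "token generation")]:
    print(f"{label}: {halo[key] / spark[key]:.1f}x faster on the Strix Halo")
# prompt processing: 8.4x faster; token generation: 3.9x faster
```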
The Spark's speeds look so bad I can't imagine it being practical for anything. Unless they get those speeds up 200-400%, I can't see it being usable even for testing.
My Frankenstein Strix Halo w/ Nvidia 3090.
50 tokens/sec is very usable and sufficient for testing and even some production use. I mostly use my Strix Halo for testing and experimenting; most of my production work is done through cloud APIs for performance reasons. Once I get my new project's proof of concept working and show it is profitable, I will build a private AI solution at a much larger scale.
Nvidia images are pulled from the Nvidia website.
😍
But sHaReHoLdEr VaLuE!
My little Minisforum box should arrive today. While I won't be using it for AI, I am looking forward to the opportunity to learn Proxmox and get some LXC containers running.
Okay... but as someone who was at Nvidia Headquarters today I feel like I have to play Devil's Advocate here just a little 😁
The AMD Strix doesn't have CUDA support, which can be a deal breaker for A LOT of developers. And the Spark's 128GB of RAM is available to both the CPU and GPU due to the Grace Blackwell architecture. I believe the Spark can handle up to 200B parameter models as well, whereas the Strix can do up to 70B parameter FP16 models. Not to mention, if you have 2 Sparks connected over their ConnectX-7 ports you can run even larger models.
Now that being said, the "Founders Edition" of the Spark is $4,000 and the Strix is in the $2,000 range, I think.
But other OEMs like ASUS are selling the Spark for around $3,000, so mileage may vary.
I paid $1,800 for my Strix and I run a 120B Q8 model at 50 tokens/sec. I have run Qwen3 235B as well.
CUDA is becoming less of a deal breaker with ROCm improving rapidly.
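If you want to reproduce a rough tokens/sec figure like that yourself, here's a minimal sketch using the llama-cpp-python bindings. The model filename is a placeholder for whatever GGUF you actually run; on a Strix Halo you'd use llama.cpp's Vulkan or ROCm build to get GPU offload.

```python
# Rough throughput check with llama-cpp-python (pip install llama-cpp-python).
# MODEL_PATH is a placeholder; point it at your own GGUF file.
import time
from llama_cpp import Llama

MODEL_PATH = "gpt-oss-120b-Q8_0.gguf"  # hypothetical filename

llm = Llama(model_path=MODEL_PATH, n_gpu_layers=-1, n_ctx=4096, verbose=False)

start = time.time()
out = llm("Explain memory bandwidth in one paragraph.", max_tokens=256)
elapsed = time.time() - start

n_tokens = out["usage"]["completion_tokens"]
print(f"{n_tokens} tokens in {elapsed:.1f}s -> {n_tokens / elapsed:.1f} tok/s")
```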
I don't take the Strix or the Spark seriously; in my opinion they are both toys.
Well yeah, they are toys compared to enterprise infrastructure. But both of those run on normal 120V power outlets, so they can be used in everyday homes.
A DGX runs on 240V C19/C20 cables, so it's not really an option for 'normal' people.
So get 2 Spark devices connected over their ConnectX-7 ports and you can run any 'consumer-grade' model.
And CUDA being less of a deal breaker is true, but going from 99% market share to 94% is "less" too; it doesn't mean enterprise AI developers have stopped utilizing CUDA :)
But enough Devil's Advocate. The Spark is cool and is a super efficient tool to run consumer AI models, but so is the AMD Strix.