
Discussion (7 Comments, from HackerNews)

zamadatix 35 minutes ago
An underrated factor of the Strix Halo is that it's also just a "normal" x86 PC. If all you want to do is run AI, then these CPU+RAM based solutions tend to take a ton of $ to get lackluster performance. It does work, but you really have to know how it's going to work ahead of time. There's no driver/OS/ARM-vs-x86 compatibility to worry about hacking around to get it to do anything else if you don't need a local LLM 100% of the time - it's just a typical small workstation.
data-ottawa about 1 hour ago
I have the Framework Desktop with the 395+ and 128GB of RAM.

Today I am pretty happy with it. LLMs are finally good enough (fast enough with MTP+MoE, but also just much better in capability) that I can fit local ones into real tasks, and I've used image generation with invokeAI to do some genuinely useful things like rendering concepts for a renovation.

I mostly use lemonade-server and invokeAI for my workloads. Previously I used llama-swap, but lemonade is just an easier system to manage. ROCm is finally usable.

Up until the end of Q1 2026 it felt like a total waste of money, largely due to AMD. ROCm was unusable all of last year; there was an entire month where PyTorch crashed just trying to multiply two matrices due to AMD Linux driver issues. kyuz0's toolboxes were really the only way to do anything on the machine.

Thankfully things are in a good state now, finally.

I probably only actually need ~64GB of RAM. There aren't a ton of high-parameter-count MoEs with a small enough active set to feel nice to use. But it is nice that I can have many models or different modalities in memory at the same time, which is what the LMX Omni "models" do.

The numbers in the article for gptOSS feel a little irrelevant now. Prompt processing is definitely an issue, and diffusion is very, very slow. PP speed hits you hard if you run an agent and try to compact context. Realistically most files are not large enough for it to be a huge deal, but it does make large-scale agentic work slow.
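
To make the context-compaction point concrete, here is a rough back-of-the-envelope sketch. It assumes compaction means re-reading the whole context at prompt-processing (PP) speed and then generating a summary at decode speed. The function name and every number in the example are hypothetical placeholders, not measurements from the article or from this machine.

```python
def compaction_seconds(context_tokens, summary_tokens, pp_tokens_per_s, decode_tokens_per_s):
    """Estimate wall-clock time for one context compaction.

    Assumed model (a simplification): the full context is processed once at
    prompt-processing speed, then the summary is generated at decode speed.
    """
    if pp_tokens_per_s <= 0 or decode_tokens_per_s <= 0:
        raise ValueError("speeds must be positive")
    prefill_time = context_tokens / pp_tokens_per_s      # reading the old context
    decode_time = summary_tokens / decode_tokens_per_s   # writing the summary
    return prefill_time + decode_time


# Hypothetical inputs: 100k-token context, 2k-token summary,
# 500 tok/s prompt processing, 40 tok/s generation.
estimate = compaction_seconds(100_000, 2_000, 500, 40)
print(f"{estimate:.0f} s per compaction")
```

With these placeholder inputs the prefill term is four times the decode term, which is why PP speed tends to dominate long agentic sessions even when generation speed feels fine.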

throwa356262 about 2 hours ago
I believe Strix Halo can do much better than these numbers.

The first Spark software update improved performance significantly. Maybe AMD's software team can get their act together and do the same? :)

vanillax about 2 hours ago
I've been looking for a bench like this, thanks for sharing!
AlotOfReading about 2 hours ago
I've been pretty disappointed with how horrifically memory-bound the Spark is. By all rights it feels like it should blow Strix Halo and Apple hardware out of the water, but it's completely hobbled by its low memory bandwidth.
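
A minimal sketch of why memory bandwidth caps generation speed, assuming the common rule of thumb that each generated token has to stream every active weight from memory once. The function name and the numbers are illustrative assumptions, not vendor specs or benchmark results, and real throughput will land below this ceiling.

```python
def decode_ceiling_tokens_per_s(bandwidth_gb_s, active_params_billions, bytes_per_param):
    """Upper bound on decode speed for a memory-bound model.

    Assumption: generating one token reads all active parameters once, so
    tokens/s cannot exceed bandwidth divided by bytes read per token.
    KV-cache traffic and compute limits are ignored.
    """
    bytes_per_token = active_params_billions * 1e9 * bytes_per_param
    if bytes_per_token <= 0:
        raise ValueError("active parameter bytes must be positive")
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Hypothetical inputs: 200 GB/s of memory bandwidth, an MoE with
# 10B active parameters, 4-bit weights (0.5 bytes per parameter).
print(decode_ceiling_tokens_per_s(200, 10, 0.5))
```

Under this model, doubling the active parameter count halves the ceiling regardless of how much compute the chip has, which is the sense in which a fast GPU can still be hobbled by its memory.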
nixosbestos about 2 hours ago
Wow the timing, I spent a few hours trying to get my head around these three choices last night. Got to roughly similar conclusions. And I still don't know what to do.

I have more money than sense. I don't even know (yet) how silly it is, but maybe a Strix Halo, with 2x 5090s with p2p pcie patch? I'd do more 5090s but the power consumption and need to water cool is too much for me.

I'm also itchy because it seems like Ryzen Pro 495 is coming soon with even more unified RAM... (thoughts very much appreciated on any of this...)

data-ottawa about 1 hour ago
AMD ROCm has come a long long long ways since last year, but you'll probably be happier not dealing with AMD's software.

I posted another comment about my experience with the AMD 395+. I'm happy overall and it's usable now, but it's only useful for models under 64GB of VRAM due to the active parameter counts on larger MoEs.

If you add 2x 5090s, do you actually need the base system?