Discussion (7 Comments), original on HackerNews
Today I am pretty happy with it. LLMs are finally good enough (fast enough with MTP+MoE, but also just much better in capability) that I can fit local ones into real tasks, and I've used image generation with invokeAI to do some genuinely useful things like rendering concepts for a renovation.
I mostly use lemonade-server and invokeAI for my workloads. I previously used llama-swap, but lemonade is just an easier system to manage. ROCm is finally usable.
Up until the end of Q1 2026 it felt like a total waste of money, largely due to AMD. ROCm was unusable all of last year; there was an entire month where PyTorch crashed just trying to multiply two matrices due to AMD Linux driver issues. kyuz0's toolboxes were really the only way to do anything on the machine.
Thankfully things are in a good state now, finally.
I probably only actually need ~64 GB of RAM. There aren't a ton of high-parameter-count MoEs with a small enough active set that they feel nice to use. But it is nice that I can have many models or different modalities in memory at the same time, which is what the LMX Omni "models" do.
The numbers in the article for gptOSS feel a little irrelevant now. Prompt processing is definitely an issue, and diffusion is very, very slow. PP speed hits you hard if you run an agent and try to compact context. Realistically most files are not large enough that it's a huge deal, but it does make large-scale agentic work slow.
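To put a rough shape on the prompt-processing point: the wait before the first output token is approximately prompt length divided by PP speed, so compacting a long agent context costs far more than reading a single file. The sketch below is only that arithmetic. The function name and every number in it are hypothetical placeholders, not measurements from this machine or from the article.

```python
def prefill_seconds(prompt_tokens: int, pp_tokens_per_s: float) -> float:
    """Rough time to process a prompt before the first output token.

    Assumes PP speed is constant over the whole prompt, which is optimistic:
    in practice throughput tends to drop as the context grows.
    """
    if pp_tokens_per_s <= 0:
        raise ValueError("pp_tokens_per_s must be positive")
    return prompt_tokens / pp_tokens_per_s


if __name__ == "__main__":
    # Hypothetical PP speeds (tokens/s) and prompt sizes, for illustration only.
    for rate in (200.0, 1000.0):
        for tokens in (4_000, 100_000):
            secs = prefill_seconds(tokens, rate)
            print(f"{tokens:>7} tokens at {rate:>6.0f} tok/s: {secs:7.1f} s")
```

Plug in the PP speed you actually measure on your own setup; the point is just that the cost scales linearly with the context being reprocessed.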
The first Spark sw update improved performance significantly. Maybe AMD's software team can get their act together and do the same? :)
I have more money than sense. I don't even know (yet) how silly it is, but maybe a Strix Halo, with 2x 5090s with p2p pcie patch? I'd do more 5090s but the power consumption and need to water cool is too much for me.
I'm also itchy because it seems like Ryzen Pro 495 is coming soon with even more unified RAM... (thoughts very much appreciated on any of this...)
I posted another comment with my experience with the AMD 395+. I am overall happy and it's usable now, but it's only useful for models under 64 GB of VRAM due to the active parameter counts on larger MoEs.
If you add 2x 5090s, do you actually need the base system?
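For anyone wondering why active parameter count matters more than total size here: token generation is mostly memory-bandwidth-bound, so a common back-of-envelope ceiling is bandwidth divided by the bytes of active weights read per token. The sketch below is that estimate only. The 256 GB/s bandwidth figure, the 4-bit (0.5 bytes per parameter) quantization, and the model sizes are assumptions for illustration, and real throughput will be lower once KV cache reads and overhead are counted.

```python
def weights_gb(total_params_b: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights, in GB (params given in billions)."""
    return total_params_b * bytes_per_param


def decode_tokens_per_s(active_params_b: float,
                        bytes_per_param: float,
                        bandwidth_gb_s: float) -> float:
    """Upper bound on generation speed if every active weight is read once per token.

    Ignores KV cache traffic, compute limits, and runtime overhead.
    """
    bytes_per_token = active_params_b * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token


if __name__ == "__main__":
    BANDWIDTH_GB_S = 256.0  # assumed unified-memory bandwidth, not a measurement
    BYTES_PER_PARAM = 0.5   # assumed 4-bit quantization
    # Hypothetical (total, active) parameter counts in billions.
    for total_b, active_b in ((100.0, 5.0), (200.0, 30.0)):
        size = weights_gb(total_b, BYTES_PER_PARAM)
        speed = decode_tokens_per_s(active_b, BYTES_PER_PARAM, BANDWIDTH_GB_S)
        print(f"{total_b:.0f}B total / {active_b:.0f}B active: "
              f"~{size:.0f} GB weights, at most ~{speed:.0f} tok/s")
```

Under these assumptions a bigger MoE can fit in memory and still be unpleasant to use, because its larger active set drags the ceiling down.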