Discussion (55 Comments), original on HackerNews
It doesn't stop there though. OpenAI is currently mired in a capital crunch. Their last round just about sucked all the dry powder out of the private markets. Folks are now starting to ask difficult questions about their burn rate and revenue. It is increasingly looking like they might not commit to the purchase order they made which kick-started this whole panic over RAM.
Soo ... how sure are we that the memory makers themselves are not going to be the ones holding the bag?
If they could make this stuff and sell it to regular people a decade ago for very palatable prices, why do they come up with the idea that this is the technology of the gods, unaffordable by mere mortals?
That is, memory capacity is reserved for datacenters yet to be built, but this will do weird things if said datacenter construction is postponed or cancelled altogether.
There’s virtually infinite capital: if needed, more can be reallocated from the federal government (funded with debt), from public companies (funded with people’s retirement funds), from people’s pockets via wealth redistribution upwards, from offshore investment.
They will be allowed to strangle any part of the supply chain they want.
Another point is I often see the money argument - like country X has more money, so they can afford to do more and better R&D, make more stuff.
This stuff comes out of factories, that need to be built, the machinery procured, engineers trained and hired.
> more can be reallocated from the federal government (funded with debt)
While this is the most reliable funding, it's still not very accessible. OpenAI is a money pit, and their demands are growing quickly. The US government has started a bunch of very expensive spending. If OpenAI were to require yearly bundles of its recent "$120B" deal, that's 6% of the US discretionary budget, or 12.5% of the non-military discretionary budget. (And the military is going to ask for a lot more money this year.) Even the idea of just issuing more debt is dubious because they're going to want to do that to pay for the wars that are rapidly spiralling out of control.
None of this is saying that the US government can't or wouldn't pay for it, but it's non-trivial and it's unclear how much Altman can threaten the US government "give me a trillion dollars or the economy explodes" without consequences.
Further deficit spending isn't without its risks for the US government either. Interest rates are already creeping up, and a careless explosion of the deficit may well trigger a debt crisis.
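The 6% and 12.5% figures quoted above can be sanity-checked by working backwards to the budget totals they imply. This is only a back-of-envelope sketch: the inputs are the commenter's own numbers, not official budget data, and the helper name `implied_total` is made up for illustration.

```python
# Back-of-envelope check of the shares quoted in the comment.
# All inputs are the commenter's figures, not official budget data.

def implied_total(amount_bn: float, share: float) -> float:
    """Return the total (in $bn) of which amount_bn is the given share."""
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    return amount_bn / share

deal_bn = 120.0  # the "$120B" deal, treated as a yearly cost

discretionary_bn = implied_total(deal_bn, 0.06)    # 6% of discretionary
non_military_bn = implied_total(deal_bn, 0.125)    # 12.5% of non-military
implied_military_bn = discretionary_bn - non_military_bn

print(f"implied discretionary budget: ${discretionary_bn:,.0f}bn")
print(f"implied non-military budget:  ${non_military_bn:,.0f}bn")
print(f"implied military budget:      ${implied_military_bn:,.0f}bn")
```

Taken at face value, the two percentages imply a discretionary budget of roughly $2.0T, of which about $0.96T is non-military.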
> from public companies (funded with people’s retirement funds)
This would be at great cost. OpenAI would need to open up about its financial performance to go public itself. With its CFO being put on what is effectively Administrative Leave for pushing against going public, we can assume the financials are so catastrophic an IPO might bomb and take the company down with it. Nobody's going to be investing privately in a company that has no public takers.
Getting money through other companies is also running into limits. Big Tech has deep pockets but they've already started slowing down, switching to debt to finance AI investment, and similarly are increasingly pressured by their own shareholders to show results.
> from people’s pockets via wealth redistribution upwards
The practical mechanism of this is "AI companies raise their prices". That might also just crash the bubble if demand evaporates. For all the hype, the productivity benefit hasn't really shown up in economy-wide aggregates. The moment AI becomes "expensive", all the casual users will drop it. And the non-casual users are likely to follow. The idea of "AI tokens" as a job perk is cute, but exceedingly few are going to accept lower salary in order to use AI at their job.
There's simply not much money to take out of people's pockets these days, with how high cost of living has gotten.
> from offshore investment.
This is a pretty good source of money. The wealthy Arabian oil states have very deep slush funds, extensively investing in AI to get ties to US businesses and in the hope of diversifying their resource economies.
...
...
"Was". Was a good source of money.
The real issue is everyone wanting to upgrade to hbm, ddr5, and nvme5 at the same time.
We aren't. The remaining memory manufacturers fear getting caught in a "pork cycle" yet again - that is why there's only the three large ones left anyway.
Oh no!
Given that TurboQuant results in a 6x reduction in memory usage for KV caches and up to 8x boost in speed, this optimization is already showing up in llama.cpp, enabling significantly bigger contexts without having to run a smaller model to fit it all in memory.
Some people thought it might significantly improve the RAM situation, though I remain a bit skeptical - the demand is probably still larger than the reduction turboquant brings.
[0] https://news.ycombinator.com/item?id=47513475
Current "TurboQuant" implementations are about 3.8X-4.9X on compression (w/ the higher end taking some significant hits of GSM8K performance) and with about 80-100% baseline speed (no improvement, regression): https://github.com/vllm-project/vllm/pull/38479
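To put the compression ratios in this thread in concrete terms, here is a rough sizing sketch. It uses the standard estimate for a dense attention KV cache (keys plus values, per layer, per KV head, per token). The model shape (32 layers, 8 KV heads, head dimension 128, fp16, 128k context) is a hypothetical configuration chosen for illustration, not any specific model, and the ratios are simply the ones quoted in the comments.

```python
# Rough KV cache sizing under different compression ratios.
# The model shape below is a hypothetical example, not a real model's config.

def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: float = 2.0) -> float:
    """Estimate KV cache size: 2 tensors (K and V) per layer per token."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

GIB = 1024 ** 3

baseline = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=131072)

# 1.0 = uncompressed fp16; 3.8 and 4.9 are the measured range quoted in the
# thread; 6.0 is the headline claim.
sizes_gib = {ratio: baseline / ratio / GIB for ratio in (1.0, 3.8, 4.9, 6.0)}

for ratio, gib in sizes_gib.items():
    print(f"{ratio:>4.1f}x compression: {gib:6.2f} GiB per 128k-token sequence")
```

Under these assumptions the uncompressed cache is 16 GiB per sequence, so the gap between a 3.8x and a 6x ratio is more than 1.5 GiB per long-context request.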
For those not paying attention, it's probably worth sending this and ongoing discussion for vLLM https://github.com/vllm-project/vllm/issues/38171 and llama.cpp through your summarizer of choice - TurboQuant is fine, but not a magic bullet. Personally, I've been experimenting with DMS and I think it has a lot more promise and can be stacked with various quantization schemes.
The biggest kvcache savings, though, come from improved model architecture. Gemma 4's SWA/global hybrid saves up to 10X kvcache, MLA/DSA (the latter of which helps solve global attention compute) does as well, and using linear, SSM layers saves even more.
None of these reduce memory demand (Jevons paradox, etc), though. Looking at my coding tools, I'm using about 10-15B cached tokens/mo currently (was 5-8B a couple months ago) and while I think I'm probably above average on the curve, I don't consider myself doing anything especially crazy and this year, between mainstream developers, and more and more agents, I don't think there's really any limit to the number of tokens that people will want to consume.
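A small sketch of why a sliding-window/global hybrid shrinks the cache: sliding-window layers only need to keep the last `window` tokens, while global layers keep the whole sequence. The layer split (8 global, 40 local) and the 1024-token window below are hypothetical values for illustration, not Gemma 4's actual configuration.

```python
# Count cached (layer, token) slots for a full-attention model versus a
# sliding-window/global hybrid. Layer counts and window are hypothetical.

def cached_token_slots(seq_len: int, global_layers: int,
                       local_layers: int, window: int) -> int:
    """Global layers cache every token; local layers cache at most `window`."""
    return global_layers * seq_len + local_layers * min(seq_len, window)

seq_len = 131072  # 128k-token context

full = cached_token_slots(seq_len, global_layers=48, local_layers=0, window=0)
hybrid = cached_token_slots(seq_len, global_layers=8, local_layers=40, window=1024)
saving = full / hybrid

print(f"full attention: {full:,} slots")
print(f"hybrid:         {hybrid:,} slots")
print(f"saving:         {saving:.2f}x")
```

With this split the saving approaches total layers divided by global layers (6x here) as the context grows, so a 10x figure would need a smaller share of global layers. At contexts shorter than the window there is no saving at all.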
For example Gemma 4 32B, which you can run on an off-the-shelf laptop, is around the same or even higher intelligence level as the SOTA models from 2 years ago (e.g. gpt-4o). Probably by the time memory prices come down we will have something as smart as Opus 4.7 that can be run locally.
Bigger models of course have more embedded knowledge, but just knowing that they should make a tool call to do a web search can bypass a lot of that.
That is the sad reality of the future of memory.
Given the current tech, I also doubt there will be practical uses and I hope we’ll see the opposite of what I wrote. But given the current industry, I fully trust them to somehow fill their hardware.
Market history shows us that when the cost of something goes down, we do more with the same amount, not the same thing with less. But I deeply hope to be wrong here and the memory market will relax.
I hate to mention Jevons paradox as it has become cliche by now, but this is a textbook example of it.
[0] https://techwireasia.com/2026/04/chinese-memory-chips-ymtc-c...
Assuming China takes TSMC in one piece (unlikely without internal sabotage in the best case scenario), it would still probably take years before it produces another high end GPU or CPU.
We would probably be stuck with the existing inventory of equipment for a long time…
The risk with China taking over Taiwan is that they mostly expedite their own production research by a couple of years.
The lawsuits in the past prove that statement to not be basically but actually.
We have a RAM shortage now, we will have very cheap RAM tomorrow. It’s not like production is bottlenecked by raw materials. Chip companies just need to assess whether the demand from AI companies will last, in which case it’s better to scale up, or whether they should wait it out instead of oversupplying and cutting into their profits.
Think I will scrap my PC and sell its parts.
I wonder if there are any niche companies out there building decent rigs with DDR3 and 5th/6th-generation Intel CPUs. It is cheap and might be a business opportunity?