
Discussion (158 Comments)
Are Anthropic and OpenAI rushing to IPO for immediate cash so they can delay the inevitable? Surely this cycle of robbing Peter to pay Paul to pay John to pay Tim must end.
We are only just now getting a taste of the "true cost" of these tokens. Then there is a lack of compute bottlenecking everything. Even now I'm looking at 7.5x token rates for Opus 4.7.
Open models are promising and cost a fraction of what the proprietary models cost, which leaves the big two vulnerable once companies start to feel the cost of tokens.
Will data centres be built fast enough, and powered sufficiently, to lower the cost of compute and thus of tokens?
Is it just a giant Hail Mary to get to AGI ASAP before the economy collapses?
Above all else, I simply feel the models have plateaued. I am noticing productivity loss for tasks I deem "complex".
This reads to me like Anthropic anticipating demand and making a commitment to acquire supply. Not unlike airlines committing to future jet fuel purchases, or Apple committing to future DRAM volume.
At the current price or real price? Anthropic said a $200 subscription can cost them $5000 so the real price could be anywhere from 10-30x the current price.
In short: per-token charges currently cover maybe 1% of the total costs in this field. To pay ongoing costs, and pay back investors, everyone will need to pay 100x or 1000x the current rates, likely for decades.
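Taking the figures claimed above at face value, the implied subsidy multiple is simple arithmetic. A minimal sketch, using the thread's own (unverified) numbers:

```python
# Back-of-envelope: implied cost multiple on a $200/month subscription.
# Both figures are claims from the thread above, not verified numbers.
subscription_price = 200      # $/month charged
claimed_serving_cost = 5000   # $/month actual cost, per the claim quoted above

multiple = claimed_serving_cost / subscription_price
print(f"Implied cost multiple: {multiple:.0f}x")  # 25x, inside the 10-30x range quoted
```

That 25x sits within the "10-30x" range the commenter gives, though it says nothing about the "100x or 1000x" needed to also repay investors.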
Has there been a ton of hype? Absolutely but the value proposition is getting more and more tangible.
Did some of the AI companies overcommit on spending? I'm sure some did, and they will probably hurt in the long term. I thought Anthropic had been scaling toward profitability on a quick timeline, though.
I think this can keep going for at least another 5 years.
In a system of open-ended growth, yes, you can point to how long the system has persisted as evidence of its longevity. But in a system of plateauing growth, the system's age is an indicator of how close it may be to death. I suspect that the model that permitted the "success" of Uber and Tesla is nearing the end of its lifetime.
Anthropic are scared of open weight models and need to fear-monger towards you to continue paying for their models.
That's the whole point of their 'safety' marketing narrative, the account bans, and Dario playing AI scarecrow, scaremongering the world about nonsense like 'Mythos'.
'Mythos' is already here in the form of open-weight models that also found the same vulnerabilities as Anthropic did.
It feels like these hyperscalers are just raising as much as they can, giving extremely rosy projections, because sooner or later the peak is going to be reached (if that hasn't happened already).
What does "on time" mean? You'll need to negotiate with local authorities, some friendly, some not. Data centers aren't exactly popular neighbors these days. Then negotiate with the local power utility. Fingers crossed the political landscape doesn't shift and your CEO doesn't sign a contract with an army using your product to pick bombing targets, because you'll watch those permits evaporate fast.
Then there's sourcing: CPUs, GPUs, memory, networking. You need all of it. Did you know the lead time for an industrial power transformer is 5+ years? Don't get me started on the water treatment pumps and filters you can't even get permitted without. What will you do in the meantime? You surely aren't gonna get preferential treatment from AWS / Google / ... if they know you are moving away anyway. Your competition will.
The risk and complexity are just too big. AI/LLM is already an incredibly complex and brittle environment with huge competition. Getting distracted building data centers isn't enticing for these companies, it's a death sentence.
You're not wrong about the rest but no AI company would ever build a data center in every continent for this, even if they were prepared to build data centers. AI inference isn't like general purpose hosting.
Large data centers consume as much power as a small city. The location decision is about being able to connect to a power grid that is ready to supply that.
Evaporative cooling also needs steady water supply. There are data centers which don’t operate on evaporative cooling but it’s more equipment intensive and expensive.
Latency doesn’t matter. You can get fast enough internet connected to these sites much more easily than finding power.
* data transit across the world can be very slow when there's network issues (a fiber is cut somewhere, congestion, BGP does its thing, etc). having something more local can mitigate this
* several countries right now have demented leaders with idiotic cult-like followers. Best not to put all your eggs in those baskets.
* wars, earthquakes, fires, floods, and severe weather rarely affect the whole planet at once, but can have rippling effects across a continent.
And frankly, the real question isn't "why spread out the DCs?", it's "what reason is there to put them close to each other?".
Every single argument you've brought up is irrelevant in the face of billions of dollars. If you intend to consume $100 billion dollars in data center infrastructure, you're going to find a way to accomplish it while cutting out the middlemen.
Meanwhile if you're flaky and never intend to spend that money, you're going to come up with a way to pay someone else to deal with those problems and quit paying the moment they don't.
You'd never do both at the same time. You'd never commit your money and give them control over your business critical infrastructure.
Hence the deal is a sham. The $100 billion is a lie. Thank you for telling us.
You can’t even get the hardware at that scale without months or years of order lead time. NVidia doesn’t have warehouses full of compute hardware waiting for someone to come get it.
They also reused an existing building. Basically, they put 100,000 GPUs into a building and attached the necessary infrastructure in about half a year. Impressive, but it’s not the same as a $10B/year data center usage commitment like this deal.
Colossus initially had ~200k GPUs. 100B buys you ~1 million high end GPUs running 24/7 for a year at AWS retail prices.
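The "~1 million GPUs" figure is easy to sanity-check. A rough sketch, where the $12/GPU-hour rate is my assumption (in the ballpark of AWS on-demand H100 pricing, where a p5.48xlarge runs roughly $98/hour for 8 GPUs), not a figure from the thread:

```python
# Rough check of "$100B buys ~1 million high-end GPUs for a year at AWS retail".
# The hourly rate below is an assumed ballpark, not a quoted price.
budget = 100e9                 # dollars
rate_per_gpu_hour = 12.0       # assumed on-demand $/GPU-hour for a high-end GPU
hours_per_year = 24 * 365

gpus = budget / (rate_per_gpu_hour * hours_per_year)
print(f"{gpus:,.0f} GPU-years")  # roughly a million, matching the claim
```

At ~$12/GPU-hour the math comes out just under a million GPU-years, so the order of magnitude holds even if the exact rate is negotiated down.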
If Anthropic/OpenAI miss projections, infra providers can somewhat likely still turn around and sell it to the next guy or use it themselves. If they have more demand than expected (as Anthropic currently does), VCs will throw money at them and they can outbid the competition.
If they built it themselves and missed projections it's a much more expensive mistake
It's just risk sharing. Infra providers take some of the risk and some of the upside
Not if their pricing comes with multiyear commitments for reserved pricing. No doubt they get a huge volume discount, but the advertised AWS reserved pricing is already enough to pay for a whole 8x HX00 pod plus the NVIDIA enterprise license plus the staff to manage it after only a one-year commitment. On-demand pricing is significantly more expensive, so they're going to be boxed in by errors in capacity planning anyway (as has been happening the last few months).
The economics here are absurd unless you’re involved in a giant circular investment scheme to pump up valuations.
It’s common even for smaller companies to do mutually beneficial business with each other. It’s actually helpful to do business with people who are also your customers because you have a relationship with them and you also have leverage: They are extra incentivized to treat you well because they don’t want to upset any of the other business you have with them.
Isn't that almost all that matters when comparing doing something yourself versus paying someone else, in this case Amazon, to do it for you?
> The Anthropic deal specifically covers Trainium2 through Trainium4 chips, even though Trainium4 chips are not currently available. The latest chip, Trainium3, was released in December. On top of that, Anthropic has secured the option to buy capacity on future Amazon chips as they become available.
However, there are certain advantages, like supply chain access, that only established companies have. This is also a commitment to spend up to $100B on an internal approach and research. I would expect them to come up with their own CPU/chip and device design. That shift toward an internal approach might also make Amazon offer better prices later down the line.
If you’re not sure it’s going to blow the socks off, foisting capital investment on partners is a great deal.
See the difference in companies/franchises that always own the land/building and those that always lease.
Why this versus us being in a temporary bottleneck? Like, railroads became expensive to build everywhere in the 19th century not because we reached Earth's capacity for railroads or whatever, but because we were still tooling up the industry needed to produce them at higher scales.
Just a guess.
I do think a ton of businesses would benefit from running their own hardware, but they're not getting five billion dollars to stay on the cloud.
Everybody does right now, right?
But: is it your core competency?
Can your firm afford the distraction?
Meanwhile, revenue-generating work is uncapped on that side of the P&L. So you can either put some engineers on reducing your costs (at most a 100% improvement), or have them work on product ideas that could generate over 9000% more revenue.
Wonder if Anthropic is making a mistake by focusing on "consumer" hardware, and not going super specialized.
Comments like yours add nothing to the discussion.
You can throw money and hardware at a problem, but then someone may come along with a great idea and leapfrog you.
Just consider that all major AI providers now use DeepSeek's ideas for efficient training from that first paper.
edit: I misunderstood, I thought you were implying they designed their own GPUs. nevermind
I distinctly remember reading a big panty-twisting from Sam Altman and co. that the Chinese took their stuff, the stuff OpenAI and co. spent billions to create, and used it as a base for $0.00.
I mostly see their products as commodity at this point, with strong open source contenders.
Eventually it will become hard to justify the premium on these models.
Because, as OpenAI is learning [1], you still need to sell it. The tech giants have a seat at the table mostly because they have distribution down.
[1] https://www.cnbc.com/2026/02/23/open-ai-consulting-accenture...
Now if "fully caught up" means today's level of intelligence is available for free in two years, then by that point that level of intelligence means very little.
The only thing I can see them meaning is what you said, "in a minute the stragglers will be where the leaders were a minute ago", which, yeah, sure.
Play out a scenario. An open source model is released that is as capable as Mythos. Presumably it requires hardware big enough that running it at home is unfeasible. You are imagining that individuals can run it in the cloud themselves for cheaper than API tokens would cost? Or even small companies? And that Anthropic and OpenAI won't be able to cut costs deeper than their competitors while staying profitable?
If it is fundamentally a commodity, that means "running it yourself" also isn't really interesting as a proposition. Many of the world's biggest companies sell commodities. It's a great business to be in if you can sell them cheaper than anyone else.
The value add here isn't the model, it is "having a bunch of compute and using it more efficiently than anyone else".
As the US sold weapons to many nations in the past, so will China, the US, France, etc sell AI cyber capability to other nations. Likely every modern nation will need some datacenter to host a cluster of the preferred vendor, as nobody's going to trust the US or China with their security.
it will be interesting to see it unfold
I have seen this argument made a lot, but llm serving being a commodity makes it _better_ for them not worse.
If it's a commodity, then you are entirely competing on price, and the players that will win on price will be the largest ones, because they can find efficiencies that smaller competitors won't have.
It's actually the small LLM companies that are in trouble if LLM serving commoditizes. They will need to distinguish themselves on features, because they can't compete on price. And even there the big labs will have an advantage.
Tokens will continue to increase in price until the supply meets the demand. That's going to take a while.
This is completely untrue if you use AWS Bedrock, and that applies both to your private data and to a business context. It's one of their core arguments for using the service.
[1] - "...At Amazon, we don’t use your prompts and outputs to train or improve the underlying models in Amazon Bedrock and SageMaker JumpStart (including those from third parties), and humans won’t review them. Also, we don’t share your data with third-party model providers. Your data remains private to you within your AWS accounts..."
[1] - https://aws.amazon.com/blogs/security/securing-generative-ai...
The data isn't the sole point; these offerings are also about bringing in users who will encourage product use in companies and ultimately drive more profitable API adoption within their orgs, plus general diffuse mindshare doing the same.
You can still opt out (except with Google's offering which disables lots of features if you opt out of training).
Here is the thing nobody wants to say out loud, or is too dumb to realize: AI is intelligence, and intelligence has almost never been the binding constraint on productivity.
So you will get no productivity increase from the AI bubble. Yes, you read that correctly.
The test is simple, if raw brainpower were the bottleneck, you could 10x any company by hiring 200 PhDs. In practice you get 200 brilliant people writing unread memos, refactoring things that worked, and forming a committee to rename the committee. Smart has always been cheaper and more abundant than the discourse pretends.
Every real productivity revolution came from somewhere else like energy (steam, electricity), capital stock (machines that do the physical work), or coordination (railroads, shipping containers, the assembly line, the internet).
None of these raised the average IQ of the workforce; they changed what a given worker could move, reach, or coordinate with. Solow's old line basically still holds: output per worker grows when you give the worker better tools and infrastructure, not better neurons.
Meanwhile the actual bottlenecks in a modern firm are regulatory approval, legacy systems, procurement cycles, customer adoption, internal politics, and physical supply chains that don't care how clever your email was. A smart intern at every desk produces more artifacts, not more throughput, and in a lot of organizations, more artifacts is actively negative ROI.
Jevons does not save you either, cheaper cognition mostly means more slide decks, not more GDP.
So the setup is that models are commoditizing on one side, and on the other side a product whose core value add (more intelligence, faster) is aimed at a constraint that was never really binding. That is, of course, a rough combo for a trillion-dollar capex supercycle.
Fun for the trade, while it lasts, but there is no thesis. Just don't tell CNBC, and short NVDA on time ;-)
Granted, LLMs are not even PhDs.
What a weird time we live in...
There's also a very strong Trurl and Klapaucius [1] component to this AI craziness, as in I remember a passage in Lem's The Cyberiad where either Trurl or Klapaucius were "discussing" with an intelligent/AGI robot and asking it for stuff-to-know/information, at which point said AGI robot started literally inundating them with information, paper on top of paper on top of paper of information. At that point it doesn't even matter if that information is correct or smart or whatever, because by that point the very amount of said information has changed everything into a futile endeavour.
[1] https://en.wikipedia.org/wiki/The_Cyberiad
Exactly. We don't use the intelligence we already have! That seems to be the real problem with the "AGI" concept. Given such a capability, we'll just nerf it, gatekeep it, and/or bias it. There's no reason to think we'll actually use it to benefit humanity as a whole. It will be shaped into an instrument to enforce our prejudices.
> Eventually it will become hard to justify the premium on these models.
On the contrary, the model is the moat.
The model represents embodied capital expenditure in the form of training. Training is not free, and it is not a commodity; it is heavily influenced by curation.
Eventually the ever-increasing training expense will reduce the competition to 2-3 participants running cutting edge inference. Nobody else will be able to afford the chips, watts, and warehouse. It's a physics problem - not a lack of will.
If you're a retail user, and a lower-tier model is suitable for your work, you'll have commodity LLM's to help you. Deprecated models running on tired silicon. Corporate surveillance and ad-injection.
But if you're working on high-stakes problems in real time, you're going to want the best money can buy, so you'll concentrate your spend on the cutting-edge products, open API's, a suite of performance monitoring tools and on-the-fly engineering support. And since the cutting edge is highly sought after, it's a seller's market. The cutting edge products buoyed by institutional spend will pull away from the pack. Their performance will far exceed what you're using, because your work isn't important. Hockey stick curve. Haves and Have-Nots.
The economic reality is predetermined by today's physical constraints - paradigm shifting breakthroughs in quantum computing and superconductors could change the calculus but, like atomic fusion power, don't count on it being soon.
I am waiting for that. Perhaps a Taalas-style high-performance custom-hardware LLM coding engine paired with an open-source coding agent, priced like a high-end graphics card that would pay off over time. It would be a replay of the IBM-mainframe-to-PC transition of a previous era.
Same, and I think we're close. "The original 1984 128k Mac model was $2,495, and the 1985 512k Mac was $2,795" [1]. That's $8 to 9 thousand today. About the price of a 32-core, 80-GPU M3 Ultra Mac Studio with 256 GB RAM.
[1] https://blog.codinghorror.com/a-lesson-in-apple-economics/
[2] https://www.bls.gov/data/inflation_calculator.htm
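The inflation adjustment above is easy to reproduce. A quick sketch, where the ~3.05x CPI factor is my approximation of the mid-1980s-to-today multiplier (the BLS calculator linked above gives roughly 3x), not an exact figure:

```python
# Sanity check of the 1984/85 Mac prices in today's dollars.
# cpi_factor is an assumed approximation of the CPI multiplier since the mid-80s.
cpi_factor = 3.05
adjusted = {year: price * cpi_factor
            for year, price in [(1984, 2495), (1985, 2795)]}
for year, today in adjusted.items():
    print(f"{year}: ${today:,.0f} today")  # roughly $7,600 and $8,500
```

That lands a bit under the "$8 to 9 thousand" quoted, but within the error bars of a round CPI factor.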
Anthropic gets access to limited compute resources and Amazon gets demand to justify increased R&D and capex + feedback from the best users in the field.
> Today’s agreement will quickly expand our available capacity, delivering meaningful compute in the next three months and nearly 1GW in total before the end of the year.
They need a bunch of compute, now.
https://www.anthropic.com/news/anthropic-amazon-compute
In my reading, Amazon is giving $5B of usage credits in exchange for shares. If Anthropic works out, it's a good deal for Amazon. If it doesn't, they lose on their invesment sheet, but they got ~ $5B in revenue, so it looks good on their operating sheet. And it helped justify a build out that they can sell to others.
For Anthropic, this lets them operate for more time without having to make numbers work. If Anthropic works out, they'll figure out the $100B commitment later. If it doesn't work out, it's not their problem.
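The asymmetry described in the two paragraphs above can be sketched as a toy ledger. All account names and the outcome logic here are illustrative assumptions about how such a credits-for-equity deal could book, not the deal's actual terms:

```python
# Toy ledger for a credits-for-equity deal (illustrative figures only).
credits_granted = 5e9  # Amazon's usage credits to Anthropic, per the comment above

amazon = {
    "investment_asset": credits_granted,    # shares held on the balance sheet
    "recognized_revenue": credits_granted,  # credits consumed book as AWS revenue
}

# Downside scenario: Anthropic fails. The investment is written down,
# but the revenue already recognized on the operating side stays booked.
amazon["investment_asset"] = 0
print(amazon)
```

The point of the sketch: even in the downside case, the operating statement still shows the ~$5B of usage, which is why the deal "looks good" either way.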
It's probably faster to build up amazon's capacity with amazon's money than to build owned capacity with someone else's money at the scale they're looking to build out.
so basically ...
you could view this as a kind of discount, but instead of paying less later, you get some cash now and then pay full price later.
"Claude I'm evaluating whether I should host my app on AWS or Google Cloud. Provide me with an analysis on my options." "After a detailed analysis, AWS is clearly your better option."
If I am correct (and I hope that I am wrong!) this will drastically increase the cost of building these new data centers.
https://www.aboutamazon.com/news/company-news/amazon-invests...
Yeah, totally not desperately seeking investment to keep the party going ...
Gemma4 being able to run on commodity hardware is, I think, the real win out of this. Pop the bubble. Settle the craziness and the claws. Let scientists and engineers tinker and improve in the background. Hopefully GPUs can be affordable for gaming again, although I'm starting to think that will never happen.
I think when they rack up the RAM prices, they should pay for the damage they've caused here. I don't need AI anywhere, but the increase in RAM prices is annoying me. Thankfully I purchased new RAM for a new computer about 3 years ago, so I can hold out for the most part - but sooner or later I'll have to purchase a new computer, and I really don't see why I should pay more solely because of AI companies and greedy hardware manufacturers. Simple-minded capitalism does not work - I consider this a racket as well as collusion.