ES version is available. Content is displayed in original English for accuracy.
Related: AI's biggest critic has lost the plot - https://news.ycombinator.com/item?id=47934353
Discussion (115 Comments)
It seems he realizes he was wrong about that and has pivoted slowly to, "well, maybe they work sometimes, but the cost isn't justified." Which is a reasonable question! I just find his style of never admitting when he is wrong off-putting, along with the way he presents things as absolute fact when he's guessing like the rest of us. He was right about a lot and wrong about a lot; it's okay to admit that, and I don't think his fan base would care.
Which is to say, it's easy to scapegoat this guy, but I think his approach is not any different from other "opinion piece" bloggers that we all tend to reshare.
[0] - https://www.reddit.com/r/BetterOffline/comments/1p5zv33/why_...
We need better critics of the industry.
1) Nearer-term investment returns on AI businesses and data center build-outs.
2) Claims that LLMs are now (or soon will be) rapidly displacing most/all senior positions in certain high-skill professions (e.g. software engineering, music/film making, etc.), leading to fewer overall jobs for those kinds of workers and mass unemployment.
3) The "Foom" overnight takeoff hypothesis that AI will soon be able to iteratively sustain substantial self-improvement directly yielding profound new fundamental capabilities across infinite generations with no human involvement.
I've never thought that AI isn't already quite useful for some things today, or that no investors will ever make money on AI, or that AI won't displace some workers in some types of jobs, or that using AI isn't already helping accelerate the development of AI. Just that there's been a lot of hype, exaggeration and over-estimation about how much impact, how soon and how broad. There will be a few instances of rapid, large impacts but the majority of it will be slower, more gradual and less disruptive than extreme predictions - and many of the most over-the-top predictions may not ever happen. Not because they can't happen but probably for more mundane economic, logistic and human-factors reasons along the lines of why we're no closer today to the 1950s visions of a flying car in every driveway.
Let's be fair here, the endgame is not "a few hundred bucks a month." Not for how much money has been invested. How much extra you have to spend to make developers how much more productive, and whether companies will go along with it, is the trillion dollar question.
Over a few centuries better tools and technology made it so that <5% of the population in rich countries are farmers. They use tools like million dollar harvesters.
How many tokens can you realistically burn through in one chat session? Opus and many other frontier models do maybe 60tok/s, so less than 250k/hr out. On the input side you can use more, but in most cases cached input is 5-10x cheaper than new input. Say you average 500ktok in, 90% cache, per request. That amounts to 100-150ktok in new input-equivalent costs, which in most cases is ~20-30ktok in output-equivalent costs. Do a request every minute, that's a total of about 1.5-2Mtok/hr. At API prices that's $50/hr for Opus, but really it probably only costs Anthropic $10/hr to serve that.
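The estimate above can be reproduced as a rough sketch. The defaults come from the comment's own figures, with three assumptions flagged: the 7.5x cache discount is simply the midpoint of the stated 5-10x range, the 5:1 output-to-input price ratio is what the comment's conversion from 100-150ktok to 20-30ktok implies, and the $25 per million output tokens is back-derived from "$50/hr at about 2Mtok/hr" rather than taken from any price list.

```python
# Back-of-envelope hourly token burn for one heavy chat/agent session.
# All defaults are the comment's rough figures or assumptions derived
# from them; none of them are published prices.

def output_equivalent_tokens_per_hour(
    tokens_in_per_request: float = 500_000,  # average input per request
    cache_hit_rate: float = 0.90,            # share of input served from cache
    cache_discount: float = 7.5,             # assumed midpoint of "5-10x cheaper"
    output_to_input_price: float = 5.0,      # assumed: output priced ~5x new input
    requests_per_hour: int = 60,             # one request per minute
    output_tokens_per_second: float = 60.0,  # generation speed
) -> float:
    """Return hourly usage expressed in output-token-equivalents."""
    fresh = tokens_in_per_request * (1 - cache_hit_rate)
    cached = tokens_in_per_request * cache_hit_rate / cache_discount
    input_equiv = fresh + cached                     # in new-input tokens
    per_request = input_equiv / output_to_input_price
    generated = output_tokens_per_second * 3600      # ceiling on real output
    return per_request * requests_per_hour + generated


def hourly_cost(tokens: float, price_per_mtok_output: float = 25.0) -> float:
    """Dollar cost per hour; the default price is an assumption."""
    return tokens / 1_000_000 * price_per_mtok_output


tokens = output_equivalent_tokens_per_hour()
print(f"{tokens / 1e6:.2f} Mtok/hr, about ${hourly_cost(tokens):.2f}/hr")
```

With these midpoint defaults the result lands near the low end of the 1.5-2Mtok/hr range; pushing the cache discount toward 5x moves it toward the top of the range and the $50/hr figure.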
That said, even if a developer is burning $50/hr, many, many employees at large companies cost more than $100k/yr to employ all costs considered, so making them say 20-30% more productive can easily make that worth it for most. If the labs shave their margins ultimately to more like 20-30%, you'd have ~$15/hr in costs to use the services, and nearly every white collar job is way over 30k/yr to employ. If your salary is 80k, you probably cost the company 200k all in, so making you 15% more productive offsets the $15/hr cost.
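The break-even claim at the end of that comment can be checked with a one-line formula. This is a minimal sketch: the function name is hypothetical, and the 2,000 billable hours per year is an assumption (it is what makes $15/hr equal 15% of $200k), with the tool assumed to run for every one of those hours.

```python
# Productivity gain needed for an AI tool to pay for itself.
# hours_per_year=2000 is an assumption (roughly 50 weeks x 40 hours).

def breakeven_productivity_gain(
    hourly_tool_cost: float,
    fully_loaded_annual_cost: float,
    hours_per_year: float = 2000.0,
) -> float:
    """Fraction of an employee's output the tool must add to cover its cost."""
    return hourly_tool_cost * hours_per_year / fully_loaded_annual_cost


# The comment's example: $15/hr of usage for an employee costing $200k all in.
print(breakeven_productivity_gain(15, 200_000))   # 0.15
# The same employee at the $50/hr API-price figure.
print(breakeven_productivity_gain(50, 200_000))   # 0.5
```

The required gain scales linearly with the hourly cost, so the conclusion is sensitive to how many hours per year the tool actually runs at that rate.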
So first party providers are not in a horrifying position or anything from a subsidization standpoint. The people in bad shape are Cursor and Perplexity, who don't have frontier models and are dependent on the open source community, which is typically 6-12 months behind the frontier. They have to pay full freight API costs at 80% margin for the big boys to serve their harnesses, which is indeed untenable, and they'll have to either force users onto open source and/or in-house models they can serve at cost, or charge vastly more.
Gemini, Claude, and ChatGPT first-party services like Antigravity, Codex, and Claude Code are not in serious trouble though.
This all becomes extremely visible when trying to do agentic coding with local language models - you quickly realize that controlling context length and model size is just as important as avoiding wasted effort. The real scam is not AI Q&A à la ChatGPT, that's actually quite viable - though marginally less so as conversations grow longer. It's agentic coding with SOTA models and huge contexts.
You can look at: https://sebastianraschka.com/llm-architecture-gallery/ and see how much things have changed.
I've used single digit billions in a couple days, FWIW.
This seems to be the lynchpin of your argument.
It makes me wonder if I have been living under a rock, because I have never heard of frontier labs making money. AFAIK all AI firms are simply burning money to acquire customers at this stage. Is this wrong?
You're confusing the profit from the marginal token and overall profit (basically gross margin and operating margin). The comment you're replying to is calculating that AI labs are probably making a substantial profit per paid token. It's just that so far that profit has not been able to overcome the ongoing R&D and capex costs.
And the cost of not-quite-paid tokens.
I do not understand how the companies can end up in the positive unless something fundamental changes.
do you think per token prices will go up or down in the long term? will the price per task trend down or up?
what about the price of human labor?
Prices going up or down depends on what labs decide and what users demand. Strong models being profitable at lower prices than what frontier labs offer is a fact.
What seems to actually be happening for white collar workers is that the price they can charge for their labor is dropping, but the price of their expenses (housing, food, gas) continues to rise.
Nobody including the connected article is making the argument that this cannot be profitable ever. People are saying "there is no way this admittedly quite interesting tool is going to be able to make back all of this money" and I think they are completely right to say that.
You can absolutely make money with this stuff, just not at this scale. The buildout for this shit has been certifiably crazy and a number of the involved firms are overleveraged for tens and even hundreds of billions of dollars.
How in the sweet fuck are you paying that off, plus giving investors dividends, selling this at $15/hour/user??? That math does not math. A quick google says there are between 1.5 and 4.4 million developers in the US alone, let's say it's 5 million, to be generous, and each of them is subbed to this for 8 hours per day, continuously. That's 600 million per year in revenue. If you took ALL that revenue, and put it towards paying down this debt, not leaving any for employee salaries, upkeep, ongoing development, it would take DECADES to pay down what OpenAI already owes.
And yes I'm sticking directly to code, because that's the only thing I've seen it be really good at. Are we really proposing that every knowledge worker on earth and every manager of such workers is going to have an autonomous agent running all the time!? To do what, make sure they don't have to read or write email? Which even just that example is bringing in a fucking mess of legal, compliance, and security violations because LLMs are not intelligent and are not capable of being properly secured.
Like I'm sorry, I cannot take this industry seriously when even the most basic back-of-napkin math is saying, nay, screaming from the rooftops that they are FUCKED.
That math is not mathing. $15/hour/user, with 5M devs, 8hrs and 240 working days per year that is 144B in revenue.
Of course people don't work every day, but even with European-level holidays that number is off by a factor of 240 or so.
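The correction is plain multiplication, sketched here with the figures from the two comments (the variable names are hypothetical). The $600 million figure is what the inputs produce per day, not per year.

```python
# Revenue arithmetic from the thread: $15/hour/user, 5M developers,
# 8 hours a day, 240 working days a year.

PRICE_PER_HOUR = 15          # dollars per user-hour
DEVELOPERS = 5_000_000       # deliberately generous US developer count
HOURS_PER_DAY = 8
WORKING_DAYS_PER_YEAR = 240

revenue_per_day = PRICE_PER_HOUR * DEVELOPERS * HOURS_PER_DAY
revenue_per_year = revenue_per_day * WORKING_DAYS_PER_YEAR

print(f"per day:  ${revenue_per_day:,}")    # $600,000,000
print(f"per year: ${revenue_per_year:,}")   # $144,000,000,000
print(f"ratio:    {revenue_per_year // revenue_per_day}x")  # 240x
```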
That still feels incredibly optimistic given how split the community at large seems to be about how good this tech is, and it assumes all those developers also all work for firms large enough to pay for all of that.
However we are still very much in back of napkin math. We haven't even gone into what it costs to provide these services, how much it's going to cost yet for all these datacenters to be built, how much electricity and water they're going to rip through, their own employees and basic overhead, and all the rest. So IMO, we've now elevated it from "hopeless" to "this could work if a whole lot of other things line up really well."
According to your math, that's $600 million per day
I just don't think that LLM business models can survive the allure of advertising dollars, any more than Search could, or TV, or Radio, or Movies. Ignoring the talk of copilot putting ads into pull requests, there is just no way that publicly hosted LLMs will not end up inserting ads into the output.
This looks like what I remember. https://freakonomics.com/podcast/is-google-getting-worse/
More seriously for software engineering it’ll just cost a lot.
> On an economic basis, a monthly subscription only makes sense with relatively static costs.
Running a data center is a fixed expense. Whether or not people use that data center to its capacity doesn't change how much the operator pays (electricity use factors into this, since a GPU running at 100% will use more watts than an idle one, but it doesn't move the needle much on other fixed and variable costs of a data center).
> They also assumed, I imagine, that the cost of tokens would come down over time, versus what actually happened — while prices for some models might have come down, newer “reasoning” models burn way more tokens, which means the cost of inference has, somehow, gotten higher over time.
This is backwards. When the cost of something goes down, people use it more. This is basic supply and demand. Inference has gotten cheaper already, and will continue to do so.
Companies subsidizing costs for growth happens all the time. Yes, switching to usage-based pricing instead of subscriptions sucks for customers, but enterprises will continue to pay.
I wonder what the rough costs of a data center look like over the lifetime of one GPU generation?
10% building
60% GPU
30% power
I haven't gone looking for that information, but I haven't run across it either.
Doubtless some people will reduce usage as a result. But Ed seems to find the idea that a 10 man developer team might spend 80K a year on tokens ridiculous. I don't understand this. Has he seen how much developers are paid? If you get a 20% productivity boost from coding agents, then that's two developers for 80K - effectively very good value.
Where things could go wrong is in comparison to cheaper models. If it's 5K a year for Qwen, and it's 2/3 as good will you pay 75K extra for Opus? Perhaps not.
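One way to frame the Opus-versus-Qwen question is the marginal cost of the extra productivity. The sketch below uses the figures from the two comments above and makes one loud assumption: "2/3 as good" is read as delivering 2/3 of the 20% productivity boost. Other readings are possible and would change the result.

```python
# Marginal cost of the extra output bought by the pricier model.
# Assumption: "2/3 as good" means 2/3 of the productivity boost.

TEAM_SIZE = 10
FRONTIER_COST = 80_000      # dollars per year in tokens, from the comment
FRONTIER_BOOST = 0.20       # 20% productivity boost
CHEAP_COST = 5_000          # dollars per year, from the comment
CHEAP_BOOST = FRONTIER_BOOST * 2 / 3

frontier_dev_equiv = TEAM_SIZE * FRONTIER_BOOST   # 2.0 developer-equivalents
cheap_dev_equiv = TEAM_SIZE * CHEAP_BOOST         # about 1.33

extra_cost = FRONTIER_COST - CHEAP_COST
extra_dev_equiv = frontier_dev_equiv - cheap_dev_equiv
marginal_cost = extra_cost / extra_dev_equiv

print(f"frontier: ${FRONTIER_COST / frontier_dev_equiv:,.0f} per developer-equivalent")
print(f"cheap:    ${CHEAP_COST / cheap_dev_equiv:,.0f} per developer-equivalent")
print(f"marginal: ${marginal_cost:,.0f} per extra developer-equivalent")
```

Under this reading the frontier model still buys output at $40,000 per developer-equivalent on average, but the last two-thirds of a developer costs far more than the first one and a third.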
I pray this happens soon, but I feel I've been hearing some version of it for a while.
This tech has uses. It has quite a lot of them in fact. However there is no usage of ChatGPT or Claude that makes OpenAI or Anthropic worth anything fucking close to what they're valued at right now, and both firms are scrambling to figure out how to get down from the top of the AI house of cards without detonating in the process.
Meanwhile DeepSeek is coming out with more capable models that run on far less onerous hardware, with far lower compute requirements, and that do basically exactly what the vast majority of users actually want them to do.
This is going to be a financial bloodbath. Not for anyone actually responsible for it, of course, they'll be fine. It'll be everyone else getting soaked which is the only reason I give two shits.
Also, I didn't read this whole thing, but I have yet to see Zitron respond to the strongest AI financials claim, which is that the models themselves are profitable on a life-cycle basis, even if the companies are not profitable on an annual basis due to capital expenditure. Dario made this claim exactly, and it more or less blows all of Zitron's financials arguments up.
He does in this [0] article.
[0] https://www.wheresyoured.at/ai-is-really-weird/
The TL;DR is that Dario likes to talk about imaginary/hypothetical companies a lot in interviews, and those companies' financials don't have a direct basis in reality.
Until they file an S1 to go public and show the world the books, take everything they say with a grain of salt. The amount of financial engineering going on in this space is astounding, and I'll believe it when I see an objective 3rd party release an audit confirming this claim.
Is this an actual issue aside from people letting their autonomous agents run overnight?
> Don't attribute to malice what can be attributed to incompetence.
We're currently used to SAAS billing models that are either all-you-can-eat subscriptions, or metered around some easy-to-understand metric like # of users, or otherwise number of gigabytes consumed.
The SAAS economics work that way because the compute consumed is typically too cheap to meter. Some customer uses a little more than average, some customer uses a little less than average; it's not worth the time to even it out to the penny.
AI is so darn CPU (GPU? AIPU?) intensive that it will only be profitable, and affordable, if it can be metered like electricity and billed with a small margin.
In SAAS, we're not used to metering billing computations this way.
They went from GPT-2, a text-only model with goldfish-esque memory and an 8th grade reading level, to what we have today: GPT-5, with multimodality, a token window encompassing an encyclopedia, and a Master's/Doctorate level of mastery in major subjects.
The economics are probably betting on this exponential growth continuing; if it fails, the cash simply burns.
EZ might have incautiously and incorrectly called the peak several times, but his newsletter is nearly always stacked with citations and insights that, at least to my cursory but frequent inspection, pan out.
His argument(s) have evolved over time, but what of it? That just shows he's not the dogmatist the author wants him to be. Discourse evolves, get over it.
2026 Zitron has a good sense of the scale at which AI is requiring enormous financial complexity and volume to realize, and his basic point is that it isn't sustainable in the medium term.
He is self-evidently correct.
I disagree. It really reads as if the conclusion is fixed and the arguments change as they are disproven.
I'm sorry, but telling me that this is what AI can do is a sad state of affairs. Like, this is Google-level stuff.
It's interesting to compare it to electricity. Basically Anthropic was selling a flat fee electricity subscription, and when someone started connecting expensive washing machines (OpenClaw) to their subscriptions, instead of changing the pricing model, they banned washing machines...
I wonder if we will get to "electricity" style pricing for AI. What makes electricity predictable is relatively constant average usage over time plus a manageable price. I simply don't buy electric house heating, and so I keep my electricity spending within some bounds.
With AI the problem is that we are only now getting to useful AI, and for now it's still too expensive to be useful, so they subsidize until they can stabilize at a "cheap enough and smart enough" level. But it feels like that's still 2 years away, while they are ending the subsidies now. Will be interesting.
No? It was flat, but with ambiguously stated limits (e.g. 5x, 10x, 20x). They were discriminating on how the "electricity" was used, but that's not that much different from how power companies have different rates for residential users vs industrial users.
The internet seems to be saying that 70%+ of Anthropic revenue is per-token metered API, which would largely invalidate the article, but I can't find a solid source.
Customer: “I don’t want to pay more than $100/mo for my website” Developer: “What are your goals?” Customer: “1M daily visits, 1,000 monthly signups.”
And we've spent the past 25 years offering serverless compute, auto-scaling, pay-as-you-go for AWS and Internet infrastructure. And the economics are still a hard sell.
I am not sure if you would call claude code "an auto loop", but you don't need to be running something crazy like gas town to spend a lot of tokens with Claude.
If something is cheaper than alternatives, spending patterns change. People subsidize corn or power and so consumers alter behavior to take advantage of those prices.
1) They're lying
2) Status signalling
Because, comparing vs GPUs
~16k–17k tokens/second per user
<1ms latency
10x power efficiency
20x cheaper production
Model to Si ~ 60 to 90 days
We have every reason to believe SW_to_Si will facilitate improving economics
[0]: https://www.wheresyoured.at/why-are-we-still-doing-this/
A $20 subscription 2 years ago is not providing the same level of intelligence you're getting today.
Every major lab knows open source models are 6 months behind (See Google's "We have no moat") and none of them plan to make money on inference. Companies are subsidizing users to create moats that persist when models are essentially free for most everyday use.
That subscription was then and is now likely still subsidized.
I soured on him when he could not calculate cumulative revenue on an exponential curve, ignored everyone who showed him how to calculate it, and then kept writing that Anthropic’s revenue numbers are fake based on his inability to do math.
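For readers unfamiliar with the calculation being referred to: when an annualized run-rate grows exponentially, the revenue actually collected over the period is the integral of that curve, which is well below the ending run-rate. The sketch below is a generic illustration; the function name and the example figures are hypothetical, and the comment does not say which numbers were in dispute.

```python
import math

# Cumulative revenue when the annualized run-rate grows exponentially
# from start_run_rate to end_run_rate over `years`.
# Run-rate: r(t) = R0 * (R1 / R0) ** (t / T)
# Integral over [0, T]: T * (R1 - R0) / ln(R1 / R0)

def cumulative_revenue(start_run_rate: float, end_run_rate: float,
                       years: float = 1.0) -> float:
    if start_run_rate <= 0 or end_run_rate <= 0:
        raise ValueError("run-rates must be positive")
    if start_run_rate == end_run_rate:
        return start_run_rate * years          # flat curve, no growth
    growth = math.log(end_run_rate / start_run_rate)
    return years * (end_run_rate - start_run_rate) / growth


# Hypothetical example: run-rate grows from $1B to $10B over one year.
total = cumulative_revenue(1.0, 10.0)
print(f"collected over the year: ${total:.2f}B")
```

In the hypothetical example the amount collected is well under half of the year-end run-rate, which is why run-rate figures and recognized revenue can look inconsistent without either being fake.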
It’s too bad because any heavily hyped industry needs good critics (think Ida Tarbell to Rockefeller) but they should be honest critics, and he’s not, which really undermines not only his but others’ criticism of the industry.
Economics Don't Make Sense.
I mean, seriously... our current late-stage capitalist economy is the chaotic sloshing of excess capital or inverted debt in a shallow tub within which clumsy giants are stamping like toddlers, and a parasitic kleptocratic oligarch class balances its efforts biting the toddler ankles in hope of more stamping judged advantageous, and, bagging what water they can.
The truth is that the AI companies are gambling that inference cost will continue following a hyper version of Moore's Law, e.g. Google TurboQuant.
The countervailing thesis is that frontier models are consuming more and more compute.
The deepest truth: you often don't need a frontier model to get commercially acceptable results from AI. Thus, bring on the true pricing! and I'll just switch models to something financially sustainable.