
Discussion (115 Comments) | Read Original on HackerNews
IMO Claude, ChatGPT/Codex, etc. should be able to optimize the PDF use case to be extremely token efficient, as it's a very obvious use case. But when I start to explain to my wife/friends why it burns through so much quota, I find myself thinking "why should they have to understand this aspect of it". To me, the fact that the details of PDF parsing and extraction are relevant to users (instead of solved such that you don't have to pay attention to them) shows how these tools are not nearly as "ready" as they are made out to be. I may be preaching to the choir on this one, but just my 2c
Source: my last job, working with accessibility and that nightmare.
Gemma 4 works perfectly well offline on limited hardware (I have an 8GB video card) and can handle extracting text from image-based PDFs just fine.
Take a PDF -> run it through MarkItDown [1], using the OCR plugin if you need (point it to Gemma 4) -> now you can ask Gemma 4 questions about the (markdown) document.
I am sure Gemma 4 could even create a GUI to make this process very simple for a non technical user.
[1] https://github.com/microsoft/markitdown
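A minimal sketch of the PDF -> Markdown -> question workflow described above, assuming the `markitdown` Python package is installed. The helper names `pdf_to_markdown` and `build_prompt`, the prompt wording, and the `max_chars` cutoff are hypothetical choices for illustration. The OCR plugin setup and the call to a local Gemma model are deliberately left out, since they depend on the local runtime.

```python
def pdf_to_markdown(pdf_path: str) -> str:
    """Convert a PDF to Markdown text using MarkItDown.

    The import is done lazily so the rest of this sketch can be used
    (and tested) without the package installed.
    """
    from markitdown import MarkItDown

    converter = MarkItDown()
    result = converter.convert(pdf_path)
    return result.text_content


def build_prompt(markdown_text: str, question: str, max_chars: int = 20000) -> str:
    """Build a plain-text prompt for a local model (hypothetical format).

    Only the first max_chars characters of the document are included,
    a crude way to keep the prompt inside a small context window.
    """
    excerpt = markdown_text[:max_chars]
    return (
        "Answer using only the document below.\n\n"
        f"<document>\n{excerpt}\n</document>\n\n"
        f"Question: {question}"
    )
```

Typical use would be `build_prompt(pdf_to_markdown("report.pdf"), "What is the total?")`, with the resulting string sent to whatever local model runner is available. Sending Markdown text rather than the raw PDF is the point of the workflow: the model only sees the extracted text.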
This and replies to this are surreal. It's like everyone simultaneously decided to forget that you don't need claude or whatever to read a PDF. The document is literally made for you to read...
This workflow is highly optimized.
This discussion was about measures, goals and incentives. Follow the incentives.
You can rack up token consumption extremely quickly when you embed LLMs into automated processes or products.
I'd be very surprised if these numbers are just typical coding usage with no scripting/pipeline/automation stuff
Using AI to suddenly deliver massive amounts of code without questioning the requirements
Who could possibly have predicted that happening?
Predictably, everyone started talking in Slack like their jobs depended on it. Everyone was responding to everything. Instead of writing out a complete message and pressing enter, they'd send each fragment of the sentence as a new line.
The Slack leaderboard was never shown again. Unfortunately the habit remained because people were afraid they were going to be secretly judged by how much Slack activity they generated.
I expect the same thing is going to happen at companies who had token leaderboards. Once you've instilled that fear in people, they internalize the expectation.
Insanity
It's surprising how often this principle is applicable.
No amount of "this isn't used for anything" will change that. It's inherent in human nature in the 21st century to believe any and all metrics will be used against them, and therefore must be gamed.
It's why you also have to set UNBELIEVABLY clear goals and have incentives tied to those goals. Incentives meaning money. If you want to measure things, measure them. But have clear, consistent, and meaningful goals tied to bonuses or something if you want a thing done correctly.
The answer is simpler on the surface: focus.
Generally the problem is the larger the firm’s operations, the harder it is to focus.
Apple is the only firm that has done well on this consistently and doesn’t have a huge graveyard of failures to show for it.
But yeah, it's like they've never actually met human beings...
> Oh wow! If I paid for this myself I would have spent a lot of money! Are other people spending as much as me? I’m going to create a leaderboard!
> Oh no, my misinformed manager is using the leaderboard as a sleight of hand for work. I need to game this now.
Then the leaderboard is banned… I can’t see how this ever really goes up the chain beyond director.
Charles Goodhart :-)
He started getting drastically more serious about AI in 2022 and 2023, and he has nothing to show for it.
Heck, he could have rented GPUs the way Elon did at this point and either mended the bleeding or stopped it, not sure how many he has, but it beats losing this badly.
If he doesn't wake up and learn how to business, I suspect he will lose his empire he's built up for himself.
"Meta building cloud business to sell excess AI capacity, Bloomberg News reports Meta building cloud business to sell excess AI capacity, Bloomberg News reports"
https://www.reuters.com/business/meta-sell-excess-ai-computi...
Everyone except the executives who get paid millions to predict exactly that.
It's a hard job, someone has to not pay consequences for bad decisions.
people who make it to managers tend to have bozo tendencies & are yes men.
before it was lines of code, Jira tickets closed. Now it's tokens spent.
Enjoy it while you can, because it won’t last forever. Per-token billing is quite eye opening in terms of how much it can cost
Just wonder what happens when more and more companies introduce similar restrictions. Will that lead to devaluations of the LLM companies?
How?
This is an org pushing thousands of PRs a day. How do you solve the attribution problem for any one engineer's work given some set of impact metrics?
And keep in mind, most common impact metrics are trailing indicators, often over relatively long time horizons.
It wants to see faster R&D, higher revenues from existing assets, greater operating margins, higher sales to invested capital ratio and so on…
The best way to measure that for a software firm is uptime of services, usage, and project completion duration.
If so, your metric cannot distinguish between a bad engineer and a good one.
If not, you have the same problem you started with: measuring contributions to “uptime”.
This is also not easy. In particular proactively preventing bugs is not rewarded
The main way I think you can proactively prevent bugs in a meaningful way is by crafting and propagating better architecture.
Better (or worse) architecture and adoption of it can be measured through a mix of quantitative and qualitative means so those metrics could be used to evaluate the impact of the engineer driving that architecture.
The engineer who haphazardly launched on Friday then promptly saved the team at 3am and worked the weekends gets the promotion, while the one who prevented a bug from happening "didn't get anything done" and gets the PIP.
When shit just works for months or years no one is going to come and praise you for stuff you did a while back.
You are better off breaking stuff and then fixing it to show how useful you are.
I could believe it, but I'd want to see something a little more concrete.
The subscriptions are for personal use not enterprise.
i.e. [1] "This article is about paid Max plans for individual consumers. If you're part of an organization looking to use Claude with your team, refer to Team and Enterprise Plans."
[1]: https://support.claude.com/en/articles/11049741-what-is-the-...
Meta sounds like a cluster-F of a place to work. Massive reorgs around wild ideas like the metaverse and everything AI all the time. Employees terrified of being fired. Incentivizing token spending and then cutting it off. While the overall company may be fine, the dev department sounds rudderless and absolutely miserable.
Just a pristine comment section yap.
The times I’ve been asked to evaluate a prospective candidate and I see that product on their résumé, it’s been an instant veto, in the same category as working at Palantir.
Having a speed limit does not imply the utility of driving is zero.
it's not that difficult to say it confidently if you use any of their services and applications because exactly nothing has changed.
For reference most labor productivity increases for the last 50 years amounted to about 2% per year. If a hypothetical FB engineer had doubled their productivity with their gazillion tokens that would be 30 years of productivity gains in one year. I'd wager the evidence would be quite evident if you opened any of their apps
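A quick check of the arithmetic in the comment above, as a sketch: it assumes the 2% annual growth compounds, which the comment does not state. Under that assumption, doubling takes closer to 35 years than 30, and 30 years of 2% growth gives a factor of about 1.8. The function name `years_to_multiply` is made up for illustration.

```python
import math


def years_to_multiply(factor: float, annual_growth: float) -> float:
    """Years of compound growth at annual_growth needed to grow by factor."""
    return math.log(factor) / math.log(1.0 + annual_growth)


# Years of 2% compound growth needed to double productivity.
doubling_years = years_to_multiply(2.0, 0.02)

# Total growth factor after 30 years at 2% per year.
growth_after_30_years = 1.02 ** 30

print(f"Years to double at 2%/yr: {doubling_years:.1f}")
print(f"Growth factor after 30 years: {growth_after_30_years:.2f}")
```

Either way the order of magnitude matches the comment: a doubling in one year would be several decades of typical gains.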
employees consumed 73.7 trillion tokens in roughly 30 days, a figure tracked on an internal leaderboard called "Claudeonomics", a reference to Anthropic's Claude, one of the third-party AI tools widely used inside the company [2]. The leaderboard, which ranked employees and teams by token consumption, inadvertently incentivized usage volume over productive output.
Meta plans to dismantle the leaderboard and replace it with a centralized monitoring platform called "AI Gateway," which will track usage and spending across teams in real time [2]."
This seems to be an interesting upcoming business, that is:
Helping companies centralize and track their AI usage by employee.
Anyway, great article!
I'd argue most of the AI value is related to how 'Dead' the internet is.
Ultimately the spend on tokens has to benefit the firm financially or it won’t continue spending on it.
"Now he tells me!"
bonk
Various discussions:
Meta’s chaotic AI strategy
https://news.ycombinator.com/item?id=48523271
Companies rein in AI usage as costs strain budgets
https://news.ycombinator.com/item?id=48602571
Meta CTO Andrew Bosworth Admits the Company's AI Reorg Was 'Atrocious'
https://news.ycombinator.com/item?id=48548461
Tokenmaxxing is dead, long live tokenmaxxing
https://news.ycombinator.com/item?id=48708795