Discussion (12 Comments)
This is such an odd and illogical conclusion. If a smaller model can be sufficient (which is not something I would have said), that smaller model can be run in a datacenter. The idea that a small model running at home is 'sipping' while that same small model in a datacenter is 'slurping' is absurd. The datacenter will have much greater overall efficiency in both power usage and total cost to implement. Of course if you compare a small home model to a DC frontier model the power usage is different, but so is the output.
There are huge hidden costs in datacenter prices that are simply unnecessary for most casual users of compute. Salaries of staff to maintain datacenters, nine-9s redundancy and high availability that most customers simply don't need, and real estate costs are all non-existent in a homelab setup, because those are living costs you pay anyway, with or without a home server.
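A rough way to weigh the two framings is a back-of-envelope cost-per-hour comparison. Every number below (hardware price, power draw, electricity rate, cloud GPU rate) is an assumption made up for the sketch, not a figure from the thread.

    # Back-of-envelope: homelab vs. rented datacenter GPU, cost per hour of use.
    # All numbers are illustrative assumptions, not measured figures.

    homelab_hw_cost = 2000.0          # assumed one-time cost of a GPU box (USD)
    hw_lifetime_hours = 3 * 365 * 24  # amortize over roughly 3 years
    power_kw = 0.4                    # assumed average draw while serving a small model
    electricity_rate = 0.30           # assumed USD per kWh

    homelab_per_hour = homelab_hw_cost / hw_lifetime_hours + power_kw * electricity_rate

    cloud_gpu_per_hour = 2.0          # assumed on-demand rate for a comparable GPU (USD/h)

    print(f"homelab ~${homelab_per_hour:.2f}/h vs cloud ~${cloud_gpu_per_hour:.2f}/h")

Which side comes out ahead mostly depends on utilization: the homelab figure spreads the hardware cost over the box's whole lifetime whether or not it is in use, while the cloud rate is only paid for hours actually used.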
This quickly breaks down when you're talking about large models that need terabytes of memory to run[1]. There's no way you're going to be able to amortize that for a single person (a rough sketch of the amortization follows below).
[1] https://apxml.com/models/glm-51
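To make the amortization point concrete, here is a minimal sketch with purely assumed figures; the memory footprint, hardware cost, and personal usage hours are illustrative and not taken from the linked page.

    # Back-of-envelope: amortizing terabyte-scale inference hardware for one user.
    # All figures below are assumptions for illustration only.

    memory_needed_tb = 1.5           # assumed memory footprint of a very large model
    hw_cost_per_tb = 20_000.0        # assumed cost of server-class memory + hosts, USD/TB
    hw_cost = memory_needed_tb * hw_cost_per_tb

    lifetime_years = 4
    personal_use_hours_per_day = 2   # assumed actual usage by a single person
    used_hours = lifetime_years * 365 * personal_use_hours_per_day

    cost_per_used_hour = hw_cost / used_hours
    print(f"~${cost_per_used_hour:.2f} per hour of personal use")

A datacenter can spread the same hardware cost across many tenants around the clock, which is the amortization gap the comment is pointing at.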
I agree that data centres will be set up to be more efficient, but we're also going to need fewer of them if local LLMs take off. If that's true, overbuilt data centres become more revenue pressure for AI companies.
The economy of scale that data centers have is actually a good thing economically and environmentally for many kinds of demand.
I think that the most capable models will continue to be in high demand across the market until at least "a datacenter of PhDs" level of capability. At that point I can see a transition to more local model use, if affordable consumer hardware is available (for the median human on Earth). If that turns out to be true, then the hyperscaling will plateau at the level that sustains commercial/industrial "PhD"-level demand, which we haven't reached yet (all providers are still struggling to meet current demand).
All that said, I actually don't think that matters much. I think we are dragging attention-economy concepts into AI responses, and it doesn't matter. Both options saved me hours per week, and the difference between 3 and 1 minute may not be worth the additional cost.
Also, there are times when the model output is much better with Anthropic, but it's not all the time. I think it becomes a question of whether we should be using the best model for every question.
And I wonder whether the subscription model is just a way to create demand for the API. For example, I'm building this portal with the support of an LLM for coding, but then I will need an LLM accessed via an API token to run the platform, giving them additional revenue, a demand that did not exist without the coding I did under the subscription.
I get it, you may not work in this industry or know how an AI company seeking frontier AGI WOULD operate, but it's helpful to connect ideas and concepts by adding a proposed solution, if for nothing more than to show the direction of your thinking.
Sure, some people may talk smack about your idea, but I've learned that someone who complains for the sake of complaining and someone who complains to fix things have different forms of thinking. The latter may be wrong, but it's an indicator of HOW that person thinks, which is always valuable.
Thanks for the blog.