Discussion (10 Comments). Read original on Hacker News.
Oh no not this again. Using domain specific models for critical things like healthcare is probably the worst thing you can do. There's this stupid notion that you can just lean into AI slightly without committing and you can pay 10% and get 10% of capabilities - like just for healthcare, just for agriculture. That's not how it works.
I hope they don't cheap out and force industries to use some cute domain model on this Cerebras thing - this is the last thing India needs. India should partner with proper AI companies instead of half-assing here. If I could do something about it, I would recommend going all in on proper data centers and encouraging hosting companies (that may host open models like DeepSeek) as well as OpenAI/Anthropic to get their models here.
I also did rough maths on the throughput these chips support - 64 Cerebras chips support around 500 RPS, which is pretty low and insignificant IMO.
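For scale, that estimate can be unpacked with a quick sketch. It uses only the two figures from the comment (64 chips, roughly 500 RPS), which are the commenter's rough estimate rather than measured values. The function names are made up for illustration, and sustained full load around the clock is an assumption.

```python
# Figures from the comment above: a rough estimate, not a benchmark.
CHIPS = 64
TOTAL_RPS = 500

SECONDS_PER_DAY = 24 * 60 * 60


def per_chip_rps(total_rps: float, chips: int) -> float:
    """Average requests per second handled by a single chip."""
    return total_rps / chips


def requests_per_day(total_rps: float) -> int:
    """Requests served in 24 hours, assuming sustained full load."""
    return int(total_rps * SECONDS_PER_DAY)


if __name__ == "__main__":
    print(f"{per_chip_rps(TOTAL_RPS, CHIPS):.2f} RPS per chip")
    print(f"{requests_per_day(TOTAL_RPS):,} requests per day")
```

Under those assumptions this works out to about 7.8 RPS per chip and 43.2 million requests per day across the whole cluster.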
> There's this stupid notion that you can just lean into AI slightly without committing
I don't see anything in this article that suggests how much they're going to commit to it. Domain specific models work great at specific things, that's the whole point of fine tunes.
In any case, this is completely unavoidable. AI is now munitions, in addition to an economic driver/tool. Depending on another country for your defense, economic development, governmental data/compute, etc. is a non-starter for any developed nation. They will all eventually have sovereign AI, it's a certainty. It will probably take 20 years for most nations to get there due to hardware shortages, but the genie is out of the bottle.
What's your evidence for this claim?
In short: there’s a reason OpenAI doesn’t have a healthcare model, a chemistry model, a mathematics model, etc. If they want a smaller model, they nerf all domains together, like in GPT nano. Why? Because intelligence compounds: if you take the capability for mathematics away from a model, you remove capabilities in all dimensions.
That being said, India’s built-from-scratch (not just fine-tuned) sovereign LLM Sarvam is actually the right direction. I’ve played around with the 26B-parameter model and it’s pretty solid.