
Discussion (10 Comments)
With the expected scale of inference, it makes cost sense to build dedicated hardware for each task if the workloads differ even slightly. Probably similar to how dedicated video-decoding chips in TVs are cheap and efficient compared to chips that can also encode video.
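The cost argument above can be sketched as a simple break-even calculation: a dedicated chip has a fixed one-time design cost (NRE) that pays off once per-unit savings, multiplied by volume, exceed it. All dollar figures below are hypothetical placeholders, not real chip economics.

```python
def break_even_units(nre: float, cost_general: float, cost_dedicated: float) -> float:
    """Units at which the dedicated chip's total cost matches the general-purpose one."""
    saving_per_unit = cost_general - cost_dedicated
    if saving_per_unit <= 0:
        return float("inf")  # dedicated chip never pays off
    return nre / saving_per_unit

# e.g. $50M NRE, $2,000 general-purpose part vs $800 dedicated part:
# break-even near 42,000 units, small next to inference-scale deployments.
units = break_even_units(50e6, 2000, 800)
```

At the unit counts implied by large-scale inference, even modest per-chip savings clear a large NRE quickly, which is the core of the comment's argument.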
The funny thing about scaling laws is that as soon as they were known, the whole objective became learning how to break them - bending the curve, at least. They provided an incredibly useful target, but 'law' was a bit too strong a word.
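As a concrete illustration of the "curve" being discussed, here is a minimal sketch of a Chinchilla-style scaling law (Hoffmann et al., 2022), where loss falls as a power law in model size and training tokens; the default constants are the published Chinchilla fits, rounded, and the "improved" exponent is a hypothetical example of bending the curve.

```python
def scaling_loss(n_params: float, n_tokens: float,
                 E: float = 1.69, A: float = 406.4, B: float = 410.7,
                 alpha: float = 0.34, beta: float = 0.28) -> float:
    """Predicted loss L(N, D) = E + A / N**alpha + B / D**beta."""
    return E + A / n_params**alpha + B / n_tokens**beta

# "Bending the curve" means improving the exponents, not just the constants:
# a method that raises alpha makes every added parameter reduce loss more.
baseline = scaling_loss(70e9, 1.4e12)            # Chinchilla-scale run
bent = scaling_loss(70e9, 1.4e12, alpha=0.40)    # hypothetical improvement
assert bent < baseline
```

The point in the comment is that once such a fit is published, it becomes a target: architectural or data changes that shift E, A, B, or the exponents are exactly what "breaking the law" means.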
There's no admission - this has always been known.
I would love to see an instruction set reference for one of these; all that's available is hardware architecture diagrams or high-level APIs.