
Discussion (14 Comments)
[0] https://news.ycombinator.com/item?id=47864835
And the active parameters come from the experts. For each token, the model picks a few experts to run the forward pass (usually 2 to 4; I haven't read V4's papers). It's not always the same experts.
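The routing idea in that comment can be sketched in a few lines. This is a toy top-k gate, not DeepSeek's actual router; the expert count, top-k, and dimensions here are made-up illustration values.

```python
import numpy as np

# Toy Mixture-of-Experts routing: for each token, score all experts,
# run only the top-k, and mix their outputs by softmaxed scores.
# num_experts/top_k/d_model are illustrative, not DeepSeek's real config.
rng = np.random.default_rng(0)
num_experts, top_k, d_model = 8, 2, 16

# Each "expert" here is just a small weight matrix standing in for an FFN.
experts = [rng.standard_normal((d_model, d_model)) for _ in range(num_experts)]
router = rng.standard_normal((d_model, num_experts))  # gating network

def moe_forward(token: np.ndarray) -> np.ndarray:
    """Route one token through its top-k experts and mix the outputs."""
    logits = token @ router              # score every expert
    top = np.argsort(logits)[-top_k:]    # indices of the k best-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()             # softmax over only the chosen experts
    # Only the chosen experts execute: this is why "active" params << total params.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

out = moe_forward(rng.standard_normal(d_model))
```

Because different tokens select different experts, every expert's weights must sit in memory even though only a fraction runs per token.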
OTOH, being DeepSeek, I foresee a bunch of V4-distilled FP8 models fitting in a 5090 with tiny batches, with performance in the range of 75 to 85% of V4. And that might be good enough for many everyday tasks.
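A back-of-the-envelope check on the "fits in a 5090" claim: FP8 stores one byte per parameter, so weights for an N-billion-parameter model need roughly N GB, against the RTX 5090's 32 GB of VRAM. The helper and its headroom fraction are assumptions for the sketch, not a real sizing tool.

```python
# FP8 = 1 byte/param, so billions of params ≈ GB of weights.
# vram_gb defaults to the RTX 5090's 32 GB; `overhead` reserves a
# (made-up) fraction of VRAM for KV cache and activations.
def fits_in_vram(params_billion: float, vram_gb: float = 32.0,
                 overhead: float = 0.8) -> bool:
    """True if FP8 weights fit within `overhead` fraction of VRAM."""
    weight_gb = params_billion  # 1 byte per parameter at FP8
    return weight_gb <= vram_gb * overhead
```

By this rough rule, an FP8 distill up to the mid-20-billion-parameter range leaves room for inference state on a 32 GB card.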
Today is a good day for open models. Thank god for DeepSeek.
"Pro" $3.48 / 1M output tokens vs $4.40 for GLM 5.1 or $4.00 for Kimi K2.6
"Flash" is only $0.28 / 1M and seems quite competent
(EDIT: Note that the endpoints opencode etc. hit on the DeepSeek API (deepseek-chat / deepseek-reasoner) appear to be "flash".)
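The price comparison above is easy to put side by side. This just arranges the per-million-output-token figures quoted in the comment; the 10M-token monthly volume is a hypothetical workload for illustration.

```python
# Per-million-output-token prices quoted in the discussion above.
prices = {
    "DeepSeek Pro": 3.48,
    "DeepSeek Flash": 0.28,
    "GLM 5.1": 4.40,
    "Kimi K2.6": 4.00,
}
output_tokens = 10_000_000  # hypothetical monthly usage

# Cheapest first: cost = price_per_million * tokens / 1M.
for name, per_million in sorted(prices.items(), key=lambda kv: kv[1]):
    cost = per_million * output_tokens / 1_000_000
    print(f"{name}: ${cost:.2f}")
```

At these rates, "flash" comes out roughly 12x cheaper per output token than "pro", which is why the default endpoint note above matters for cost estimates.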