
Discussion (1555 Comments) · Read Original on HackerNews

hodgehog11 · 2 days ago
There are quite a few comments here about benchmark and coding performance. I would like to offer some opinions regarding its capacity for mathematics problems in an active research setting.

I have a collection of novel probability and statistics problems at the masters and PhD level with varying degrees of feasibility. My test suite involves running these problems through the model first (often with about 2-6 papers for context) and then requesting a rigorous proof as a follow-up. Since the problems are pretty tough, there is no quantitative measure of performance here; I'm just judging based on how useful the output is toward outlining a solution that would hopefully become publishable.

Just prior to this model, Gemini led the pack, with GPT-5 as a close second. No other model came anywhere near these two (no, not even Claude). Gemini would sometimes have incredible insight for some of the harder problems (insightful guesses on relevant procedures are often most useful in research), but both of them tend to struggle with outlining a concrete proof in a single followup prompt. This DeepSeek V4 Pro with max thinking does remarkably well here. I'm not seeing the same level of insights in the first response as Gemini (closer to GPT-5), but it often gets much better in the followup, and the proofs can be _very_ impressive; nearly complete in several cases.

Given that both Gemini and DeepSeek also seem to lead on token performance, I'm guessing that might play a role in their capacity for these types of problems. It's probably more a matter of just how far they can get in a sensible computational budget.

Despite what the benchmarks seem to show, this feels like a huge step up for open-weight models. Bravo to the DeepSeek team!

segmondy · 2 days ago
They have had the best math models for about a year; most folks just didn't know about it. You can't find inference on APIs, but I run these at home. This is also the advantage of open models.

https://huggingface.co/deepseek-ai/DeepSeek-Math-V2
https://huggingface.co/deepseek-ai/DeepSeek-Prover-V2-671B

simonjgreen · about 21 hours ago
You are of course specifically referring to the math-optimised models, not the chat ones folks would generally encounter. Not that I'm trying to contradict you; your point is super valid and I agree with you! But I'm adding this for anyone following along who may be making choices.

This is when it happened for anyone interested: https://binaryverseai.com/deepseek-math-v2-benchmarks-review...

jug · about 17 hours ago
Shouldn't one use, e.g., a Wolfram Alpha MCP endpoint for math in AI? From what I've seen on even premium non-quantized models, I would never ever trust the innate ability of an LLM to calculate.
lowbloodsugar · 1 day ago
You run a 671B model at home?
segmondy · 1 day ago
Yes, and plenty of others do too. Quantized. Join us at r/localllama

My largest models

   318G    /llmzoo/models/Qwen3.5-397B
   377G    DeepSeekv3.2-nolight
   380G    /llmzoo/models/DeepSeek-V3.2-UD
   400G    /llmzoo/models/Qwen3.5-397B-Q8
   443G    DeepSeek-Math-v2
   443G    DeepSeek-V3-0324-Q5
   522G    /llmzoo/models/GLM5.1
   545G    /llmzoo/models/kimi2.6
   546G    /llmzoo/models/KimiK2.5
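A rough sanity check on those directory sizes: a quantized model's on-disk footprint is approximately parameters times bits per weight divided by 8. A minimal sketch (the bit-width is an illustrative average; real quantized files add metadata and keep some tensors at higher precision, and `du` reports GiB rather than GB):

```python
def approx_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Rough on-disk size of a quantized model in GB: params * bits / 8."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# A 671B-parameter model at an average of ~4.5 bits/weight
# (a typical 4-bit quant with mixed precision) lands near 377 GB,
# in the same ballpark as the listings above.
print(round(approx_size_gb(671, 4.5)))  # 377
```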
tclancy · 1 day ago
It's a big house.
UncleOxidant · 1 day ago
Maybe if there was a 1-bit quant.
verdverm · 2 days ago
Vertex AI has had DeepSeek available via API for a while.
segmondy · 1 day ago
I'm talking about their specialized math models, not the general model.
PhilippGille · 2 days ago
When you say "Gemini", which exact model do you mean? You know there are several, and they vary a lot in how capable they are? Pro 3.1 Preview, 2.5 Pro (their latest non-preview pro model), Flash 3 Preview, ...

Same with GPT-5: Latest 5.5, prior 5.4, or actually the original 5 (.0)?

You can't talk about model performance without specifying the exact model.

hodgehog11 · 1 day ago
My apologies, I thought it would be implicit that I am using the top-tier model of the time given the challenge of the tasks. GPT-5.5 was too new in this top comment (although I did test it a bit in a comment below), so I was using GPT-5.4. Gemini is Pro 3.1 Preview.
WarmWash · 1 day ago
Strong bet on 3.1 Pro. I use it a lot for math and classic engineering; it's very strong.
ozgune · 2 days ago
I reviewed how DeepSeek V4-Pro, Kimi 2.6, Opus 4.6, and Opus 4.7 compare across the same AI benchmarks. All results are for Max editions, except for Kimi.

Summary: Opus 4.6 forms the baseline all three are trying to beat. DeepSeek V4-Pro roughly matches it across the board, Kimi K2.6 edges it on agentic/coding benchmarks, and Opus 4.7 surpasses it on nearly everything except web search.

DeepSeek V4-Pro Max shines in competitive coding benchmarks. However, it trails both Opus models on software engineering. Kimi K2.6 is remarkably competitive as an open-weight model. Its main weakness is in pure reasoning (GPQA, HMMT) where it trails Opus.

Speculation: The DeepSeek team wanted to come out with a model that surpassed proprietary ones. However, OpenAI dropped 5.4 and 5.5 and Anthropic released Opus 4.6 and 4.7. So they chose to just release V4 and iterate on it.

Basis for speculation? (i) The original reported timeline for the model was February. (ii) Their Hugging Face model card starts with "We present a preview version of DeepSeek-V4 series". (iii) V4 isn't multimodal yet (unlike the others) and their technical report states "We are also working on incorporating multimodal capabilities to our models."

solenoid0937 · 1 day ago
I feel like people suck at prompting Opus. Baseline, it's pretty on par with GPT 5.5.

But if you prompt it well - give it the reasoning behind why you're asking it to do something - it pulls far ahead.

hodgehog11 · 1 day ago
That's fine for procedural tasks, and I understand its value there. But these particular tasks I'm referring to occur on the front lines of research. You can't expect the prompts to be incredibly detailed, since those details are the whole challenge of the problem. I think there is value in having models that are capable of making really good preliminary insights to help guide the research.
cultofmetatron · about 22 hours ago
I really wanted to get excited about Opus, but in my own real-world usage I wasn't getting much out of it before hitting my limits. Meanwhile, I can abuse Codex on 5.5 for hours, getting a whole lot of work done. Plus, OpenCode and PI are much more fun and interesting harnesses to work from than Claude Code, imho.

I will however say that Claude's work and design are really great, up until I blow its limit.

arcanemachiner · about 22 hours ago
Would love to know how GLM 5.1 stacks up in this ranking. Seems like it's on par with Kimi K2.6.
bbertelsen · 1 day ago
I'd be interested to know when that Opus 4.6 baseline is from, given their recent recognition of performance issues. Do you have a paper posted on this review?
ozgune · 1 day ago
Ack. I took the benchmark results that the AI labs themselves published for their models. So the Opus 4.6 baseline would be from the time that Anthropic released the model.
lifty · 2 days ago
Wondering how GPT 5.5 is doing in your test. Happy to hear that DeepSeek has good performance in your test, because my experience seems to correlate with yours for the coding problems I am working on. Claude doesn't seem to be so good if you stray away from writing HTTP handlers (the modern web app stack in its various incarnations).
hodgehog11 · 2 days ago
Very cool to hear there is agreement with (probably quite challenging?) coding problems as well.

Just ran a couple of them through GPT 5.5, but this is a single attempt, so take any of this with a grain of salt. I'm on the Plus tier with memory off so each chat should have no memory of any other attempt (same goes for other models too).

It seems to be getting more of the impressive insights that Gemini got and doing so much faster, but I'm having a really hard time getting it to spit out a proper lengthy proof in a single prompt, as it loves its "summaries". For the random matrix theory problems, it also doesn't seem to adhere to the notation used in the documents I give it, which is a bit weird. My general impression at the moment is that it is probably on par with Gemini for the important stuff, and both are a bit better than DeepSeek.

I can't stress enough how much better these three models are than everything else, though (at least on my type of math problems). Claude can't get anything nontrivial on any of the problems within ten (!!) minutes of thinking, so I have to shut it off before I run into usage limits. I have colleagues who love using Claude for tiny lemmas and things, so your mileage may vary, but it seems pretty bad at the hard stuff. Kimi and GLM are so vague as to be useless.

lifty · 2 days ago
My work is on a p2p database with quite weird constraints and complex, emergent interactions between peers. So it's more a system design problem than coding. ChatGPT 5.x has been helping me close the loop slowly, while Opus did help me a lot initially but later missed many of the important details, leading to going in circles to some degree. It still remains to be seen if this whole endeavour will be successful with the current class of models.
wohoef · about 21 hours ago
Do you have an idea of how well these models perform on set theory problems or more niche fields in mathematics? So the model would have to both understand a paper that's not in its training data, and use this to write proofs.
giwook · 1 day ago
Doesn't the Plus tier not have access to their best (Pro) model?
chinadata · about 6 hours ago
Yes, DeepSeek can really help save money.
alansaber · 2 days ago
Very interesting. I wonder how much of this is due to the context length. I am unclear on the implementation strategy: did you run this problem as a one-shot in chat mode, or use each model in an agent harness?
segmondy · 2 days ago
It has nothing to do with context length; they have experience training math models. They have a model that would take gold at the IMO, and a Lean prover. Both have been out for almost a year.
dataviz1000 · about 21 hours ago
> there is no quantitative measure of performance here

Have them do multiplication or other complicated arithmetic. You might say that isn't difficult; then why do they burn 200k tokens in 20 minutes without converging? I did a deep exploration to help myself understand here [0].

[0] https://adamsohn.com/reliably-incorrect/
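The contrast the comment draws is easy to demonstrate: arithmetic that burns an LLM hundreds of thousands of tokens is exact and instant for ordinary code, which is the usual case for routing calculation out to a tool. A minimal sketch:

```python
# Arbitrary-precision integer arithmetic is exact and instant in Python,
# while LLMs routinely drift on operands this size.
a = 987_654_321_987_654_321
b = 123_456_789_123_456_789
product = a * b

# Verifying a model's claimed answer is just as cheap: recompute and compare.
def check_product(x: int, y: int, claimed: int) -> bool:
    return x * y == claimed

print(check_product(a, b, product))  # True
```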

bnm04 · 2 days ago
Have you also tried the Pro versions of ChatGPT and Gemini (Deep Think)?
hodgehog11 · 1 day ago
Yes to both; I'm paying for them and use the top-tier thinking models.
nibbleyou · 2 days ago
Curious to know what kind of problems you are talking about here.
hodgehog11 · 2 days ago
I don't want to give away too much due to anonymity reasons, but the problems are generally in the following areas (in order from hardest to easiest):

- One problem on using quantum mechanics and C*-algebra techniques for non-Markovian stochastic processes. The interchange between the physics and probability languages often trips the models up, so pretty much everything tends to fail here.

- Three problems in random matrix theory and free probability; these require strong combinatorial skills and a good understanding of novel definitions, requiring multiple papers for context.

- One problem in saddle-point approximation; I've just recently put together a manuscript for this one with a masters student, so it isn't trivial either, but does not require as much insight.

- One problem pertaining to bounds on integral probability metrics for time-series modelling.

MinimalAction · 2 days ago
Regarding the first problem: are you looking at NCP maps for non-Markovian processes given you mention C*-algebra? Or is it more of a continuous weak monitoring of a stochastic system that results in dynamics with memory effects?

I'd be very curious to know how any LLMs fare. I completely understand if you don't want to continue the discussion because of anonymity reasons.

pm2r · 2 days ago
It would be wonderful to have deeper insight, but I understand that you can't disclose your identity (I take it you work in an applied research field, right?)
fuddle · 1 day ago
Any plans to publish the benchmark results?
hodgehog11 · 1 day ago
I have plans to publish the problems, but no plans to publish how well the LLMs perform on them. The standard for publishing benchmarks is very high, and I'm really just posting vibes here. Still, I hope my experiences are useful to some people, as others' experiences have been useful to me.
throwa356262 · 2 days ago
Seriously, why can't huge companies like OpenAI and Google produce documentation that is half this good??

https://api-docs.deepseek.com/guides/thinking_mode

No BS, just a concise description of exactly what I need to write my own agent.

u_sama · 2 days ago
I am very partial to Mistral's API docs: https://docs.mistral.ai/api
eshack94 · 1 day ago
Agreed, they also have great documentation. There's something to be said for documentation that is so concise, well laid out, and immediately actionable for those looking to get started quickly.
lykr0n · 2 days ago
It's because they're optimizing for a different problem.

Western models are being optimized to be used as an interchangeable product. Chinese models are being optimized to be built upon.

Barbing · 2 days ago
>Western Models are optimizing to be used as an interchangeable product.

But so much investment in their platforms, not just their APIs?

raincole · 2 days ago
[flagged]
setr · 2 days ago
First you clone the API of the winner, because you want to siphon users from its install-base and offer de-risked switch over cost.

Now that you’re winning, others start cloning your API to siphon your users.

Now that you’re losing, you start cloning the current winner, who is probably a clone of your clone.

Highly competitive markets tend to normalize, because lock-in is a cost you can’t charge and remain competitive. The customer holds power here, not the supplier.

That's also why everyone is trying to build into the less competitive spaces, where they could potentially moat. Tooling, certs, specialized training data, etc.
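In the LLM API market this cloning dynamic is concrete: most providers expose an OpenAI-style chat-completions request shape, so the "switching cost" is often little more than a base URL and model name. A sketch (the endpoints and model names here are illustrative, not a claim about any provider's current API):

```python
def chat_payload(model: str, prompt: str) -> dict:
    """OpenAI-style chat completion body that many providers accept as-is."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Same payload shape for different vendors; only the endpoint and model differ.
providers = {
    "openai":   ("https://api.openai.com/v1/chat/completions", "gpt-4o-mini"),
    "deepseek": ("https://api.deepseek.com/chat/completions", "deepseek-chat"),
}
for name, (url, model) in providers.items():
    body = chat_payload(model, "Hello!")
    print(name, url, sorted(body.keys()))
```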

hunter67 · 2 days ago
Our (Western) economic model forces competing individual companies to be profitable quickly. China can ignore DeepSeek losing money, because they know developing DeepSeek will help China. Not every institution needs to be profitable.
FuckButtons · 2 days ago
Yes, they want to win the same way they won more or less every other economic competition in the last 30 years: scale out, drop prices, and asphyxiate the competition.
simonjgreen · 2 days ago
Yeah, it’s an interesting one. I think inertia and expectations at this point? I don’t think the big labs anticipated how low the model switching costs would be and how quickly their leads would be eroded (by each other and the upstarts)

They are developing their moats with the platform tooling around it right now though. Look at Anthropic with Routines and OpenAI with Agents. Drop that capability in to a business with loose controls and suddenly you have a very sticky product with high switching costs. Meanwhile if you stick with purely the ‘chat’ use cases, even Cowork and scheduled tasks, you maintain portability.

tick_tock_tick · 2 days ago
They are all racing to AGI. They aren't designing them to be interchangeable; they just happen to be.
peepee1982 · 2 days ago
If you want other people to know whether you're being genuine or sarcastic, you'll have to put a bit more effort into your comments. Your comment just adds noise.
madduci · about 20 hours ago
For me, DeepSeek has been the best so far in terms of coding skills, performance, and documentation all together. Too bad it is flagged as 'concerning' when it comes to privacy, while on the other hand Gemini, ChatGPT, and Claude are far worse in that respect, especially their mobile apps, which require a lot of permissions.
vitorgrs · 2 days ago
Meanwhile, they don't actually say which model you are running on the DeepSeek Chat website.
alansaber · 2 days ago
Because they produce revenue from products which abstract this away
Alifatisk · 2 days ago
You might enjoy Z.ai's API docs as well.
kubb · 2 days ago
Western orgs have been captured by Silicon Valley-style patrimonialism, and aren't based on merit anymore.
kccqzy · 2 days ago
I spent only two minutes reading their documentation and it’s clear no one did any proofreading and it’s full of mistakes made by non-native speakers.

Example: the second sentence on the first page says “softwares” but “software” is a mass noun that cannot be pluralized.

Example: the third page about tokens has some zipped code to “calculate the token usage for your intput/output” and obviously “intput” should be “input” but misspelled.

As a company that produces LLMs, they could have even used their own LLM to edit their documentation to fix grammar issues, and yet they did not.

Maybe I’m just extra sensitive to grammar and spelling issues but this kind of lack of attention to detail is a huge subconscious turnoff. I had to fight my urge to close the tab.

Maxatar · 1 day ago
Yeah, I think those details are the least of most people's concerns. I can't vouch one way or another for DeepSeek's documentation, but for me what matters most when reading documentation is being able to get the information I want efficiently, not whether someone spelled "software" as "softwares", which is a very common spelling in Asia, as an FYI.

I read OpenAI or Anthropic's documentation nowadays and it's just so full of useless junk and self-congratulation that it makes for a miserable experience to go through. It's a real shame, because OpenAI used to write stellar documentation and publish really lucid papers just a few years ago.

aprdm · 1 day ago
No one cares about this kind of stuff. 99% of devs are not native English speakers; what do you expect? It works and we can all understand it.
kccqzy · 1 day ago
I try hard not to care but subconsciously spelling errors and grammar issues scream low-quality work to me. It’s the kind of mistake that’s the easiest to correct, and they didn’t bother.
amluto · 2 days ago
The tool calling Python example would have benefitted from actually parsing the tool call. As is, it explains almost nothing.
dackdel · 1 day ago
I don't think DeepSeek will ever recover from this. Huge loss for them. They will stop the pursuit of AGI because of one HN user and a comma.
squirrellous · 1 day ago
This tells me a real developer wrote the docs, instead of someone with good English writing skills who is less technical.

> they could have even used their own LLM to edit their documentation to fix grammar issues

In my experience companies who do this rarely stop at using LLMs to fix grammar issues. It becomes full on LLM speak quite fast, especially if there isn’t a native English speaker in the room who can discern what’s good and bad writing.

replwoacause · 1 day ago
pedantry
slopinthebag · 1 day ago
I prefer it because it indicates they didn't use an LLM to write their documentation and that it's human-generated.
jen20 · 1 day ago
> Example: the second sentence on the first page says “softwares” but “software” is a mass noun that cannot be pluralized.

I constantly see and hear this mistake from actual humans too.

It's fairly ironic that your own comment contains run-on sentences, speculative claims and phrasing peculiarities like "could have even" instead of "could even have". Perhaps you are less sensitive to this than you think!

angry_octet · 1 day ago
There is a difference between conversational speech and formal speech like documentation. It isn't rational to criticise use of the former when such speech is complaining about errors in the latter.

It's strange that you criticise "could have even" when it is a phrasing clearly being used for emphasis. "Could even have" makes no clearer sense in context.

No irony detected.

ChrisClark · 1 day ago
Nobody cares; we're talking about quality documentation here, not a couple of spelling mistakes.
orbital-decay · 2 days ago
>we implement end-to-end, bitwise batch-invariant, and deterministic kernels with minimal performance overhead

Pretty cool. I think they're the first to guarantee determinism with a fixed seed or at temperature 0. Google came close but never guaranteed it, AFAIK. DeepSeek shows its roots: it may not strictly be a SotA model, but there's a ton of low-level optimization nobody else pays attention to.

whatreason · 1 day ago
There have been others for sure, but I'm not sure who was first: https://vllm-website-pdzeaspbm-inferact-inc.vercel.app/blog/...
oofbey · about 17 hours ago
Nobody does it because it's expensive. If you remove the requirement for perfect reproducibility, you open the door to lots of optimizations. Most people prefer faster, cheaper results over perfect reproducibility. When the model is intrinsically statistical, the value of perfect reproducibility is … limited.
orbital-decay · about 15 hours ago
Yeah, of course. Making it cheap/compatible with heavy batching is exactly what they did, that's what I mean. ("with minimal performance overhead")
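The underlying problem those kernels fix is that floating-point addition is not associative: change the reduction order (as different batch sizes and parallel reduction trees do) and the low bits change, so even temperature-0 sampling can diverge. A minimal illustration:

```python
# The same three numbers, summed in two different orders.
a = (1e16 + 1.0) + -1e16   # 1.0 is absorbed by 1e16 first -> 0.0
b = (1e16 + -1e16) + 1.0   # cancellation happens first    -> 1.0
print(a, b)  # 0.0 1.0
```

Batch-invariant kernels pin the reduction order so results are bitwise identical regardless of how requests are batched.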
chenzhekl · 2 days ago
It's interesting that they mentioned in the release notes:

"Limited by the capacity of high-end computational resources, the current throughput of the Pro model remains constrained. We expect its pricing to decrease significantly once the Ascend 950 has been deployed into production."

https://api-docs.deepseek.com/zh-cn/news/news260424#api-%E8%...

XCSme · 2 days ago
Yup, I tried to benchmark it, but harder questions time out or get rate-limited...
nsoonhui · 2 days ago
Sorry, but where exactly in the article that you linked is the mention of "Ascend 950"?
chenzhekl · 2 days ago
It's in the footnote text of the first figure of the section the link points to, where "昇腾950" means "Ascend 950".
nsoonhui · 2 days ago
OK, strange that it doesn't appear on my version of the webpage

https://api-docs.deepseek.com/zh-cn/news/news260424#api-%E8%...

This is the first figure of the section that the above link points to (https://api-docs.deepseek.com/zh-cn/img/v4-spec.png).

And I can read Chinese.

gertlabs · 1 day ago
Objective, detailed benchmark results at https://gertlabs.com

Early takeaway from this release: DeepSeek V4 Flash is the model to pay attention to. It's cheap, effective, and REALLY fast.

The Pro model is slow, not much better at coding or reasoning so far when it works, and honestly too unreliable and rate-limited to be of much use currently. Hopefully that improves as new providers host the model. Flash is working fine and is currently performing competitively with recent releases, but only on agentic workflows. Check back in 24 hours for full combined scoring with tool use and long context for both models.

Many of the frontier Chinese AI labs have released near-frontier models that are just a little bit behind Opus 4.6 in terms of speed, tool use ability, or long context handling. Open weights are winning the AI race, led by China. Crazy couple weeks of releases.

Mimo V2.5 Pro by Xiaomi (not open weights) is actually the best performer of the latest string of Chinese releases in our combined, comprehensive benchmarks, despite getting less attention. Kimi K2.6 is the most interesting open weights release, still. DeepSeek is not the leader in the space anymore.

An interesting pattern with the latest string of Chinese releases is the much better agentic boost (the models are not as smart out of the box, but their ability to iterate in a loop with tools makes up most of the difference). DeepSeek V4 Flash exemplifies this: not a smart model on the first try, but it makes up for it over the course of a session.

Squarex · 1 day ago
I would say all benchmarks are inherently subjective. How is yours better? It seems to produce slightly strange results: Opus 4.6 being worse than 4.5, for example, or Chinese models being rated too high. Kimi, DeepSeek, and GLM are all great in the open-source world, but I don't believe they are ahead of SOTA models from Anthropic, OpenAI, or Google.
gertlabs · 1 day ago
No, some benchmarks are definitely objective, but most can be easily gamed. For example, most of the benchmarks on the model cards: they have measurable answers that don't rely on a human judge (a human made the question, but the answers measure some uncontroversial knowledge or capability). But because there is a single, correct answer, and those answers leak (or are randomly discovered and optimized for in training), they lose value over time, and regardless, they have a ceiling on the intelligence they can measure.

Others are purely subjective, like LMArena, which really only measures the personality and style preferences of the masses at this point, because frontier LLM technical answers are too hard for the average person to judge.

Then there are some interesting one-off benchmarks, but they lack enough rigor, breadth, and samples to draw larger conclusions from.

So we designed our benchmark with 3 goals: objective measurements (individual submissions not dependent on a human or LLM judge), no known correct answer (so simulations can scale to much higher levels of intelligence), and enough variety over important aspects of intelligence. We do this by running multiple models in cooperative/competitive environments with very complex action spaces and objective scoring, where model performance is relative and affected by the actions of other participants.

And yeah, there are some interesting results when you have a more objective benchmark. It should raise eyebrows when every single sub-release of every company's model is better across the board than its predecessor; that isn't reality.
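One common way to score "relative performance affected by other participants" is an Elo-style update after each head-to-head. This is a generic sketch of that idea, not necessarily what gertlabs actually uses:

```python
def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """Update two ratings after a match; score_a is 1 (win), 0.5 (draw), 0 (loss)."""
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    delta = k * (score_a - expected_a)
    return r_a + delta, r_b - delta

# Two equal-rated models: the winner gains k/2 = 16 points.
a, b = elo_update(1500.0, 1500.0, 1.0)
print(a, b)  # 1516.0 1484.0
```

Because ratings only move relative to opponents, the scale has no fixed ceiling, which is one way a benchmark can keep discriminating as models improve.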

Squarex · 1 day ago
The word "objective" just seems too authoritative to me.
segmondy · 1 day ago
You are arguing from your beliefs instead of objective truth. The benchmark is more objective; if you don't agree with it, come up with a better one. What you believe doesn't matter.
Squarex · 1 day ago
It was not a confrontational take. But all benchmarks are designed by humans, and we are not that great at measuring intelligence, so it is somewhat subjective. I was just arguing with the word "objective", not with the results per se.
tw1984 · about 21 hours ago
I agree that benchmarks are inherently subjective.

But the fact that you cite your belief as your main argument is funny: you don't even have any inherently subjective numbers to justify what you believe; you only have "I don't believe".

Squarex · about 14 hours ago
Sure, I have mixed up two things together. I don't think this benchmark is bad; I just did not like that it is presented as the ultimate objective truth. The other thing I mentioned is that it delivers different results from other benchmarks, so the "believe" stems from other benchmarks.
dandaka · 1 day ago
Interesting that you rate Claude Opus 4.6 lower than 4.5 and 4.7, while community consensus puts it on top.
nostrebored · 1 day ago
I think most hardcore people I know are still sticking with 4.5 for coding workflows
kamranjon · 1 day ago
I'm particularly interested in it being REALLY fast - do you have any rough tok/s numbers for the flash model? I'm excited for unsloth to drop some quants that I can try and run locally, but really curious how it's been performing speed wise. In general I actually over-index on speed over intelligence. I'd rather a model make mistakes quickly and correct in a follow-up than take forever to get a slightly better initial result.
gertlabs · 1 day ago
Take a look at the Time column in https://gertlabs.com/?mode=oneshot_coding -- this is the total time to complete a solution for a reasonably complex problem end-to-end (you would have to divide by avg submission size to estimate tok/s). It's fast in the sense that most of the smart, recent Chinese releases are quite slow, especially the DeepSeek Pro variant. Opus 4.7 is also quite fast.

If pure speed is most important for your use case, GPT-5.3 Chat is the fastest model we've tested and it's still reasonably smart. Not meant for agentic tool usage / long context, though.

So it might be more useful for business applications or non-engineering usage where you don't need exceptional intelligence, but it's useful to get fast, cheap responses.

Lord_Zero · 1 day ago
Why no mention of GPT-5.5?
gertlabs · 1 day ago
Waiting on public API release. Once it drops, results will be up within 24 hours.
gertlabs · 1 day ago
Results are up. GPT 5.5 is a beast.
revolvingthrow · 2 days ago
> pricing "Pro" $3.48 / 1M output tokens vs $4.40

I’d like somebody to explain to me how the endless comments of "bleeding edge labs are subsidizing the inference at an insane rate" make sense in light of a humongous model like v4 pro being $4 per 1M. I’d bet even the subscriptions are profitable, much less the API prices.

edit: $1.74/M input $3.48/M output on OpenRouter
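At the OpenRouter rates quoted above, per-request economics are easy to sanity-check (token counts here are illustrative):

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     in_per_m: float = 1.74, out_per_m: float = 3.48) -> float:
    """Cost of one API call at per-million-token rates (defaults from the comment above)."""
    return input_tokens / 1e6 * in_per_m + output_tokens / 1e6 * out_per_m

# A hefty call: 50k tokens in, 5k tokens out -> about 10 cents.
print(round(request_cost_usd(50_000, 5_000), 4))  # 0.1044
```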

schneehertz · 2 days ago
This price is high only because of the current shortage of inference cards available to DeepSeek; they claimed in their press release that once the Ascend 950 compute cards launch in the second half of the year, the price of the Pro version will drop significantly.
Bombthecat · 2 days ago
In six months DeepSeek won't be SOTA anymore and usage will be wayyyy down.
randomgermanguy · 2 days ago
Only comparing on SOTA scores (ignoring price etc.) is like choosing your daily driver by looking at who makes the fastest sports car...
2ndorderthought · 2 days ago
A huge proportion of those scores are gamed anyways. Use whatever works for you at the price and availability you can afford.
Palmik · 2 days ago
Or there will be DSv4.1/2/3 ;)
Barbing · 2 days ago
Well, if they distilled once…
menzoic · 2 days ago
API prices may be profitable. Subscriptions may still be subsidized for power users. Free tiers almost certainly are. And frontier labs may be subsidizing overall business growth, training, product features, and peak capacity, even if a normal metered API call is profitable on marginal inference.
dannyw · 2 days ago
Research and training costs have to be amortized from somewhere, and labs are always training. I'm definitely keen for the financials when the two file for IPO, though; it would be interesting to see, although I'm sure it won't be broken down much.
m00x · 2 days ago
They are profitable relative to opex, but not capex, under current depreciation schedules, though those are now edging higher than expected.
nl · 2 days ago
Amazingly, current depreciation schedules underestimate the retained value of GPUs.

In 2023, the depreciation schedule for H100s was 2 years, but they are still oversubscribed and generating significant income.

CoreWeave has upped their depreciation for GPUs to 6 years(!) now, which seems more realistic.

https://www.silicondata.com/blog/h100-rental-price-over-time
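For scale, here is what moving from a 2-year to a 6-year straight-line schedule does to the annual expense, on an illustrative (not quoted) hardware cost:

```python
def annual_depreciation(cost: float, years: int, salvage: float = 0.0) -> float:
    """Straight-line depreciation: equal annual expense down to salvage value."""
    return (cost - salvage) / years

gpu_cost = 30_000.0  # hypothetical per-GPU cost, for illustration only
print(annual_depreciation(gpu_cost, 2))  # 15000.0 per year on a 2-year schedule
print(annual_depreciation(gpu_cost, 6))  # 5000.0 per year on a 6-year schedule
```

Stretching the schedule cuts the yearly expense by a factor of three, which is exactly why the choice of schedule dominates whether inference looks profitable on paper.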

amunozo · 2 days ago
I was thinking the same. How can it be that other providers can offer third-party open-source models of roughly similar quality to this, like Kimi K2.6 or GLM 5.1, for ten times less? How can it be that GPT 5.5 is suddenly twice the price of GPT 5.4 while being faster? I don't believe it's a bigger, more expensive model to run; they're just starting to raise prices because they can and their product is good (which is honest as long as they're transparent about it). Honestly, the narrative about subscriptions costing the company 20 times more than we're paying is just PR to justify the price hike.
peepee1982 · 2 days ago
I'm pretty sure OpenAI and Anthropic are overpricing their token-billed API usage mainly as an incentive to commit to their subscriptions instead.
simonjgreen · 2 days ago
Anthropic recently dropped all inclusive use from new enterprise subscriptions; your seat sub gets you a seat with no usage. All usage is then charged at API rates. It's like a worst of both worlds!
weird-eye-issue · 2 days ago
The target audience for the APIs is third-party apps, which are not compatible with the subscriptions.
adam_patarino · 2 days ago
Prices are not just the hard cost of inference. Training costs are not equal. Chinese labs have cheaper access to large data centers. I also suspect they operate far more efficiently than orgs like OpenAI.
mirzap · 2 days ago
My thoughts exactly. I also believe that subscription services are profitable, and the talk about subsidies is just a way to extract higher profit margins from the API prices businesses pay.
Bombthecat2 days ago
Google stated a while back that, with TPUs, they are able to sell at cost or with profit.

In other words: everyone who uses Nvidia can't sell at cost, because Nvidia is so expensive.

LinXitoW2 days ago
They got loans to buy inference hardware on the promise of potential AGI, or at least something approaching ASI, all leading to stupid amounts of profit for those investors.

We therefore cannot just look at inference costs directly, training is part of the pitch. Without the promises of continuous improvement and chasing the elusive AGI, money for investments for inference evaporates.

WarmWash1 day ago
Because you are comparing China to the US.

In China you need to appease state goals. In the US you need to appease investor goals.

China will keep funding them regardless of their income, because the goal is (ostensibly) a state AGI/ASI. In the US, the goal is an ROI which may or may not come with AGI/ASI.

They are different economies with different goals. We can look at past Chinese national projects and see that they are fine with burning $50 to get [social goal] that's worth $5.

ting01 day ago
This is nonsense. The real reason is because the US companies are scamming the public, as per usual.
vitorgrs2 days ago
And they actually say the prices will be "significantly" lower in the second half of the year, when the Huawei Ascend 950 chips come online.
raincole2 days ago
Insert the "always has been" meme.

But seriously, it just stems from the fact some people want AI to go away. If you set your conclusion first, you can very easily derive any premise. AI must go away -> AI must be a bad business -> AI must be losing money.

louiereederson2 days ago
It is possible to question the sustainability of the AI buildout and not have a dogmatic position on AI development.

There are still major unanswered questions here. For instance, all of the incremental data capacity build out is going to businesses that have totally unknown LT unit economics and that today are burning obscene amounts of cash.

evilos1 day ago
The people who doubted the sustainability of dot com era bubbles were correct even though the tech was actually transformational. Personally I expect roughly the same outcome.
zarzavat2 days ago
Before the AI bubble that will burst any time now, there was the AI winter that would magically arrive before the models got good enough to rival humans.
jimmydoe2 days ago
They’ve also announced the Pro price will drop further in 2H26 once they have more Huawei chips.
masafej5362 days ago
Point taken, but there aren't any Western providers there yet. Power is cheaper in China.
3uler2 days ago
These models are open and there are tons of western providers offering it at comparable rates.
NitpickLawyer2 days ago
As this is a new arch with tons of optimisations, it'll take some time for inference engines to support it properly, and we'll see more 3rd party providers offer it. Once that settles we'll have a median price for an optimised 1.6T model, and can "guesstimate" from there what the big labs can reasonably serve for the same price. But yeah, it's been said for a while that big labs are ok on API costs. The only unknown is if subscriptions were profitable or not. They've all been reducing the limits lately it seems.
ithkuil2 days ago
Is there evidence that frontier models at Anthropic, OpenAI, or Google are not using comparable optimizations to drive down their costs, and that their markup is just higher because they can?
persedes2 days ago
Not soooo much, though. It's heavily subsidized for residential consumption, but industrial power rates are almost comparable to the US (depending on the state, etc.).
ting01 day ago
They don't make sense; they're a lie that these AI companies keep spamming using bots so that useful idiots perpetuate it, so that they can keep draining us of money. Straight out of the Anthropic handbook. They've always been cheap to run. I wouldn't be surprised if Anthropic is running for under $1 per 1M tokens.
dminik2 days ago
I mean, not one "bleeding edge" lab has stated they are profitable. They don't publish financials aside from revenue. And in Anthropic's case, they fuck with pricing every week. Clearly something is wrong here.
npn2 days ago
You know, if you didn't have to pay insane salaries for your top engineers, and didn't have to pay billions for internet shills to control the narrative, then all of the labs would be insanely profitable.
crazylogger2 days ago
I haven't seen anyone claiming that API prices are subsidized.

At some point (from the very beginning until ~2025Q4) Claude Code's usage limit was so generous that you could get roughly $10-20 (API-price-equivalent) worth of usage out of a $20/mo Pro plan each day (2 * 5h windows) - and for good reason: LLM agentic coding is extremely token-heavy, and people simply wouldn't return to Claude Code a second time if the provided usage wasn't generous or if every prompt cost them $1. And then Codex started trying to poach Claude Code users by offering even greater limits and constantly resetting everyone's limit in recent months. The API price would have to be 30x operating cost to make this not a subsidy. That would be an extraordinary claim.

nl2 days ago
The claim that APIs are subsidized is very common.

eg:

Token prices are significantly subsidized and anyone that does any serious work with AI can tell you this.

https://news.ycombinator.com/item?id=47684887

(the claims don't make any sense, but they are widely held)

vessenes2 days ago
I’ll note that it’s common and dangerous, in that there’s a generation of engineers who are at risk of leading each other astray as to the economics, and therefore the probability distribution of outcomes, of firms that will massively impact their careers.

I think I understand the major reasons for this meme, but I find it really worrying; there were lots of incorrect ‘it’s a bubble’ conversations here in 2012-2015, but I don’t think they had the pervasive nature and “obvious” conclusion that a whole generation of engineering talent should just, you know, leave.

Meanwhile I am hearing rational economic modeling from the companies selling inference; Jensen (a polished promoter, I grant you) says it really well: token value is increasing radically, in that new models -> better quality, and therefore revenues and utilization are increasing; and therefore, contrary to the popular financial and techbro modeling of 2023, things like A100s still cost quite a lot, whether hourly or to purchase. (!) Basically the economic value is so strong that it has actually radically extended the life of hardware.

I just hate to imagine like half of the world’s (or US’s) engineering talent quitting, spending ten years afraid, or wrongly convinced of some ‘inevitable’ market outcome. Feels like it will be bad for people’s personal lives, and bad for progress simultaneously.

dannyw2 days ago
Yeah, subscriptions used to be extraordinarily generous. I miss those days, but the reinvigoration of open weight models is super exciting.

I'm still playing with the new Qwen3.6 35B and impressed, now DeepSeek v4 drops; with both base and instruction-tuned weights? There goes my weekend :P

Flavius2 days ago
It's because investors in OpenAI/Anthropic want to get their money back in 10 months, not in 10 years.
casey22 days ago
It's the decades of performance doesn't matter SV/web culture. I'd be surprised if over 1% of OpenAI/Anthropic staff know how any non-toy computer system works.
sekai2 days ago
> I’d like somebody to explain to me how the endless comments of "bleeding edge labs are subsidizing the inference at an insane rate" make sense in light of a humongous model like v4 pro being $4 per 1M. I’d bet even the subscriptions are profitable, much less the API prices.

One answer - Chinese Communist Party. They are being subsidized by the state.

lbreakjai2 days ago
When China does it it's communism. When companies in the west get massive tax cuts, rebates, incentives and subsidies, that's just supporting the captains of industry.
jari_mustonen2 days ago
Open source as it gets in this space, top-notch developer documentation, and insanely low prices, while delivering frontier model capabilities. So basically, this is from hackers to hackers. Loving it!

Also, note that there's zero CUDA dependency. It runs entirely on Huawei chips. In other words, the Chinese ecosystem has delivered a complete AI stack. Like it or not, that's big news. But what's there not to like when monopolies break down?

nabakin1 day ago
> Also, note that there's zero CUDA dependency. It runs entirely on Huawei chips.

That is a huge claim to make with no evidence.

I researched what you said, and I have found no statement to that effect in their paper[0], on huggingface[1], twitter[2], WeChat[3], or in their news release[4].

They only mention as a footnote in only the Chinese version of their news release that they plan to reduce inference costs with the Ascend 950 supernode when it releases[5]. The only mention of Huawei in their paper is that they validated a technique to lower interconnect bandwidth on Ascend NPUs and Nvidia GPUs[6].

[0] https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main...

[1] https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro

[2] https://xcancel.com/deepseek_ai/status/2047516922263285776

[3] https://mp.weixin.qq.com/s/8bxXqS2R8Fx5-1TLDBiEDg

[4] https://api-docs.deepseek.com/news/news260424

[5] https://api-docs.deepseek.com/zh-cn/img/v4-price.png

[6] Page 16

glenstein1 day ago
Comments like this are why I go to the comments! I never would have thought to check.

And while I'm here I want to note that I feel there's a big misunderstanding of what is and isn't demonstrated by DeepSeek. So far as I can tell the major (and important!) innovation is reproducing near-frontier level capabilities at a fraction of the cost, but it may be the case that iterating forward at the frontier is the costly thing and is a cost borne by Western companies and that nuance seems to get lost with DeepSeek. Which is not to say that as a matter of principle that non Western companies aren't sometimes capable of jumping into the lead (Kimi has been super impressive) but if GPT/Claude/etc "only" lead at the frontier with more expensive models, that's still a moat.

kybernetikos1 day ago
If you can get something almost as capable for a fiftieth of the price, in most cases you'll do that. You might still send a few tokens to the more expensive option for the exceptional, difficult cases, but that's maybe 10% of the tokens at most. I don't see how it'll be possible to keep spending what anthropic, openai, google etc are spending if they're only going to see the trickiest 10% of tokens.
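The routing economics in the comment above can be sketched with back-of-envelope arithmetic. This is purely illustrative: the 50x price ratio echoes the comment, but the dollar figures and the 10% premium share are made-up assumptions, not real list prices.

```python
# Hypothetical blended-cost sketch for a router that sends most tokens to a
# cheap near-frontier model and only the hardest cases to a premium one.
# All prices here are illustrative assumptions, not real list prices.

def blended_cost_per_mtok(cheap_price, premium_price, premium_share):
    """Average cost per 1M tokens when `premium_share` of tokens go to the
    premium model and the remainder go to the cheap one."""
    return premium_share * premium_price + (1 - premium_share) * cheap_price

# Assumed prices: the premium model costs 50x the cheap one.
cheap = 0.40      # $/1M tokens (assumption)
premium = 20.00   # $/1M tokens (assumption, 50x the cheap option)

cost = blended_cost_per_mtok(cheap, premium, premium_share=0.10)
print(f"blended: ${cost:.2f} per 1M tokens")  # 0.1*20 + 0.9*0.40 = $2.36
```

Under these assumed numbers the blended cost lands far closer to the cheap model's price than the premium one's, which is the commenter's point: the expensive lab only captures revenue on the hardest slice of traffic.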
eckrabout 3 hours ago
I don't think this is private knowledge, guessing from when and how I was told, so I feel comfortable sharing it. When I talked to some Huawei representatives, I was told DeepSeek V4 was trained entirely on Huawei chips. It's up to you whether you believe it or not, and while I see the incentives in faking this news, the blowback if it were untrue would be so massive that I don't think their representatives would make these claims at large venues without believing them to be true.
Scipio_Afri1 day ago
Thank you for this due diligence, I was just reading through the technical report and couldn’t find any references to the software stack or hardware mentioning Huawei either and came back here wondering about this comment that I had read earlier.
jari_mustonen1 day ago
Here's a note about running entirely on Huawei chips:

https://finance.yahoo.com/sectors/technology/articles/deepse...

tadfisher1 day ago
> DeepSeek indicated that current service capacity for the V4 Pro series is constrained by a computing crunch, though pricing could fall after new clusters powered by Huawei's Ascend 950 chips come online in the second half of the year.

Only mention of Huawei in that article (as of now).

selectodude1 day ago
Did you read any part of the link you posted? Huawei is mentioned once and not in the context of the model being trained or currently running on Huawei chips.
chvid1 day ago
Not long ago the story was this:

DeepSeek’s next AI model delayed by attempt to use Chinese chips

https://www.ft.com/content/eb984646-6320-4bfe-a78d-a1da2274b...

czk1 day ago
They mention it uses MXFP4 quantization, which is a Blackwell capability, but it looks like this is also supported by the Ascend 950 series, according to marketing material.
kappi1 day ago
DeepSeek is planning to use Huawei extensively for inference

“Due to constraints in high-end compute capacity, the current service capacity for Pro is very limited. After the 950 supernodes are launched at scale in the second half of this year, the price of Pro is expected to be reduced significantly.”

https://x.com/jukan05/status/2047516566149816627

nabakin1 day ago
Yes, that's the footnote from citation [5].
nsoonhui1 day ago
I said the same thing as you and I got summarily downvoted (https://news.ycombinator.com/item?id=47888227).

That HN is quick to upvote an unsubstantiated comment (the grandparent one, because it aligns with an anti-US bias?) and downvote a fact-finding one doesn't bode well for the community as a whole. I have seen enough of how political ideology colors everything in my home country (Malaysia), where the decline is palpable, and I didn't expect to find the same thing here. We are supposed to be dispassionate and rational, right?

Render to Jesus what's due to him, ditto for Caesar.

nabakin1 day ago
Probably because you said you used DeepSeek. People don't want to see AI in the comments and don't trust AI responses.
dzonga2 days ago
Jensen Huang said this in his recent interview: China has the best and most engineers, it has the chip-making ability, and it's a good thing they want to build on an Nvidia stack, but if you push them they will build on an all-Chinese stack. The interviewer, though, was being a numbskull who kept parroting the propaganda of Western tech supremacy.
zdragnar1 day ago
They would have moved to their own stack regardless. They've got the people and resources for it, and they've witnessed the fallout of globalization and experienced dependency on semi-hostile political powers enough to know that it's the smart move.

It's also more or less the same move that they've been using pretty much since their WTO entry: take on foreign manufacturing, copy the products, sell knockoffs as their own, and build new products on top of that knowledge.

arcticfox2 days ago
Referring to the Dwarkesh interview clearly.

Jensen came across as incredibly defensive and intentionally close-minded, shows that even billionaires suffer from "a man can't understand something if his paycheck depends on him not understanding it."

Your assertion is silly: did Tesla selling electric cars into China stop them from developing their own industry? They were going to develop their domestic industry regardless.

We simply don't know the counterfactual, if they had unlimited access to Nvidia chips, how far ahead would their models be?

awongh2 days ago
I thought Jensen’s comparison to Huawei’s cell phone infrastructure (towers and networking) was interesting: shutting them out of a market was one of the causes of their current position in it. It made them more dominant in the end.
ionelaipatioaei2 days ago
"close-minded" are the stupid people that unironically believe in the EA crap
solenoid09371 day ago
> but if you push them they will build on an all Chinese stack

That's alright. It delays them at least.

throwaw121 day ago
Sure, then hopefully makes them stronger
ifwinterco2 days ago
As a Brit I'm here for it to be honest, I'm tired of America with everything that's going on.

China is not perfect but a bit of competition is healthy and needed

chrsw2 days ago
I'm American. If the choice is between the current US direction or China, then no, I don't think the word "healthy" should be anywhere near this discussion.
eejjs2 days ago
I’m also a Brit and agree 100%.

We need to accept that being too close to America is harming us and start funding projects to protect our assets e.g talent leaking out to American entities.

nipponese2 days ago
It’s a shame your country couldn’t get back its technical edge.
falkenstein2 days ago
America is a continent. Let's take back our vocabulary (fellow European here). The little orange man showed very well what I mean when he started giving names to the Gulf of Mexico.
0xDEAFBEAD2 days ago
"In English, North America is its own continent as is South America. The two can be collectively labeled the Americas or the Western hemisphere. Canadians frequently refer to themselves as North Americans and never as Americans. To insist this change is to demand the entire world’s lingua franca redefine words and thereby cause mass confusion for its speakers simply because doing so would be consistent with an arbitrary definition found in a foreign language."

https://scrupulouspessimism.substack.com/p/america-means-the...

cg52801 day ago
It's also a country. Not sure what insisting we change our demonym accomplishes.
hsiudh2 days ago
"not perfect" is a _very_ big simplification of what China is though
rglullis2 days ago
Isn't that the same to every major superpower?
IsTom2 days ago
You can say the same about the US
hunter672 days ago
they compare it to fascist USA though
hart_russell1 day ago
Americans are also tired with what’s going on.
lifeisstillgood2 days ago
As a different Brit I do not accept such moral relativism.

China’s governments actions are on a completely different level - for example:

“””

Since 2014, the government of the People's Republic of China has committed a series of ongoing human rights abuses against Uyghurs and other Turkic Muslim minorities in Xinjiang which has often been characterized as persecution or as genocide.

“”” https://en.wikipedia.org/wiki/Persecution_of_Uyghurs_in_Chin...

https://www.amnesty.org/en/location/asia-and-the-pacific/eas...

Yes, Trump is clearly trying totalitarianism in America, but it is orders of magnitude different from what is happening in China.

amunozo2 days ago
Why do we ignore all the human rights abuses the US performs abroad? Iraq, Afghanistan, now Iran, Gaza and Lebanon through Israel, support for Saudi Arabia (which would not exist without the US), El Salvador... And inside, it's also horrible with its treatment of immigrants.

That should be at least comparable (if not worse) than what China is doing.

phatfish2 days ago
The US supports the genocide in Gaza, and it supports the bombing of Lebanon. The US itself has now started (another) war and bombed Iran.

China is repressing the Uyghurs and threatening Taiwan. I don't agree with these actions, but is it really "orders of magnitude" worse than the destruction the US facilitates in the Middle East?

With Trump they are now openly hostile to European democracies, and ICE is doing its best at repression within the US.

tw19842 days ago
It is just shocking to hear such stuff from someone in the UK.
cedws2 days ago
There’s little to no evidence of such “genocide”, but I can go on YouTube to watch videos of the US bombing civilians in the Middle East.
timmmk2 days ago
Fellow countryman here. I came here to say the same thing
jurgenburgen2 days ago
I don’t know if we’re ahead of the curve but that tired feeling has started turning into hate here in the EU. I guess being threatened with invasion does that to you.

The next decade is going to look very different with America Alone.

koe1232 days ago
I grew up in the states when I was younger, always feeling some closeness to Americans even after I moved back to Europe.

With all that goes on, it has changed. Recently I sat on a plane near some Americans discussing their holidays here, and I noticed I felt contempt: sitting there with insane privilege as their government torches the world.

Individuals remain individuals, and one really ought not to be prejudiced. However, the lack of resistance I see in the "land of the free" as their "democratic" institutions collapse just makes me believe they never cared at all. In France, cars are torched if the pension age is raised. In America, the rise of fascism apparently doesn't matter to them.

ejpir1 day ago
Yup. I was taught in school in Europe to admire Americans and their might. Only in the last few years have I come to understand they are maybe one of the worst Western countries there is. Countless wars, even under Obama, so it's not a president X or Y thing. It's culture. I would go as far as to say I'd rather visit Russia than America at this point. America is great at hiding its true colors, and we've been properly brainwashed in the West by this.
nailer2 days ago
As someone who lived in Britain for 15 years until 2024, I'm not sure a nation with a GDP per capita lower than Poland's, that is now poorer than every state in America, and with a gang rape epidemic the government tried to suppress investigating should really concern itself with how other countries are run.
jbkkd1 day ago
> GDP per capita lower than Poland

Not true. While Poland's GDP per capita has been on the rise, it is still nowhere near the UK's.

https://en.wikipedia.org/wiki/List_of_countries_by_GDP_%28no...

ifwinterco1 day ago
I'm not arguing that the UK is a particularly well run country, I just provided the context that I am British because it felt relevant
ogogmad2 days ago
> GDP per capita lower than Poland

> now poorer than every state in America

You've confused the mean with the median. GDP per capita is not a measure of how well-off the people in a country are.

American states have a lot more income inequality than the UK does, which (due to positive "nonparametric skewness", I think) pulls their GDP per capita upwards.

stronglikedan1 day ago
> I'm tired of America with everything that's going on.

Yeah, me too. All that pesky saving the world stuff that we do on the regular is so exhausting sometimes.

fn-mote1 day ago
“Saving the world” recently has meant being involved in wars in Afghanistan, Iraq, and now Iran.

None of those have brought me a feeling of being part of saving someone.

geonabout 18 hours ago
Who has been saved? The US has been doing much more harm than good.
jug1 day ago
Prices are also expected to drop significantly in H2 as they move to Huawei Ascend 950 super nodes.

Yes, even compared to this low price point.

As before, the headline news with DeepSeek isn't the benchmarks, but that they're competitive there while being gut-churningly cheap for the Western AI industry.

TrackerFF2 days ago
Let's see how long it takes before the big US AI companies start lobbying to outright ban use of Chinese AI, even the open source / local models. For "national security" reasons, of course.
chronc63932 days ago
> Let's see how long it takes before the big US AI companies start lobbying to outright ban use of Chinese AI, even the open source / local models. For "national security" reasons, of course.

They already do on EVs.

wookmaster2 days ago
This is already happening. My company just went through this
Scroll_Sweabout 23 hours ago
Every company should lock down AI and whitelist only allowed tools.

We are "only" allowed Claude and MS Copilot, for security and cost reasons.

barnabee2 days ago
Hopefully the US’ self imposed isolation will mean that when they do, they aren’t able to force the rest of the world to follow suit.
zrn900about 20 hours ago
They already did - State Dept. launched global campaign against Deepseek.
resters1 day ago
Just looked into buying some Chinese GPUs and it turns out it's not easy or even legal! Big WTF moment.
OsrsNeedsf2P1 day ago
Same thing for Chinese EVs. America can't compete in the free market anymore
AlanYx1 day ago
In the US, yes, but Huawei has been gaining ground selling its SuperPod/Ascend turnkey solutions internationally, with some major recent wins in Thailand, Brazil, Egypt and Morocco.
khalic2 days ago
Open weight and open source are not the same
SquareWheel2 days ago
This is a pretty banal comment at this point. Open source is the term used in the LLM community. It's common and understood. Nobody is going to release petabytes of copyrighted training data, so the distinction between open source vs weights is a rather pointless one.
8note1 day ago
It's still a pointed one.

"Open source" keeps being redefined by people with wealth and power to restrict our computing rights.

Eventually it's just gonna be "proprietary Microsoft code that runs on Microsoft servers, but you can see a portion of the results".

khalic1 day ago
Tell this to the Allen project, Apertus Project, SmoLLM, etc, etc, etc
stefan_2 days ago
First you steal all the code, then you want to redefine the term? Is it never enough with you AI guys? Where's the humility, where's the good?
shiftingleft1 day ago
Is it really the full pipeline running on Huawei hardware? That is training and inference?

The report only talks about validating the "fine-grained EP scheme" on Huawei hardware.

digitaltrees1 day ago
I am all for monopoly breakdown. But there is an argument that this is anticompetitive strategy designed to undercut the commercial viability of the other labs. In free trade negotiations this is called “dumping”: selling a product below cost at a high volume to gain market share by driving competition out of the market and then raising prices when you’ve outlasted them.
otagekki1 day ago
Unfortunately, Uber and the like have been doing this, and nobody batted an eye.
digitaltrees1 day ago
I am a critic.
ibic2 days ago
"Open Source" is the ultimate romance understood by software engineers.
laurentiurad2 days ago
not a full AI stack. Training still runs on NVIDIA chips.
sudo_cowsay2 days ago
I sometimes wonder if there are any security risks with using Chinese LLMs. Is there?
dalemhurley2 days ago
Theoretically yes. It is entirely possible to poison the training data for a supply chain attack against vibe coders. The trick would be to make it extremely specific for a high value target so it is not picked up by a wide range of people. You could also target a specific open source project that is used by another widely used product.

However, there are so many factors involved beyond your control that it would not be a viable option compared to other possible security attacks.

2ndorderthought2 days ago
I believe this is possible but unlikely. I don't think a Chinese company trying to break down the US's stronghold in this field would do this in the short term. It is in their best interest to be cheaper, better, easier, and more trustworthy until the competition looks silly.

It's like suggesting BYD has a high likelihood of turning their cars into weapons or something. It's not in the company's or their country's interest to do that.

Sure, it could happen, but I bet it would only happen in a targeted way. Why risk all credibility right now and engage in cyber warfare?

mazurnification2 days ago
But propaganda or unethical marketing - why not? (That is, a bias toward pointing to certain providers.)
wallst072 days ago
Or something more obvious, like TikTok.

Meaning: TikTok in the US is complete garbage for kids, almost like a virus, whereas in China it's more educational.

_blk2 days ago
It would be interesting to hook up a much simpler LLM as a fact checker to see when errors are introduced.

If I had to place a hidden target, it'd probably be around RNGs or publicly exposed services.

rapind2 days ago
All China (or anyone) has to do is deliver a close-to-equal product at a much cheaper price and make it scalable and usable... which is what they're doing. It doesn't have to be malicious at all. Just a good product at a good price. The US is basically in a recession that's hiding behind insane AI investments.
oliwarner2 days ago
If there is, couldn't they exist in any model?

I don't mean that flippantly. These things are dumped in the wild and used on common, (largely) open source execution chains. If you find a software exploit, it's going to affect your own population too.

Wetware exploits are a bit harder to track. I'd assume there are plenty of biases based on the training material, but who knows whether these models have an MKUltra training programme integrated into them?

Hamuko2 days ago
There must be. The executives at my company wouldn't have banned them all for no reason after all.
cassianoleal2 days ago
What about LLMs from other origins? What makes them less risky?
rhubarbtree2 days ago
Backdooring software at scale.

Spearphishing.

Building reliance and exploiting it, through state subsidies, dumping, and market manipulation.

Handicapping provision to the west for competitive advantage.

2ndorderthought2 days ago
Do you think doing any of those things within the next year does more to advance China as a superpower than, say, dethroning all of the US hype around LLMs?

Tech CEOs are going around talking about how they will rule over employees, who will be unable to work in the future except in exchange for intelligence tokens. What if China commoditizes that without spending nearly as many resources? It kind of makes the trillions of dollars invested in the US a literal joke.

gmerc2 days ago
Anyone can do that via the scrapers. The model developers actually have something to lose, though.
seniorThrowaway1 day ago
Are you implying only one country does these things?
surgical_fire2 days ago
I sometimes wonder if there are any security risks with using LLMs from the US.
eucyclos2 days ago
From my experience, kinda the opposite? It's like Chinese software is... Harder to weaponize or hurt yourself on. Deepseek is definitely censored, but I've never caught it being dishonest in a sneaky way.
SXX1 day ago
If you run local DeepSeek, a quant or a distill answers just fine on this prompt: "What happened on 4 June 1989 on Tiananmen Square?"

Even on my phone, via Edge Gallery, a DeepSeek-to-Qwen 1.5B distill is able to answer it. It messes up the facts a little, but certainly because it's a small model, not because of censorship.

I'm really unsure how it could get less censored than this. The API is obviously much more censored because they operate from China, but that has nothing to do with the model itself.

baal80spam2 days ago
Is this a serious comment? It honestly reads like famous last words.

Of course there are risks.

accountofthaha2 days ago
Does the "zero CUDA dependency" also apply to running it on my own device? I have an AMD card, an older model. I would love to have a small version of this running for coding purposes.

Really nice to see the Chinese are competing this strongly with the rest of the world. Competition is always nice for the end-consumer.

adrian_b2 days ago
The model is open weights, so you can download it from the link given at the top.

Then you can run it using some inference backend, e.g. llama.cpp, on any hardware supported by it.

However, this is a big model so even if you quantize it you need a lot of memory to be able to run it.

The alternative is to run it much more slowly by storing the weights on an SSD. Some results on optimizing inference to work this way have already been published, and I expect it will become more common in the future.

There are cases where running a better model slowly is still preferable to running a model that gives poor results quickly, especially when you use it not conversationally but to do work with agents.
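For a sense of the memory requirement mentioned above, here is the weights-only arithmetic for a model in the ~1.6T-parameter class discussed elsewhere in the thread. This ignores KV cache and activations, and the parameter count is an assumption taken from the thread, not a confirmed spec.

```python
# Rough memory footprint for model weights alone at different quantization
# levels. Ignores KV cache and activations; the 1.6T parameter count is an
# assumption taken from the surrounding discussion.

def weight_gb(n_params, bits_per_weight):
    """Gigabytes (decimal GB) needed just to hold the weights."""
    return n_params * bits_per_weight / 8 / 1e9

n = 1.6e12  # ~1.6 trillion parameters (assumed)
for name, bits in [("FP16", 16), ("INT8 / Q8", 8), ("4-bit / Q4", 4)]:
    print(f"{name:>10}: ~{weight_gb(n, bits):,.0f} GB")
# FP16 -> ~3,200 GB, Q8 -> ~1,600 GB, Q4 -> ~800 GB
```

Even at 4-bit quantization the weights alone would be on the order of 800 GB, which is why SSD-backed inference, while slow, is the only realistic option on consumer hardware.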

d3Xt3r2 days ago
> Also, note that there's zero CUDA dependency.

So does this mean I can run this on AMD? And on a consumer 9000 series card?

HarHarVeryFunny2 days ago
If you don't have the source code then it makes no difference. If you have the weights and are running some model via llama.cpp, then you are using whatever API llama.cpp is using, not the API that was used to train the model or that anyone else may be using to serve it.
randomgermanguy2 days ago
If you found a rare 9000 card with 200+ GB of VRAM, sure
Eisenstein2 days ago
If the card supports vulkan and the model has gguf weights. llamacpp has excellent vulkan support that is being actively developed and is not that far behind CUDA where speed is concerned.

* https://github.com/ggml-org/llama.cpp/releases

frankdenbow2 days ago
Jensen was saying this in that interview last week and the interviewer dismissed it.
melenaboija2 days ago
The funniest thing is how Americans have been fooled with this stuff.

This version of AI is mostly taking a public paper from 2017, investing in GPUs, and feeding it as much data as possible. So with a few computer scientists, no respect for intellectual property, and tons of money to burn, you have all the ingredients to create this technology.

Sam Altman and friends did it, as did the Chinese. The difference is that the Americans have been hyping it up to the extreme, with all these dramatic scenarios about what would happen if someone else got their hands on it.

The Chinese made it public, among other things to show how fragile this is as a business and as a large part of the US stock market

wookmaster2 days ago
The response from US corporations has been banning Chinese models claiming they’re spying or something.
shimman1 day ago
Yes, it's been widely known that US corporations cannot compete fairly but require corporate welfare or the US government to enforce military might over competitors.
pb71 day ago
>mostly taking a public paper from 2017

I love the implication that this paper just dropped out of thin air and not decades of private AI research funded by a US company.

>The Chinese made it public, among other things to show how fragile this is as a business

The Chinese distill US models, that's why they keep trailing close but never exceeding. It's easy to make things public when you didn't take on any of the cost of developing the technology. Stealing US IP and selling cheap copies has been China's MO for decades now.

wener2 days ago
As a Chinese, I feel tired. It's like the Cold War: whatever it takes to stay competitive in every aspect, it's just another win for the country and the corp.
scronkfinkle2 days ago
> Also, note that there's zero CUDA dependency

Where did you read this? From what I read in the paper, it appears to explicitly state that they used NVIDIA GPUs and their MegaMOE code, which is written in CUDA.

segmondy2 days ago
My guess is Chinese govt is going to mandate that labs switch all future training and inference to Huawei. DeepSeek has shown it's possible. Once they are done, the rest of the world is going to be buying Huawei! I for one can't wait for a cheap Huawei GPU!
kitd2 days ago
I can't find any info on what exactly is open sourced.

And in any case what does open source actually mean for an llm? It's not like you can look inside it to see what it's doing.

adrian_b2 days ago
The model is not "open source", but it is an open weights model.

You can download it from the link given here at the top and you can run it on your own hardware, with whichever open-source harness you prefer, without having to worry about token cost or about subscription limits or about any future degradation in performance that you cannot control.

The recent history has demonstrated that such risks are very significant.

Being open weights is important for anyone who wants to use an LLM. Being open source is important only for a subset of those, who have the will, the knowledge and the means to train a model from its training data.

Having access to the training data used by a model would be very nice, but the reality is that for a normal LLM user it is very beneficial to use an open-weights model with an open-source harness, while it would be much harder to exploit the advantage of having access to all the information about how the LLM was created.

gommm2 days ago
For me, open source means that the entire training data is open sourced, as well as the code used for training it; otherwise it's open weight. You can run it where you like, but it's a black box. Nomic's models are a good example of open source.
adammarples2 days ago
Yes the weights are basically compiled code, compiled from the source data and the training code.
SkyBelow1 day ago
Even with all training data provided, won't it still be a black box? Unless one trains it exactly the same, in the exact same order for each piece of data, potentially requiring the exact same hardware with specific optimizations disabled due to race conditions, etc., the final weights will be different, and so knowing if the original weights actually contain anything extra still leaves any released weights as a black box, no? There isn't an equivalent of reproducible builds for LLM weights, even if all of this was provided, right?
verdverm2 days ago
Look up Olmo 3, where they have open weights, checkpoints, training data, and training process.

AllenAI is the most fully open AI lab I know of.

nailer2 days ago
It's also not fake open source like Meta's models: the weights are actually under a real open source license (MIT), see https://huggingface.co/deepseek-ai/DeepSeek-R1-0528
slekker2 days ago
But remember to not ask about Taiwan!
tigrezno2 days ago
you talk like there isn't censorship in American AIs, like on Israel topics.
unclejuan2 days ago
To be fair I prefer the Chinese models censorship (yes, seriously) because if you ask certain topics they just don't answer instead of giving skewed answers.
swingboy2 days ago
Ask a Chinese model about Taiwan, get denied. Ask an American model about Israel, get citizenship revoked and deported.
spaceman_20202 days ago
I can't wait for Taiwan to peacefully reunify with the mainland so the west with its constant war waging won't even have this talking point
wallst072 days ago
Are you Taiwanese? If not, your statement is a slap in the face to those citizens.
spiderfarmer2 days ago
Just ask it for a summary of the USA’s role in Iran, Gaza, Lebanon and its recent threats against Panama, Cuba and Greenland! It might be able to keep track.
libertine2 days ago
Are you implying that western models were manipulated to hide and distort those events, like they do with the Tiananmen Square event, and Taiwan?
teiferer2 days ago
Does all this insane behavior from the US justify the Chinese censorship?
eunos2 days ago
> China asks other countries not to meddle with internal separatism

> They also don't support separatism in my country

Understandable.

Lionga2 days ago
Quite a bit better than being made to bomb little girls' schools in Iran.
Markoff2 days ago
pretty sure you can ask whatever you want and it will tell you the official stance, agreed to by almost all countries in the world, that Taiwan is part of China, as recognized by your own country (I don't even know where you're from, but there's like a 98% chance I'm right)
nsoonhui2 days ago
Sorry, but exactly where did you get the idea that DS V4 runs entirely on Huawei?

I asked DS itself and it denied this. It says: 'Nvidia chips are absolutely used for DeepSeek V4. The reality is a pragmatic "both-and" strategy, not an "either-or."'

And based on the DS V4 technical report (https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main...), it is mentioned that:

  We validated the fine-grained EP scheme on both NVIDIA GPUs and HUAWEI Ascend NPUs platforms. Compared against strong non-fused baselines, it achieves 1.50 ~ 1.73× speedup for general inference workloads, and up to 1.96× for latency-sensitive scenarios such as RL rollouts and high-speed agent serving.
(In all honesty I relied on DS to give me the above, so I haven't vetted the information in full.)

It mentions that Nvidia is still used. It doesn't mention that Huawei chips are used in production, only in testing and validation.

taytus2 days ago
>I asked DS itself and it denied this

Bro, seriously?

fblp2 days ago
There's something heartwarming about the developer docs being released before the flashy press release.
taurath2 days ago
Their audience is people who build stuff; tech's audience is enterprise CEOs and politicians, and anyone else happy to hype up all the questionably timed releases and warnings of danger, white-collar irrelevance, or promises of utopian paradise right before a funding round.
arrty882 days ago
now that we can use AI to write the docs, test the docs, and proofread the docs, it really isn't that much of a feat, right?
SV_BubbleTimeabout 18 hours ago
If those docs were written by Deepseek, it’s also a pretty positive review of the model.
onchainintel2 days ago
Insert obligatory "this is the way" Mando scene. Indeed!
necovek2 days ago
Where's the training data and training scripts since you are calling this open source?

Edit: it seems "open source" was edited out of the parent comment.

b65e8bee43c2ed02 days ago
doesn't it get tiring after a while? using the same (perceived) gotcha, over and over again, for three years now?

no one is ever going to release their training data because it contains every copyrighted work in existence. everyone, even the hecking-wholesome safety-first Anthropic, is using copyrighted data without permission to train their models. there you go.

necovek2 days ago
There is an easy fix already in widespread use: "open weights".

It is very much a valuable thing already; no need to taint it with a false promise.

Though I disagree about it not being used if it were indeed open source: I might not do it inside my home lab today, but at least Qwen and DeepSeek would use and build on what e.g. Facebook was doing with Llama, and they might be pushing the open-weights model frontier forward faster.

Tepix2 days ago
Nvidia did with Nemo.
fragmede2 days ago
it's not a gotcha but people using words in ways others don't like.
bl4ckneon2 days ago
Aww yes, let me push a couple petabytes to my git repo for everyone to download...
necovek2 days ago
An easier thing would be to say "open weights", yes.
woctordho2 days ago
They are exactly open source. The training data is the internet. Don't say it's on the internet. It IS the internet.

The training scripts are in Megatron and vLLM.

0-_-02 days ago
Weights are the source, training data is the compiler.
injidup2 days ago
You got it the wrong way round. It's more akin to:

1. Training data is the source.
2. Training is compilation/compression.
3. Weights are the compiled output, akin to optimized assembly.

However it's an imperfect analogy on so many levels. Nitpick away.

sho2 days ago
So, this is the version that's able to serve inference from Huawei chips, although it was still trained on Nvidia. So unless I'm very much mistaken, this is the biggest and best model yet served on (sort of) readily-available Chinese-native tech. Performance and stability will be interesting to see; OpenRouter is currently showing about 1.12s latency and 30 tps, which isn't wonderful, but it's day one after all.

For reference, the Huawei Ascend 950 that this thing runs on is supposed to be roughly comparable to Nvidia's H100 from 2022. In other words, things are heating up in the GPU war!

alpineman2 days ago
Can't see how Nvidia justifies its valuation/forward P/E ratio with these developments, and with on-device also becoming viable for 98% of people's needs when it comes to AI
aurareturn2 days ago
On-device is incredibly far away from being viable. A $20 ChatGPT subscription beats the hell out of the 8B model that a $1,000 computer can run.

Nvidia's forward PE ratio is only 20 for 2026. That's much lower than companies like Walmart and Costco. It's also growing nearly 100% YoY and has a $1 trillion backlog.

I think Nvidia is cheap.

vibe422 days ago
I run both MoE and dense models on laptops.

One set of models runs on 8GB VRAM / 16GB RAM and another set runs on 24GB VRAM / 64GB RAM. They are very useful for easy and moderately complex code, respectively.

The latest open, small models are incredibly useful even at smaller sizes when configured properly (quant size, sampling params, careful use of context etc).

2ndorderthought2 days ago
8b models can run on laptops. Of course a 1.8T model is more capable, but for a lot of tasks it really isn't 1000x
midwain2 days ago
This is an assessment of the moment. When rate of AI data center construction slows down, then P/E will start to grow. Or are we saying that the pace will only grow forever? There are already signs of a slowdown in construction.
littlestymaar2 days ago
> On-device is incredibly far away from being viable. A $20 ChatGPT subscription beats the hell out of the 8B model that a $1,000 computer can run.

That's a very strange comment. Why would anyone run a dense model on a low-end computer? An 8B model is only going to make sense if you have a dGPU. And a Qwen3.6 or Gemma4 MoE isn't going to be "beaten the hell out of" for most tasks, especially if you use tools.

Finally, over the lifetime of your computer, your ChatGPT subscription is going to cost more than the cost of your reference computer! So the real question should be whether you're better off with a $1000 computer and a ChatGPT subscription or with a $2000 computer (assuming a conservative lifetime of 4 years for the computer).

My Strix Halo desktop (which I paid ~1700€ before OpenAI derailed the RAM market) paired with Qwen3.5 is a close replacement for a $200/month subscription, so the cost/benefit ratio is strongly in favor of the local model in my use case.

The complexity of following model releases and installing things needed for self-hosting is a valid argument against local models, but it's absolutely not the same thing as saying that local models are too bad to use (which is complete BS).

alpineman2 days ago
I think you overestimate what most people are doing with AI. A 2B model can give out relationship advice and tell you how long to boil an egg.
dannyw2 days ago
I do think Nvidia isn't that badly priced; they still have the dominance in training and the proven execution

Biggest risk I see is Nvidia having delays / bad luck with R&D / meh generations for long enough to depress their growth projections; and then everything gets revalued.

npodbielski2 days ago
Great! Can't wait to buy a decent GPU for inference for <$1k
gbnwl2 days ago
I’m deeply interested and invested in the field but I could really use a support group for people burnt out from trying to keep up with everything. I feel like we’ve already long since passed the point where we need AI to help us keep up with advancements in AI.
satvikpendem2 days ago
Don't keep up. Much like with news, you'll know when you need to know, because someone else will tell you first.
vessenes2 days ago
This is only good advice if you don’t have the need to understand what’s happening on the edge of the frontier. If you do, then you’ll lose on compounding the knowledge from staying engaged with the major developments.
satvikpendem2 days ago
Not all developments are equal. Many are experimental branches of testing things out that usually get merged back into the core, so to speak. For example, I knew someone who was full into building their own harness and implementing the Ralph loop and various other things, spending a lot of time on it and now, guess what? All of that is in Claude Code or another harness and I didn't have to spend any amount of time on it because ultimately they're implementation details.

It's like ricing your Linux distro, sure it's fun to spend that time but don't make the mistake of thinking it's productive, it's just another form of procrastination (or perhaps a hobby to put it more charitably).

roughly1 day ago
This one’s been particularly hard to sit out because the executive and managerial class are absolutely mainlining this stuff and pushing it hard on the rest of the organization, and so whether or not I want to keep up, I need to, because my job is to actually make stuff work and this stuff is a borderline existential risk to the quality of the systems I’m responsible for and rely on.
hnfong1 day ago
Thus, in the situation you described, "someone else will tell you first" is your boss.
wordpad2 days ago
The players barely ever change. People don't have problems following sports; you shouldn't struggle so much with this once you accept that the top spot changes.
gbnwl2 days ago
I didn't express this well but my interest isn't "who is in the top spot", and is more _why and _how various labs get the results they do. This is also magnified by the fact that I'm not only interested in hosted providers of inference but local models as well. What's your take on the best model to run for coding on 24GB of VRAM locally after the last few weeks of releases? Which harness do you prefer? What quants do you think are best? To use your sports metaphor it's more than following the national leagues but also following college and even high school leagues as well. And the real interest isn't even who's doing well but WHY, at each level.
yorwba2 days ago
The technical report discussing the why and how is here: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main...
renticulous2 days ago
Follow the AI newsletters. They bundle the news along with their Op-Ed and summarize it better.
ehnto2 days ago
It is funny seeing people ping pong between Anthropic and ChatGPT, with similar rhetoric in both directions.

At this point I would just pick the one who's "ethics" and user experience you prefer. The difference in performance between these releases has had no impact on the meaningful work one can do with them, unless perhaps they are on the fringes in some domain.

Personally I am trying out the open models cloud hosted, since I am not interested in being rug pulled by the big two providers. They have come a long way, and for all the work I actually trust to an LLM they seem to be sufficient.

dannyw2 days ago
The financial projections that a big part of their valuation and investor story is built on involve actually making money, and lots of it, at some point. That money has to come from somewhere.
DiscourseFan2 days ago
I find ChatGPT annoying mostly
notatoad1 day ago
I’m very satisfied with being three months behind everything in AI. That’s a level that’s useful: the overhyped nonsense gets found out before I need to care, and it’s easy enough to keep up with.
vrganj2 days ago
It honestly has all kinda felt like more of the same ever since maybe GPT4?

New model comes out, has some nice benchmarks, but the subjective experience of actually using it stays the same. Nothing's really blown my mind since.

Feels like the field has stagnated to a point where only the enthusiasts care.

ifwinterco2 days ago
For coding Opus 4.5 in q3 2025 was still the best model I've used.

Since then it's just been a cycle of the old model being progressively lobotomised and a "new" one coming out that if you're lucky might be as good as the OG Opus 4.5 for a couple of weeks.

Subjective but as far as I can tell no progress in almost a year, which is a lifetime in 2022-25 LLM timelines

_air1 day ago
Opus 4.5 was released on Nov 24 last year. It’s only been 5 months!
dannyw2 days ago
Another annoyance (for more API use) is summarized/hidden reasoning traces. It makes prompt debugging and optimization much harder, since you literally don't have much visibility into the real thinking process.
hnfong1 day ago
I don't trust the benchmarks either, so I maintained a set of benchmarks myself. I'm mostly interested in local models, and for the past 2 years they have steadily gotten better.

Can't argue with subjective experience, but if there were some tasks that you thought LLMs can't do two years ago, maybe try again today. You might be surprised.

trueno2 days ago
holy shit im right there with you
latentframeabout 1 hour ago
The 1.6T number is nice and eye-catching, but what matters most is how few parameters are active in practice; that's where most of the efficiency comes from.
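A rough sketch of why active parameters dominate per-token cost (the 40B active-parameter figure below is purely a hypothetical assumption for illustration, not a published spec for this model):

```python
# Per-token decode compute for a transformer is roughly 2 * active_params
# FLOPs. For a MoE model only the routed experts are active, so the compute
# gap versus a dense model of the same total size is total / active.
# Both numbers here are illustrative assumptions.

total_params  = 1.6e12   # headline total parameter count from the thread
active_params = 40e9     # hypothetical MoE active parameters per token

flops_dense = 2 * total_params    # dense model of the same total size
flops_moe   = 2 * active_params   # MoE model, only active experts run

print(f"compute ratio: {flops_dense / flops_moe:.0f}x")
```

Under those assumptions a same-size dense model would cost 40x more compute per token, which is the efficiency the comment is pointing at (memory footprint, of course, still scales with the total count).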
maxloh1 day ago
They published model weights on Hugging Face. Both of them are MIT-licensed.

DeepSeek-V4-Flash: https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash

DeepSeek-V4-Pro: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro

primaprashant2 days ago
While SWE-bench Verified is not a perfect benchmark for coding, AFAIK, this is the first open-weights model that has crossed the threshold of 80% score on this by scoring 80.6%.

Back in Nov 2025, Opus 4.5 (80.9%) was the first proprietary model to do so.

stared2 days ago
SWE-bench Verified is, at this point, contaminated https://openai.com/index/why-we-no-longer-evaluate-swe-bench...

So it is hard to tell how much of a model's gain is due to skill, and how much to overfitting.

yanis_t2 days ago
Already on OpenRouter. The Pro version is $1.74/M input, $3.48/M output, while Flash is $0.14/M input, $0.28/M output.
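Using the prices quoted above, a quick sketch of what a workload would cost on each tier (the token counts are made-up illustrative numbers, not measurements):

```python
# Cost comparison from the OpenRouter prices quoted in the thread,
# expressed in USD per million tokens. Token counts are hypothetical.

prices = {
    "deepseek-v4-pro":   {"input": 1.74, "output": 3.48},
    "deepseek-v4-flash": {"input": 0.14, "output": 0.28},
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Total cost for one workload at the quoted per-million-token rates."""
    p = prices[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1e6

# Example: an agentic session with 2M input tokens and 200k output tokens.
for model in prices:
    print(f"{model}: ${cost_usd(model, 2_000_000, 200_000):.2f}")
```

For that example session, Pro comes out around $4.18 and Flash around $0.34, so the roughly 12x price gap holds across both input and output rates.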
77ko2 days ago
It's on OR, but currently not available on their Anthropic endpoint. OR, if you read this, please enable it there! I am using Kimi-2.6 with Claude Code, which works well, but DeepSeek V4 gives an error:

Hitting `https://openrouter.ai/api/messages` with model=deepseek/deepseek-v4-pro, OR returns an error because their Anthropic-compat translator doesn't cover V4 yet. The Claude CLI dutifully surfaces that error as "model...does not exist".

nl2 days ago
The Pro model is giving 429 Overload errors
XCSme2 days ago
Yup, can't really be used in production atm.
astrod2 days ago
Getting 'Api Error' here :( Every other model is working fine.
poglet2 days ago
Try interacting with it through the website, it will give an error and some explanation on the issue. I had to relax my guardrail settings.
vinhnx2 days ago
The king is back! I vividly remember being amazed and having a deep appreciation reading DeepSeek's reasoning on Chat.DeepSeek.com, even before the DeepSeek moment in January. I can't quite remember the date, but it's the most profound moment I have ever had. After OpenAI's o1, no other model had "reasoning" capability yet, and DeepSeek opened the full trace for us. Seeing DeepSeek's "wait, aha…" moments is something hard to describe. I learned strategy and reasoning skills for myself too. I am always rooting for them.
buenolot2 days ago
Instead of King DeepSeek we got DeepShit Clown
mchusma2 days ago
For comparison on openrouter DeepSeek v4 Flash is slightly cheaper than Gemma 4 31b, more expensive than Gemma 4 26b, but it does support prompt caching, which means for some applications it will be the cheapest. Excited to see how it compares with Gemma 4.
MillionOClock2 days ago
I wonder why there aren't more open weights model with support for prompt caching on OpenRouter.
mzl2 days ago
It is tricky to build good infrastructure for prompt caching.
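A toy sketch of why it's tricky: the provider has to recognize which leading span of a new request matches something already cached, and manage eviction across tenants. A minimal, hypothetical in-memory version of just the lookup bookkeeping (real stacks do this per-block in GPU memory) might look like:

```python
# Toy prompt-prefix cache: maps a hash of a tokenized prefix to a (pretend)
# cached-token count. Real serving infrastructure caches KV blocks on GPU
# with cross-tenant eviction; this only illustrates longest-prefix lookup.
import hashlib

class PrefixCache:
    def __init__(self) -> None:
        self._cache: dict[str, int] = {}  # prefix hash -> cached token count

    @staticmethod
    def _key(tokens: list) -> str:
        return hashlib.sha256(repr(tokens).encode()).hexdigest()

    def store(self, tokens: list) -> None:
        self._cache[self._key(tokens)] = len(tokens)

    def longest_hit(self, tokens: list) -> int:
        """Return how many leading tokens are already cached (0 on a miss)."""
        for n in range(len(tokens), 0, -1):
            if self._key(tokens[:n]) in self._cache:
                return n
        return 0

cache = PrefixCache()
system_prompt = list(range(100))     # pretend-tokenized shared system prompt
cache.store(system_prompt)
request = system_prompt + [7, 8, 9]  # same prefix, new user turn appended
print(cache.longest_hit(request))    # 100 leading tokens can skip prefill
```

Even this toy version scans every prefix length; doing it efficiently at scale, with paged GPU memory and fair eviction, is the hard part the parent comment alludes to.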
jatora2 days ago
It's as simple as telling your Claude Code to implement prompt caching!
sidcool2 days ago
Truly open source coming from China. This is heartwarming. I know of the potential ulterior motives.
b65e8bee43c2ed02 days ago
American companies want a scan of your asshole for the privilege of paying to access their models, and unapologetically admit to storing, analyzing, training on, and freely giving your data to any authorities if requested. Chinese ulteriority is hypothetical, American is blatant.
elefanten2 days ago
It’s not remotely hypothetical you’d have to be living under a rock to believe that. And the fusion with a one-party state government that doesn’t tolerate huge swathes of thoughtspace being freely discussed is completely streamlined, not mediated by any guardrails or accountability.

This “no harm to me” meme about a foreign totalitarian government (with plenty of incentive to run influence ops on foreigners) hoovering your data is just so mind-bogglingly naive.

ben_w2 days ago
As a non-American, everything you wrote other than "one party" applies to the current US regime.

Relatively speaking, DeepSeek is less untrustworthy than Grok.

When I try ChatGPT on current events from the White House it interprets them as strange hypotheticals rather than news, which is probably more a problem with DC than with GPT, but whatever.

oceanplexian2 days ago
> And the fusion with a one-party state government that doesn’t tolerate huge swathes of thoughtspace being freely discussed

That would be a great argument if the American models weren’t so heavily censored.

The Chinese model might dodge a question if I ask it about 1-2 specific Chinese cultural issues but then it also doesn’t moralize me at every turn because I asked it to use a piece of security software.

randomNumber72 days ago
The USA has one of the highest percentages of their population in prison.

Even for minor stuff like being addicted to drugs.

Looks pretty totalitarian to me.

theshackleford2 days ago
> This “no harm to me” meme about a foreign totalitarian government (with plenty of incentive to run influence ops on foreigners) hoovering your data is just so mind-bogglingly naive.

This is why I’ve been urging everyone I know to move away from American based services and providers. It’s slow but honest work.

b65e8bee43c2ed02 days ago
>This “no harm to me” meme about a foreign totalitarian government (with plenty of incentive to run influence ops on foreigners) hoovering your data is just so mind-bogglingly naive.

yes, this is exactly what I'm saying.

danny_codes2 days ago
It’s an open model? So you can run it yourself if you want to
casey22 days ago
Thousands of years with no invasions, hundreds of years with thousands of invasions.

China is a nation built for peace, while western nations are built for war.

michaelt2 days ago
The oppression of people in China like Uyghurs and Hong Kong, the complete lack of free speech, the saber-rattling at neighbours, and the lack of respect for intellectual property are indeed all well documented.

But for folks on the opposite side of the world, the threats are more like "they're selling us electric cars and solar panels too cheaply" and the hypothetical "these super cheap CCTV cameras could be used for remote spying"

t0lo2 days ago
And you're saying Americans aren't banned from criticising their elites?
mwigdahl2 days ago
I, personally, have never been asked for an asshole scan, but I'm interested in providing one if you can point me to a company that's offering.
simplesocieties1 day ago
It's clear the OC was using hyperbole but we're honestly not too far off. Just a few examples:

- Sam Altman & Worldcoin collecting everyone's eyeball scan
- Discord attempting to roll out worldwide age & ID verification
- LinkedIn collecting data on your web browser extensions
- WhatsApp collecting browser data via a local server running on device

93po1 day ago
GoatseAI - the type of open that OpenAI should have been from the start
thesmtsolver22 days ago
As someone with Tibetan friends and as someone from India, Chinese ulterior motives are way more clear.
mordae2 days ago
Same as USA. Happy to see some competition.
Quothling2 days ago
It's a little sad that tech now comes down to geopolitics, but if you're not in the USA then what is the difference? I'm Danish, would I rather give my data to China or to a country which recently threatened the kingdom I live in with military invasion? Ideally I'd give them to Mistral, but in reality we're probably going to continue building multi-model tools to make sure we share our data with everyone equally.
jatora2 days ago
Lol EU pats you on the head

It's sad to see how you have regulated yourselves into a position where Mistral is your only claim.

spaceman_20202 days ago
I don’t care about whatever “ulterior motives” they might have

My country’s per capita income is $2500 a year. We can’t pay perpetual rent to OAI/Anthropic

djyde2 days ago
Same
try-working2 days ago
if you want to understand why labs open source their models: http://try.works/why-chinese-ai-labs-went-open-and-will-rema...
wraptile2 days ago
> Internet comments say that open sourcing is a national strategy, a loss maker subsidized by the government. On the contrary, it is a commercial strategy and the best strategy available in this industry.

This sounds a whole lot like potatoh, potahto. I think the former argument is very much the correct one: China can undercut everyone and win, even at a loss. It happened with solar panels, steel, EVs, seafood; it's a well-tested strategy and it works really well despite the many flavors it comes in.

That being said, a job well done for the wrong reasons is still a job well done, so we should very much welcome these contributions; and maybe it's good to upset western big tech a bit so it remains competitive.

try-working2 days ago
It is not only that Chinese labs can undercut on price. It is that they must. They must give away their models for free by open sourcing them, and they must even give away free inference services for people to try them. That is the point of the post.
I_am_tiberius2 days ago
Open weight!
alecco2 days ago
Please don't slander the most open AI company in the world. Even more open than some non-profit labs from universities. DeepSeek is famous for publishing everything. They might take a bit to publish source code but it's almost always there. And their papers are extremely pro-social to help the broader open AI community. This is why they struggle getting funded because investors hate openness. And in China they struggle against the political and hiring power of the big tech companies.

Just this week they published a serious foundational library for LLMs https://github.com/deepseek-ai/TileKernels

Others worth mentioning:

https://github.com/deepseek-ai/DeepGEMM a competitive foundational library

https://github.com/deepseek-ai/Engram

https://github.com/deepseek-ai/DeepSeek-V3

https://github.com/deepseek-ai/DeepSeek-R1

https://github.com/deepseek-ai/DeepSeek-OCR-2

They have 33 repos and counting: https://github.com/orgs/deepseek-ai/repositories?type=all

And DeepSeek often has very cool new approaches to AI copied by the rest. Many others copied their tech. And some of those have 10x or 100x the GPU training budget and that's their moat to stay competitive.

The models from Chinese Big Tech and some of the small ones are open weights only, and allegedly benchmaxxed (see https://xcancel.com/N8Programs/status/2044408755790508113). Not the same.

patshead2 days ago
DeepSeek's models are indeed open weight. Why do you feel that pointing this out would be considered slander?
kortilla2 days ago
It's not slander to say something true. These are open weights, not open source. They don't provide the training data or the methodology required to reproduce these weights.

So you can’t see what facts are pruned out, what biases were applied, etc. Even more importantly, you can’t make a slightly improved version.

This model is as open source as a windows XP installation ISO.

0-_-02 days ago
Weights are the source, training data is the compiler
crazylogger2 days ago
Training data == source code, training algorithm == compiler, model weights == compiled binary.
ngruhn2 days ago
isn't it more like the data is the source, the training process is the compiler, and the weights are the binary output.
zerr2 days ago
Do they also open-source the censoring filter rules? Like, you can't ask what happened at Tiananmen Square in 1989.
harladsinsteden2 days ago
> I know of the potential ulterior motives.

And you think the US tech giants don't have any ulterior motives?!

FuckButtons2 days ago
I think their motives are pretty transparent, as are china’s, as ever, you have to pick the lesser of two evils.
neonstatic1 day ago
How are the "ulterior motives" of Chinese companies any worse than "ulterior motives" of US companies or European ones?
yanis_t2 days ago
Assuming it is almost as good as Opus 4.6 (which benchmarks seem to give evidence for), and assuming we have a good enough harness (Pi, OpenCode), it's now more than 5x cheaper.

I just want to remind you that this is happening at the same time as Anthropic A/B-tests removing Code from the Pro plan, and as OpenAI releases GPT-5.5 at twice the price of GPT-5.4...

stingraycharles2 days ago
> Assuming it is almost as good as Opus 4.6 (which benchmarks seem to give evidence for)

That’s a big if. It’s my experience that models that perform very well on benchmarks do not necessarily perform well in real life.

I’ve mostly started ignoring the benchmarks and run my own evals.

ting01 day ago
> It’s my experience that models that perform very well on benchmarks do not necessarily perform well in real life

Well, yeah... Like Opus 4.5, 4.6, 4.7. Top of the benchmarks and yet it's a pile of crap at the moment and has been for months.

jatora2 days ago
If benchmarks are all to be believed then gemini 3.1 and grok 4.2 are still in the lead pack. A laughable notion to anyone who has actually tried to use them and compared.
LZ_Khan1 day ago
It's easy to praise Deepseek for its results and generosity -- how they can keep up with frontier labs on Huawei chips for a fraction of the cost! -- but let's not forget a big part of their toolkit is heavy distillation of SoTA.
copypaper1 day ago
Let's also not forget SoTA models stole from us.
gordonhart1 day ago
True, and they're being tried in a federal court of law for it. NYT v. OpenAI is still very much alive, these things just take a while. Can the same be said about DeepSeek or any other open-source model provider performing distillation?
copypaper1 day ago
Pandora's box has already been opened and there is no going back. I doubt OpenAI, et al will get anything but a slap on the wrist in court because punishing AI companies would have a negative effect on the US economy.

>Can the same be said about DeepSeek or any other open-source model provider performing distillation?

Open source models that distill from SoTA reminds me of the story of Robin Hood -- robbing the rich and giving it to the poor. So to answer your question: yes, but it's better than the alternative where only a select few companies have SoTA models.

riskd1 day ago
You already know what the results of this “trial” will be. Let’s not pretend.
paweladamczuk1 day ago
>these things just take a while

Oh, so people might be forced to give back the AI earnings? Should I be worried about the last year's capital gains on my portfolio?

vatsachak1 day ago
Literally.

Altman and Amodei are so mad about muhh model when they steal our data and pollute the Internet with slop.

93po1 day ago
let's not forget that calling copyright infringement theft is hyperbole, and the claim that AI is even infringing is also dubious at best, and that the concept of intellectual property at all is also ethically dubious
MiSeRyDeee1 day ago
So they distill the SoTA models that OAI/Anthropic illegally stole from the public, then open the weights to us or sell their API at 1/50th of the price? I'd say keep up the good work and distill more!
hamdingers1 day ago
I could not possibly care less if I tried. Every LLM is a distillation of something else.
seydorabout 11 hours ago
All AI software is built on open source. They are just giving back what they should
orbital-decay1 day ago
What's the evidence?
slopinthebag1 day ago
Who cares? Also Anthropic does the same thing - if you ask it who it is in Chinese it says it's DeepSeek LOL

https://x.com/teortaxesTex/status/2026130112685416881

amunozo2 days ago
For those who rely on open source models but don't want to stop using frontier models, how do you manage it? Do you pay any of the Chinese subscription plans? Do you pay the API directly? After GPT 5.5 release, however good it is, I am a bit tired of this price hiking and reduced quota every week. I am now unemployed and cannot afford more expensive plans for the moment.
azuanrb2 days ago
I have $20 ChatGPT subscription. Stopped Anthropic $20 subscription since the limit ran out too fast. That's my frontier model(s).

For OSS models, I have a z.ai yearly subscription from the promo. But it's a lot more expensive now. The model is good imo; you just need to find the right providers. There are a lot of alternatives now. Like, I saw some good reviews regarding Ollama cloud.

amunozo2 days ago
I am thinking about getting some 1 year promotion as a student before defending my PhD.
regularfry2 days ago
I've been on Kimi K2.5 on openrouter for a couple of months for anything I can't run locally. Really is dirt cheap for how good it is. Haven't assessed K2.6 yet but the price is higher so it needs to be more efficient, not just more capable.

But more broadly: openrouter solves the problem of making a broad range of models available with a single payment endpoint, so you can just switch around as much as you like.

eleventenabout 19 hours ago
How do you find the token speed of open router with kimi?

I have tasks that used to take ~3-5min with Sonnet 4.6. With OpenRouter Kimi, the same task takes 10+ min. It's also just obviously slower in opencode sessions. The results are good, and I love the lower cost, but the speed can be frustrating.

the_gipsy2 days ago
Have you considered... not subscribing? You can ask the top models via chats for specific stuff, and then set up some free CLI like mistral.

If you're trying to make a buck while unemployed, sure get a subscription. Otherwise learn how to work again without AI, just focus on the interesting stuff.

amunozo2 days ago
I just want to try to make something useful out of my time; that's why I'm subscribed to Codex at the moment. 20€ is affordable, not really a problem. But yes, maybe I would be doing myself a favor by unsubscribing and going back to the old ways to learn properly.
the_gipsy2 days ago
I'm "working" on some open source stuff with minimal AI. But I will probably cave in at some point and get a subscription again, the moment I need to spin up a mountain of garbage, fast.
solarkraft2 days ago
At home I currently use MiniMax via OpenRouter - it’s pretty good and very cheap. They have a subscription plan, but I’m not ready to commit to it yet.

Another way to keep the ability to try out new models is to buy a reseller subscription like Cursor’s.

amunozo2 days ago
I tried OpenRouter, but I feel the money flies even with these models; it's not comparable to a subscription. But yes, it's very good for trying things out. Maybe I should test other models alongside GPT 5.5 to see which one fits me.
elbear2 days ago
I'm also unemployed. So far the models that I've used the most are Kimi and GLM. I haven't done that much agentic coding though, I've mostly used them for studying math and general conversations and I'm generally happy with their performance.
never_inline1 day ago
Gemini has a free tier for API but yeah just use chat.
cmrdporcupine2 days ago
For DeepSeek you can use their API and if you ran it constantly you'd still be under what OpenAI or Anthropic charge for a coding plan.
anentropic2 days ago
I had Claude make me a quick tool to combine my Claude Code token usage (via ccusage util) with OpenRouter pricing from the models API

I'm on Max x5 plan and any of the 'good' models like Kimi 2.6, GLM, DeepSeek would have cost 3-5x in per-token billing for what I used on my Claude plan the last three months

So unless my Claude fudged the maths to make itself look better, seems like I'm getting a good deal

amunozo2 days ago
I am not so sure; credits fly when using any model through the API if I use it as much as I use Codex.
zargon2 days ago
The Flash version is 284B A13B in mixed FP8 / FP4, and the full native-precision weights total approximately 154 GB. The KV cache is said to take 10% as much space as V3's. This looks very accessible for people running "large" local models. It's a nice follow-up to the Gemma 4 and Qwen3.5 small local models.
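Sanity-checking those numbers with a sketch (only the 284B total and the ~154 GB weight size are from the release; the per-weight precision split is inferred, not official):

```python
# Napkin math on the Flash variant's footprint. Only the 284B total-parameter
# count and the ~154 GB weight size are given; everything else is inferred.
total_params = 284e9          # 284B total parameters
weights_bytes = 154e9         # ~154 GB of native mixed FP8/FP4 weights

avg_bytes_per_param = weights_bytes / total_params
print(f"average precision: {avg_bytes_per_param:.2f} bytes/param")  # 0.54

# If a fraction f of weights is FP8 (1 byte) and the rest FP4 (0.5 byte):
#   f * 1.0 + (1 - f) * 0.5 = avg   =>   f = 2 * avg - 1
fp8_fraction = 2 * avg_bytes_per_param - 1
print(f"implied FP8 share: {fp8_fraction:.0%}")
```

So by this rough reading most of the weights would sit in FP4, which is consistent with it fitting on a beefy local rig.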
sbinnee2 days ago
Price is appealing to me. I have been using gemini 3 flash mainly for chat. I may give it a try.

input/output: $0.14/$0.28 (whereas Gemini is $0.5/$3)

Does anyone know why output prices have such a big gap?

girvo2 days ago
Output is where the compute goes above all else; it basically costs more hardware time than prompt processing (input), which is a lot faster.
tokenmaxxinej2 days ago
input tokens are processed at 10-50 times the speed of output tokens, since you can process them in batches rather than one at a time like output tokens
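A toy sketch of that asymmetry (illustrative only, not any real model): prefill pushes the whole prompt through a layer in one batched matmul, while decode is forced to run one matmul per generated token because each token depends on the previous one.

```python
import numpy as np

d = 64                               # toy hidden size
W = np.random.randn(d, d) / d**0.5   # one toy "layer"

prompt = np.random.randn(1000, d)    # 1000 input (prompt) tokens
prefill_out = prompt @ W             # all 1000 tokens in a SINGLE matmul

# Decode: each output token depends on the previous one, so the matmuls
# must run sequentially, one per generated token.
state = prefill_out[-1]
outputs = []
for _ in range(10):                  # 10 output tokens -> 10 separate matmuls
    state = state @ W
    outputs.append(state)

print(prefill_out.shape, len(outputs))  # (1000, 64) 10
```

The hardware sees one big, efficient operation for the prompt and a long chain of small, latency-bound ones for the output, which is roughly why providers price output tokens higher.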
regularfry2 days ago
I'm going to blow my bandwidth allowance again this month, aren't I.
nthypes2 days ago
https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main...

Model was released and it's amazing. Frontier level (better than Opus 4.6) at a fraction of the cost.

0xbadcafebee2 days ago
I don't think we need to compare models to Opus anymore. Opus users don't care about other models, as they're convinced Opus will be better forever. And non-Opus users don't want the expense, lock-in or limits.

As a non-Opus user, I'll continue to use the cheapest fastest models that get my job done, which (for me anyway) is still MiniMax M2.5. I occasionally try a newer, more expensive model, and I get the same results. I have a feeling we might all be getting swindled by the whole AI industry with benchmarks that just make it look like everything's improving.

versteegen2 days ago
Which model's best depends on how you use it. There's a huge difference in behaviour between Claude and GPT and other models which makes some poor substitutes for others in certain use cases. I think the GPT models are a bad substitute for Claude ones for tasks such as pair-programming (where you want to see the CoT and have immediate responses) and writing code that you actually want to read and edit yourself, as opposed to just letting GPT run in the background to produce working code that you won't inspect. Yes, GPT 5.4 is cheap and brilliant but very black-box and often very slow IME. GPT-5.4 still seems to behave the same as 5.1, which includes problems like: doesn't show useful thoughts, can think for half an hour, says "Preparing the patch now" then thinks for another 20 min, gives no impression of what it's doing, reads microscopic parts of source files and misses context, will do anything to pass the tests including patching libraries...
ind-igo2 days ago
Agree with your assessment. I think after models reached around the Opus 4.5 level, it's been almost indistinguishable for most tasks. Intelligence has been commoditized; what's important now is the workflows, prompting, and context management. And that is unique to each model.
vidarh2 days ago
Same for me. There are tasks when I want the smartest model. But for a whole lot of tasks I now default to Sonnet, or go with cheaper models like GLM, Kimi, Qwen. DeepSeek hasn't been in the mix for a while because their previous model had started lagging, but will definitely test this one again.

The tricky part is that the "number of tokens to good result" does absolutely vary, and you need a decent harness to make it work without too much manual intervention, so figuring out which model is most cost-effective for which tasks is becoming increasingly hard, but several are cost-effective enough.

wuschel2 days ago
This is not true in some cases; e.g. there are stark differences in the correctness of answers in certain types of case work.
spaceman_20202 days ago
I found Opus 4.7 to be actually worse than Opus 4.6 for my use case

Substantially worse at following instructions and overoptimized for maximizing token usage

sandos2 days ago
Is Opus nerfed somehow in Copilot? I've tried it numerous times and it has never really wowed me. They seem to have awfully small context windows, but still. It's mostly the reasoning which has been off.

Codex is just so much better, or the general GPT models.

specproc2 days ago
Opus just got killed in Copilot. I always found it great, FWIW.

https://github.blog/news-insights/company-news/changes-to-gi...

kmarc2 days ago
This resonates with me a lot.

I do some stuff with gemini flash and Aider, but mostly because I want to avoid locking myself into a walled garden of models, UIs and company

post-it2 days ago
What do you run these on? I've gotten comfortable with Claude but if folks are getting Opus performance for cheaper I'll switch.
slopinthebag2 days ago
Try Charm Crush first, it's a native binary. If it's unbearable, try opencode, just with the knowledge your system will probably be pwned soon since it's JS + NPM + vibe coding + some of the most insufferable devs in the industry behind that product.

If you're feeling frisky, Zed has a decent agent harness and a very good editor.

oceanplexian2 days ago
You can just use Claude Code with a few env vars, most of these providers offer an Anthropic compatible API
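For example (a sketch, not official provider docs: the endpoint URL and model name below are assumptions, so check your provider's documentation; `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` are the env vars Claude Code reads):

```shell
# Point Claude Code at a third-party Anthropic-compatible endpoint.
# The URL and model name are assumptions -- substitute your provider's values.
export ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"
export ANTHROPIC_AUTH_TOKEN="sk-your-api-key"   # your provider's API key
export ANTHROPIC_MODEL="deepseek-chat"          # model name the provider expects
claude   # then launch Claude Code as usual
```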
avereveard2 days ago
eh idk. until yesterday opus was the one that got spatial reasoning right (had to do some head pose stuff, neither glm 5.1 nor codex 5.3 could "get" it) and codex 5.3 was my champion at making UX work.

So while I agree mixed model is the way to go, opus is still my workhorse.

gunalx2 days ago
I find Gemini pretty good on spatial reasoning.
szundi2 days ago
I don’t know what people are doing, but Minimax produced 16 bug reports, of which 15 were false positives (literally mistakes).

In contrast, ChatGPT 5.3 and also Opus have at least a 90% hit rate on this same project. (Embedded)

All other tests were the same. What are you doing with these models?

sandGorgon2 days ago
actually this is not the reason - the harness is significantly better. There is no comparable harness to Claude Code with skills, etc.

Opencode was getting there, but it seems the founders lost interest. Pi could be it, but it's very focused on OpenClaw. Even the Codex CLI doesn't have all of it.

which harness works well with Deepseek v4 ?

darkwater2 days ago
What's the issue with OC? I tried it a bit over 2 months ago, when I was still on the Claude API, and I actually liked it more than CC (i.e. the right sidebar with the plan, and a tendency to ask fewer "security" questions than CC). Why is it so bad nowadays?
onchainintel2 days ago
How does it compare to Opus 4.7? I've been immersed in 4.7 all week participating in the Anthropic Opus 4.7 hackathon and it's pretty impressive even if it's ravenous from a token perspective compared to 4.6
greenknight2 days ago
The thing is, it doesn't need to beat 4.7. It just needs to do somewhat well against it.

This is free... as in you can download it, run it on your systems and finetune it to be the way you want it to be.

libraryofbabel2 days ago
> you can download it, run it on your systems

In theory, sure, but as others have pointed out, you need to spend half a million on GPUs just to get enough VRAM to fit a single instance of the model. And you’d better make sure your use case makes full 24/7 use of all that rapidly-depreciating hardware you just spent all your money on, otherwise your actual cost per token will be much higher than you think.

In practice you will get better value from just buying tokens from a third party whose business is hosting open weight models as efficiently as possible and who make full use of their hardware. Even with the small margin they charge on top you will still come out ahead.
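The utilization point can be made concrete with back-of-envelope numbers (every figure below is hypothetical, just to show how cost per token scales with how busy you keep the hardware):

```python
# Hypothetical self-hosting economics: all numbers are made up to illustrate
# how utilization dominates the amortized cost per token.
capex_usd = 500_000        # GPUs with enough VRAM for one model instance
lifetime_years = 3         # depreciation horizon
throughput_tps = 2_000     # aggregate tokens/sec at full batch

seconds = lifetime_years * 365 * 24 * 3600

for utilization in (1.0, 0.25, 0.05):
    tokens = throughput_tps * seconds * utilization
    usd_per_mtok = capex_usd / (tokens / 1e6)
    print(f"{utilization:>4.0%} busy -> ${usd_per_mtok:.2f} per million tokens")
```

Under these made-up numbers, dropping from full utilization to 5% multiplies your effective per-token price twentyfold, which is the gap a high-utilization hosting provider exploits.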

p1esk2 days ago
Do you think a lot of people have “systems” to run a 1.6T model?
onchainintel2 days ago
Completely agree, not suggesting it needs to; just genuinely curious. Love that it can be run locally though. Open-source LLMs have been punching back pretty hard against proprietary ones in the cloud lately in terms of performance.
kelseyfrog2 days ago
What's the hardware cost of running it?
johnmaguire2 days ago
... if you have 800 GB of VRAM free.
rvz2 days ago
It is more than good enough and has effectively caught up with Opus 4.6 and GPT 5.4 according to the benchmarks.

It's about 2 months behind GPT 5.5 and Opus 4.7.

As long as it is cheap to run for the hosting providers and it is frontier level, it is a very competitive model and impressive against the others. I give it 2 years maximum for consumer hardware to run models that are 500B - 800B quantized on their machines.

It should be obvious now why Anthropic really doesn't want you to run local models on your machine.

deaux2 days ago
Vibes > benchmarks. And it's all so task-specific. Gemini 3 has scored very well on benchmarks for a long time but is poor at agentic use cases. A lot of people prefer Opus 4.6 to 4.7 for coding despite the benchmarks, much more than I've seen before (4.5->4.6, 4->4.5).

Doesn't mean Deepseek v4 isn't great, just benchmarks alone aren't enough to tell.

snovv_crash2 days ago
With the ability of the Qwen3.6 27B, I think in 2 years consumers will be running models of this capability on current hardware.
colordrops2 days ago
What's going to change in 2 years that would allow users to run 500B-800B parameter models on consumer hardware?
spaceman_20202 days ago
Tbh I was more productive with 4.6 than ever before and if AI progress locks in permanently at 4.6 tier, I’d be pretty happy
bbor2 days ago
For the curious, I did some napkin math on their posted benchmarks and it racks up a 20.1 percentage-point difference across the 20 metrics where both were scored, for an average improvement of about 2% (non-pp). I really can't decide if that's mind-blowing or boring.

Claude 4.6 was almost 10pp better at answering questions from long contexts ("corpuses" in CorpusQA and "multiround conversations" in MRCR), while DSv4 was a staggering 14pp better at one math challenge (IMOAnswerBench) and 12pp better at basic Q&A (SimpleQA-Verified).

Quasimarion2 days ago
FWIW it's also like 10x cheaper.
creamyhorror2 days ago
No, the Deepseek V4 paper itself says that DS-V4-Pro-Max is close to Opus 4.5 in their staff evaluations, not better than 4.6:

> In our internal evaluation, DeepSeek-V4-Pro-Max outperforms Claude Sonnet 4.5 and approaches the level of Opus 4.5.

doctoboggan2 days ago
Is it honestly better than Opus 4.6 or just benchmaxxed? Have you done any coding with an agent harness using it?

If its coding abilities are better than Claude Code with Opus 4.6 then I will definitely be switching to this model.

bokkies2 days ago
Apparently GLM 5.1 and the latest Qwen Coder are as good as Opus 4.6 on benchmarks. So I tried both seriously for a week (GLM Pro using CC, and Qwen using Qwen companion), thinking I could save $80 a month. Unfortunately, after 2 days I had switched back to Max. The speed (slower on both, although Qwen is much faster) and the errors: stupid layout mistakes, inserting 2 footers then refusing to remove one, not seeing obvious problems in screenshots, major f-ups of functionality, not being able to view URLs properly, etc. I'll give DeepSeek a go but I suspect it will be similar. The model is only half the story. I've also been testing GPT 5.4 with Codex and it is very nearly as good as CC, and better on long-running tasks in the background. Not keen on the ChatGPT Codex 'personality' though, so I'll stick to CC for the most part.
madagang2 days ago
Their Chinese announcement says that, based on internal employee testing, it is not as good as Opus 4.6 Thinking, but is slightly better than Opus 4.6 without Thinking enabled.
mchusma2 days ago
I appreciate this, makes me trust it more than benchmarks.
ibic2 days ago
In case people wonder where the announcement is (you can easily translate it via browser if you don't read Chinese): https://mp.weixin.qq.com/s/8bxXqS2R8Fx5-1TLDBiEDg

It's still a "preview" version atm.

deaux2 days ago
That's super interesting, isn't Deepseek in China banned from using Anthropic models? Yet here they're comparing it in terms of internal employee testing.
anentropic2 days ago
Who uses Opus without thinking though...?
NitpickLawyer2 days ago
> (better than Opus 4.6)

There we go again :) It seems we have a release each day claiming that. What's weird is that even deepseek doesn't claim it's better than opus w/ thinking. No idea why you'd say that but anyway.

Dsv3 was a good model. Not benchmaxxed at all, it was pretty stable where it was. Did well on tasks that were ood for benchmarks, even if it was behind SotA.

This seems to be similar. Behind SotA, but not by much, and at a much lower price. The big one is being served (by DS themselves for now; more providers will come and we'll see the median price) at $1.74 in / $3.48 out / $0.14 cache. Really cheap for what it offers.

The small one is at $0.14 in / $0.28 out / $0.028 cache, which is pretty much "too cheap to matter". This will be what people can realistically run "at home", and it should be a contender against things like Haiku/Gemini Flash, if it can deliver at those levels.

slopinthebag2 days ago
Anthropic fans would claim God itself is behind Opus by 3-6 months and then willingly be abused by Boris and one of his gaslighting tweets.

LMAO

NitpickLawyer2 days ago
> Anthropic fans ...

I have no idea why you'd think that, but this is straight from their announcement here (https://mp.weixin.qq.com/s/8bxXqS2R8Fx5-1TLDBiEDg):

> According to evaluation feedback, its user experience is better than Sonnet 4.5, and its delivery quality is close to Opus 4.6's non-thinking mode, but there is still a certain gap compared to Opus 4.6's thinking mode.

This is the model creators saying it, not me.

sergiotapia2 days ago
The dragon awakes yet again!
kindkang20242 days ago
There appears a flight of dragons without heads. Good fortune.

That's literally what the I Ching calls "good fortune."

Competition, when no single dragon monopolizes the sky, brings fortune for all.

rapind2 days ago
Pop?
chvid2 days ago
The incredible arrogance and hubris of the American-initiated tech war - it is just a beautiful thing to see it slowly fall apart.

The US-China contest aside - it is in the application layer that LLMs will show their value. There the field, with LLM commoditization and no clear monopolies, is wide open.

There was a point in time where it looked like LLMs would be the domain of a single well-guarded monopoly - that would have been a very dark world. Luckily we are not there now and there is plenty of ground for optimism.

sigmoid102 days ago
Still not sure how I feel about China of all places controlling the only alternative AI stack, but I guess it's better than leaving everything to the US alone. If China ever feels emboldened enough to go for Taiwan and the US descends into complete chaos, the rest of the world running on AI will be at the mercy of authoritarian regimes. At the very least you can be sure no one is in this for the good of the people anymore. This is about who will dominate the world of tomorrow. And China has officially thrown their hat in the ring.
Ladioss2 days ago
I always find it an illuminating experience about the power of mass propaganda every time I see an American believe they somewhat have the moral high ground over China, despite starting a new war somewhere around the globe either for petrol or on behalf of Israel every six months.
rfrey2 days ago
Many of us (worldwide, I'm not American) watched China massacre thousands of its own children at Tiananmen Square. The US is descending into totalitarianism, but it hasn't reached that level yet.

And China may have changed in some ways but there have been no signals it would not repeat that event if it thought circumstances warranted.

kiba2 days ago
Just because America is doing bad things doesn't mean China is good, or vice versa.
stickfigure2 days ago
The difference is that - at least in the last 50 years - the US starts wars with brutal dictatorships. Whereas China is threatening war against a thriving democracy.

These are not equivalent.

vsgherzi2 days ago
Not about moral high ground. One's a democracy, one isn't.
scyclow1 day ago
As an American, I can conclusively say that we absolutely have no moral high ground whatsoever. But bringing the topic back to LLMs, I don't feel great about using an LLM that has a panic attack any time I ask about Tiananmen Square or Taiwanese sovereignty.
foobiekr1 day ago
There’s no organ harvesting of a religious subgroup in America.
wiseowise2 days ago
What makes you think they’re American?
srj1 day ago
On the contrary, I find reading your own confused spin on morality here an interesting window into the effectiveness of propaganda. You're taking two oppressive authoritarian governments and elevating them above the US.
nipponese2 days ago
The U.S. is not the country conducting amoral behavior with terrorist regimes for oil, that’s China.

We conduct amoral behavior with terrorist regimes for dollars.

Scroll_Swe1 day ago
I always find the China glazing online getting worse and worse.

TikTok and Hasan has really turned the West against itself.

philipallstar2 days ago
China killed up to 50m of its own population in the 20th century through socialism, while America led the world in funding NATO, global scientific research, and global aid for decades. That buys America a lot of good grace.
glenstein1 day ago
And by contrast, what I find stunning is the inability to engage in meaningful comparative analysis of relative harms. There's a lot of spectacularly insightful attention to detail insofar as it mobilizes whataboutism arguments, and then that attention mysteriously falls away when we ask questions like the extent to which these sides allow a free press, democratic elections with multiple parties, or fair trials. You used to not have to explain these things.
mrkramer2 days ago
All empires are to some degree evil, because their agenda is to dominate weaker peoples and nations. Almost all of them committed crimes against humanity and genocides, if you look retrospectively from today's point of view. Even our beloved Roman Empire that Western civilization is built upon was a genocidal empire.
Rover2221 day ago
Chinese citizens will go to jail if they are too critical of their own government. How hard is it for you to wrap your head around those implications?
SmirkingRevenge1 day ago
The moral high ground claims here can be generalized:

Liberal democracies have moral high ground over authoritarian dictatorships (at least along that one dimension)

The US is backsliding tragically (and stupidly) and may lose that moral high ground, but the rest of the western democracies will still have it

nailer2 days ago
> I see an American believe they somewhat have the moral high ground over China

The elected government of the US has the moral high ground over the regime that killed the KMT in its weakened state after the KMT defeated Japan, went on a rampage against the educated classes, mowed down its own people with machine guns and tanks when they demanded a say in their own government, and kidnaps people advocating for democracy to this day, including Jack Ma.

> despite starting a new war... on behalf of Israel every six months.

The war started when Hamas, funded by Iran, went on a murder and rape rampage against Israeli civilians.

OCASMv22 days ago
The Uyghur say hi.
hersko2 days ago
Talks about "mass propaganda."

Thinks America is starting wars on behalf of Israel.

LMAO

Der_Einzige2 days ago
One province of China has enough hellish nightmarish bullshit going on caused by the CCP that we maintain total moral superiority over them. It’s not even a question to anyone except “fellow travelers”.
annexrichmond1 day ago
So you think the US should sit back and watch Iran develop nukes? Is that the “moral” thing to do?
chmod7752 days ago
> Still not sure how I feel about China of all places to control the only alternative AI stack, but I guess it's better than leaving everything to the US alone.

Fully agree. From a US perspective, that sucks. For everyone else it's pretty great.

At this point the world's opinions of China are better than those of the US in some polls. One country invests and helps build infrastructure on a massive scale globally, the other alienates allies, causes countless conflicts, and openly threatens to end civilizations.

Indeed, even if one isn't partial to China, there's reasons to be glad that an increasingly hostile US has powerful competition.

> This is about who will dominate the world of tomorrow.

For this you'd need a technological moat. So far the forerunners have burned a lot of money with no moat in sight. Right now Europe is happy just contributing on research and doing the bare-minimum to maintain the know-how. Building a frontier model would be lobbing money into the incinerator for something that will be outdated tomorrow. European investors are too careful for that - and in this case seem to be right.

tensor1 day ago
> Indeed, even if one isn't partial to China, there's reasons to be glad that an increasingly hostile US has powerful competition.

This is how I see it. The US has openly threatened multiple times to annex my country, and has repeatedly threatened every western nation. Letting the US have a monopoly on... well... anything, is really bad for the world. The more countries that have their own production of various critical things like computer chips, medicine, etc., the better it is for the world, as it distributes power.

People in the US don't seem to understand that with the current administration the US is seen as a potentially very hostile nation. While I don't think China is a friend to Canada or the West, at least it provides alternatives when the US tries to use its monopolies against us. And vice versa too.

>Building a frontier model would be lobbing money into the incinerator for something that will be outdated tomorrow. European investors are too careful for that - and in this case seem to be right.

Strong disagree here. Mistral does great work, in the long term being a few months or even a year behind is a non-issue. Also Cohere just merged with Aleph Alpha to continue producing foundational models. It's extremely important that the middle powers continue to do this.

benterix2 days ago
Yeah it's confusing. I mean China has work camps for Uighurs and is very brutal on Tibetans etc. OTOH, their leader is not setting the world on fire every second week and compared to Trump seems like the paragon of reason on the surface. Of course we know it's a facade but man what crazy times to live in.
2ndorderthought2 days ago
I don't see the issue. China hosts the alternatives or the only game in town for lots of technologies. China has every interest and right to create products. Not everything that comes out of China is some devious plan to do terrible things. It's people trying to make money just like you and me.

I am not washing away the authoritarianism, but take a look at the directionality of other economic superpowers, or that of tech CEOs as well. At least Chinese tech companies aren't going around praising WWII Germany, writing manifestos, and bombing children at school or fishermen on whims. It is difficult not to see more countries, regardless of leadership, putting their hat in the ring as a net positive. Especially if it increases sustainability and lowers the price, which this very clearly does. It's even open source...

Cthulhu_2 days ago
Moral stances aside, I'd argue it's healthy that the US gets competition from abroad. I appreciate the boost that the world is getting from China - infrastructure and construction projects are a huge benefit to economies. Their focus on green energy has caused a huge influx of affordable solar panels, home batteries, EVs, etcetera, helping reduce the dependency on fossil fuels - while the US and especially the other big money spenders in the middle east would rather the world remain fully dependent on them. But for the past years Europe and now Asia are feeling the pain from being overly reliant on that.

China's policies and government aren't morally defensible and I do fear that they will become more aggressive in spreading their influence and policies onto other countries, but from an economic standpoint what they're doing is super effective. While the previous world power (the US) is stuck in infighting and going through cycles of fixing/undoing the previous administration's damages, instead of planning ahead.

amunozo2 days ago
Competition with the Soviet Union gave all the workers in the world better conditions, also advances in science and technology... (And risk of mutual destruction ;)), even if the USSR wasn't good.
chvid2 days ago
The important thing is that LLMs are well-dispersed and the technology is relatively open, much more open than it could have been. Alternative worthwhile LLMs will emerge from Europe and other non-US western countries once the economic incentives are there.
thedelanyo1 day ago
> mercy of authoritarian regimes.

Yet, it's the democratic regime which is causing all the chaos around the world and disrespecting the leadership of other jurisdictions, just to keep the petrodollar propped up.

Do we ever think there's any subtle difference between authoritarian and democratic? Where democracy ultimately makes the world a better place?

LeFantome1 day ago
Thankfully, DeepSeek is the most open of the model providers.

And on the hardware side, RISC-V is gaining a lot of traction in China. So the dependency on a single supplier is lower with the Chinese tech stack than with most western options.

parthdesai2 days ago
> If China ever feels emboldened enough to go for Taiwan and the US descends into complete chaos, the rest of the world running on AI will be at the mercy of authoritarian regimes.

Alternative being the current reality and world being dominated by US. Let's ask people in Middle East/Asia/South America about how they feel about that. In this current day and age, how is this statement even relevant?

SgtBastard2 days ago
Mistral (a French company) shouldn’t be discounted.
zobzu2 days ago
bold to think half the comments here aren't from deepseek itself :)

I personally love the bit "us initiated tech war" lol. that's right, they started making AI, it's their fault! bad imperialist US!

yeah, v5 will do better

mft_2 days ago
You’re right… but that’s on the rest of the world not getting their shit together.

It’s this sort of example (and not properly supporting Ukraine, and not agreeing how to collectively deal with migrants, and not agreeing how to coordinate defence, and myriad other examples) that highlights what a pointless mess the EU is. It’s not a unified block - it’s 27 self-interested entities squabbling and playing petty power games, while totally failing to plan for the future with vision.

The EU could/should have ensured that a European equivalent to OpenAI or Anthropic could thrive, and had competitive frontier models already; instead, they’re years and countless billions behind.

simgt2 days ago
The EU pouring even more billions in this would just have meant pouring billions on US tech. China is winning on all fronts at this game because of the embargo, they end up even more vertically integrated as a result of it.
cde-v2 days ago
China doesn't even care about Taiwan anymore, their saber-rattling about it is a convenient distraction while they quietly make it completely irrelevant in the next few years.
8note1 day ago
china is gonna care about taiwan as a means of ocean access til the end of time, or til the tectonic plates move to make different opportunities.

the people and industry aren't what matter there

iso16312 days ago
It does seem the idea is to get the Taiwanese people to choose to rejoin China by making China a far better place to live than Taiwan. Maybe that will be via manufactured consent (i.e. China manipulates the people of Taiwan), or perhaps it will be genuine (i.e. China provides a far better lifestyle for the average person than Taiwan)
Danox2 days ago
Isn’t Mistral close in the ballpark?
2ndorderthought2 days ago
Mistral has a different focus. They aren't taking on trillions in debt risking their entire economy to produce useful products.

I think they are leaders in the democratization of LLMs. Almost everyone has a computer right now that can run a useful variant of a Mistral model. I hope they keep their focus because what they are aiming for likely has the biggest impact on the average person and would be the best case scenario for the technology in general.

Lapel27422 days ago
AFAIK: Current Mistral models are not competitive with SOTA-models that come out of the USA or China. They are "good enough" for enterprise usage when you don't need SOTA performance.

Their main selling point is: They are neither US-American nor Chinese. That's a real moat in today's world. I think at the moment they feel quite comfortable.

techsystems2 days ago
There are no European models that come close. It's Korean models, then a UAE model K2, then Mistral.
eunos2 days ago
They aren't. Benchmark-wise they are quite far apart.
AntiUSAbah2 days ago
Come on... I was hoping that Mistral would do something, and man, that would be great as a European, but I hear NOTHING from them ever.

I don't know what the problem is. Are we Europeans too stupid? Do we just not have enough money / VC money? Are we not proud enough?

:(

victorbjorklund2 days ago
Not worse than having our stuff built there. Is it great to be relying on them? No, but at least more stable than US under Trump.
Scroll_Swe1 day ago
>- it is just a beautiful thing to see it slowly fall apart.

I feel uneasy over China dominance as much as the US.

I still trust the US more, as Europe has a post-WW2 relationship with it. I notice many comments being pro-China, but they seem to be from the third world (one mentioned a very low salary). I feel the opening of the internet was a mistake.

China is a totalitarian dictatorship. This is a fact.

Look into Mistral AI too :)

For context, I am Swedish.

Yes this is a new account, please focus on the content.

riskd1 day ago
Are third-world users' opinions of lesser value?
glenstein1 day ago
Are there distinct third world opinions in one direction or the other? I've tended to assume they are non-unitary rather than broadly converging on one side or the other.
GorbachevyChase1 day ago
Come on, Sweden isn’t quite a 3rd world country.
Scroll_Swe1 day ago
When people from developing countries praise China and communism while criticizing the United States and claiming “Europe is the same,” I find it hard to take their views seriously.

I think their stance often comes from a strong anti-Western bias, and sometimes from feelings of resentment.

0x7373681 day ago
And then westerners wonder why they're disliked in the rest of the world...
pb71 day ago
You are on a Western website.
torginus1 day ago
The only context we see here is that you are Swedish. I am not sure what sort of moral, technical, or financial authority we are supposed to derive from that.

Don't get me wrong, Sweden is a cool country, but my point still stands.

bigyabai1 day ago
FWIW, I am a lifelong American citizen and I exclusively use Chinese AI models for programming because I consider Claude and Codex to be highway robbery for the price.

Trust whoever you want, I just don't have the patience (or money) for American models.

platinumrad1 day ago
> I notice many comments being pro China but they seem to be from the third world (one mentioned a very low salary) I feel the opening of the internet was a mistake.

Yeah, I also really hate when poor people think they're allowed to talk.

Der_Einzige1 day ago
Opening of the internet WAS a mistake. During times when whole countries (you know which ones) get geoblocked, the internet (especially online gaming) gets a lot better.
shimman1 day ago
Honestly the China scaremongering is borderline hilarious. The US has literally attacked two countries this year in the span of weeks and is blockading another, causing needless deaths. Not to mention the last 50 years of US imperialism making the world a worse place for everyone except the benefiting few (it doesn't benefit Americans, only capitalists).

The idea that China is worse than America is laughable. LMK when China invades 5 countries in a span of 20 years unimpeded by anyone else in the world and maybe I'll be scared.

Until then it's quite clear how consumers benefit from actual competition and it's not because of the US.

Also you saying you trust the US when they just threatened to invade Greenland (a threat so credible that Denmark was planning a full scale resistance against US troops).

Sorry but the curtains are truly coming down and the US will become one of the most hated nations in the world while 100s of millions will needlessly starve and die because of the actions of Americans that simply don't give a fuck.

FWIW, I'm not just talking about Trump either. Democratic politicians are just as much to blame, they champion corporatism and imperialism as much as Republicans and the only issues D leadership seems to have is that the "right process" wasn't being followed.

I say this as someone who is a literal democratic operative within the party.

glenstein1 day ago
One party authoritarian dictatorship with no free speech or democratic elections and no civil rights movement seems pretty bad to me. No amount of whataboutism is ever going to compete with that.

It also seems like clashes with India, every southeast asian country with internationally recognized territory rights in the South China sea, the forcible takeover of Hong Kong, arming and economically supporting Russia, Pakistan and Iran are bad, and the increasing probability of a hot war to take over Taiwan should count as bad, perhaps the most urgently dangerous threat to global peace in the 21st century.

The United States' track record post-WW2 is a complicated combination of monstrously immoral Kissinger- and Bush-style overthrows of democracies and genuinely valuable maintenance of a post-WW2 democratic order focused on things like free speech and human rights. I say with full sincerity that in the decade-plus I've been here on HN seeing whataboutism as a strategy for defending China, I have yet to encounter anything that feels like sincere engagement with the United States' role in the world as a combination of positives and negatives. It's always flatly one-sided messaging that feels aimed at a favorable audience that already agrees, rather than sincerely attempting to persuade.

scottyah1 day ago
So you're good with the takeover of Hong Kong and what they're doing with the Uyghurs? I think you're getting a pretty biased feed of news. I'm not saying China is the devil, but the trite "USA bad because [overhyped recent news]" is a crazy take. There's plenty of bad stuff that has been done by Americans you could have called out.
avazhi1 day ago
Did you miss the part where Iran spent the past 50 years promising to develop nuclear weapons and then use them on both Israel and America, or do you just choose to conveniently ignore that when you go on rants like this? In one of the last rounds of talks before the war, Kushner and Witkoff offered Iran free nuclear fuel in perpetuity in exchange for the weapons-grade uranium and got turned down, so clearly the Iranians weren't just bluffing.

This war could have been handled much differently and better, but acting like America attacked Iran for no reason is laughable. It is in fact America’s inexplicable reticence to kill Iranian civilians that is the reason this is going on for this long. America could have ended this in a few days if it had stopped worrying about being criticised by the rest of the world that hates it anyway.

https://www.nytimes.com/2026/04/07/us/politics/trump-iran-wa...

tigershark1 day ago
China never threatened this: "A whole civilization will die tonight, never to be brought back again.". Also China never announced that it was going to attack Europe. I trust them much more compared to a malignant narcissist that doesn't care if the whole world burns now that he doesn't have much left to live.
nozzlegear1 day ago
Xi Jinping really doesn't have to do anything but sit back and let Trump make a pro-China argument for him. It's like that "Do Nothing, win" meme:

https://i.kym-cdn.com/photos/images/original/002/352/212/95b...

pckd1 day ago
There is no morality at the country level. The talk about values, morality, and a just world is only lip service, and pretty much every smart person knows this. If you still want to hold a country to moral standards, our own dear USA's standards would be pathetically low. One example: we force and demand every other country to use USD as the reserve currency. If anyone considers the alternative, we follow the usual routines (bombing, hacking, kidnapping, tariffs, coercion, currency sabotage, etc.). If two countries in some remote corner of the world want to exchange goods and transact in their own local currency, what legal right does the USA have to stop them and force them to use USD? And punish them if they do not listen? Just because we are aligned with Western Europe, do not assume moral high ground.
nozzlegear1 day ago
> If two countries in some remote corner of world want to exchange goods and transact in their own local currency, what legal rights does the USA have to stop it and force them to use USD?

China and Russia trade in yuan and rubles. India and Russia do oil deals in rupees. China and Brazil trade in yuan. The US hasn't bombed any of them.

hnsdev1 day ago
I don't see any sense in trusting the US more than China. There are arguably as many arguments that the US is horrible as the current dominant country as there would be for China. If anything, a multipolar world would be more positive, especially for the EU, as currently the EU is just the US's bitch and has to live by appeasing Mr. Donny, as done in the stupid trade deal signed by Von der Leyen.

Also, calling the opening of the internet a mistake shows the degree of your ignorance. People from third-world countries have as much right to speak as you do; your opinion is not more valid than anyone else's.

For context, I am Italian-Brazilian, so I pretty much have been exposed to both sides (western and non-western, even though we can argue that Brazil is more west aligned).

spaceman_20202 days ago
I've been baffled watching America double down on the same strategy even when it failed to produce results

They sanctioned the hell out of Huawei and now Huawei is bigger than ever

America is just not able to digest the idea that another country can be as good, if not better, at innovation

hirako20002 days ago
Deeper than the inability to digest. The incapability to comprehend it.

China's fall in the 19th century came at them for the same reason. How could these European savages be stronger, thus better, than us? Our intelligence service must be out of its mind.

nipponese2 days ago
Because it worked on Japan in the 80s and 90s and sometimes “Americans” have a hard time telling the two cultures apart.
segmondy2 days ago
It's not about 2 cultures, but 2 timelines. China has seen the game and adapted, they will not respond with prior losing responses. Meanwhile, America is playing the same moves because it worked in the past.
spaceman_20201 day ago
Weird why Americans would think that the coercion that worked against an essentially vassal state with no independent military would work against a non-aligned nuclear powered state with a strong, independent military

Sovereign and non-sovereign nations have completely different decision matrices for dealing with external threats

jatora2 days ago
I'm no huge fan of America, but claiming China is as good or better at innovation is asinine.

It costs 100-1000x less manpower, money, and time to hug the heels of innovators than to actually pioneer. Say what you will about America but they absolutely lead technological innovation and it's not even remotely close.

spaceman_20201 day ago
Yeah, because the Americans had a 150 year headstart

China had literally 60M people die in a famine when JFK was president and Elvis was the biggest thing. The country was basically farmland and basic industries 40 years ago

Why would you even compare their capabilities today vs a country that has been a sovereign nation for 250 years?

You look at trajectories, not the present

2ndorderthought2 days ago
America has been making short-term and short-sighted moves to try to widen a gap that cannot be sustained. They have chosen the wrong strategy out of fear and greed. Cooperation is the right strategy. Isolationism will not work in the long term, except maybe for the handful that drove it. The irony is that it's an anticompetitive and anticapitalist move to do what they have been doing, so it's not even on principle.
srameshc2 days ago
As much as I appreciate the sentiment, I think it is too early to declare that the well-guarded monopoly is over. Yes, these models have answers, but don't expect all the large enterprises to switch to them. The other aspect is that scaling to serve these models will need a lot of time even if Huawei succeeds. Not all governments trust China, and there will be a lot of resistance to working with these models, even if they're cheaper.
segmondy2 days ago
Which Monopoly? Are all large enterprises in USA? There are tons of them outside and they will run the open ones and cheapest ones to infer and those are Chinese. I run Chinese models at home and don't bother with cloud. If I could call the shots at work, we will switch 100% to Chinese models so everyone could have "unlimited" tokens.
rapind2 days ago
You might be underestimating how significantly cheaper this is and how much people care about price.

Walmart is a horrible company owned by horrible people and yet it’s cheap so it dominates.

If the quality really is in the Opus 4.6 range (considering how bad 4.7 is), then it’s a pretty big deal.

ai-x1 day ago
Can you point me to exactly how US tech firms are "falling apart"?

DeepSeek is a mid model, not SOTA.

nazgulsenpai1 day ago
This thread really exploded into partisan geopolitics. Sad to see. And I agree. This whole ecosystem of tech monopolies is a negative from just about every POV except the government, the investor, and the companies themselves.
maxdo2 days ago
This model is dead on arrival.

It's burned CCP money at this point. They will not be able to serve it until H2 2026. Even now, if you look at Opus 4.7 and GPT 5.5, this model is just mediocre.

By the time they can serve it nobody will care at all.

michaelmrose2 days ago
Multiple independent implementations are inherently virtuous. After all, each individual party may innovate in ways that ultimately benefit everyone.

Also it's tech they can be sure we can't cut them out of or tariff and money flowing from Chinese companies to other Chinese companies which we appreciate the benefits of when the shoe is on the other foot.

reactordev2 days ago
I think you missed the bigger picture here. It’s that China has their own stack now, soon others will follow. It’s not about putting up the highest numbers, it’s about putting up the highest ROI. To them, this is it. Qwen too but being able to compete with today’s models means they are closer to competing with tomorrow’s.
scottyah1 day ago
At this scale, it's purely quality. The better the model, the faster the advancements. If using a model half as smart as the best made us half as productive, people would pretty much all be using the current quantized models that can run on a decent laptop. The difference between Opus xHigh and Gemma4 is very different (at least in my job).
torginus1 day ago
I'm kinda baffled by this whole belief system. Guys on the other side of the planet have managed to do what is generally thought to be the pinnacle of Western engineering & investment with a fraction of the resources, and maybe improve upon it in some ways, yet the conclusion isn't 'maybe this stuff isn't as hard as we thought, and we could do much better, or at least do the same thing the DeepSeek guys did'. Instead it devolves into this weird nationalist shtflinging great-power competition thing, as if these models were the result of deliberate nation-state-level coordination of government and industry, like the space program.

For me as a consumer, competition is good - that means companies have less leverage over me, which is beneficial even if I decided to never use a Chinese model ever.

gxs1 day ago
If you look at the past 3-4 decades, China has just played their cards so well

If/when they overtake the US, all things aside, they deserve it. There is no world where the US overtakes China but there’s a world where China overtakes the US. Best outcome for the US atm is parity.

Just remarkable the things they’ve accomplished in the time they’ve accomplished them.

jmyeet2 days ago
These have been my predictions since at least the first release of DeepSeek-R1 over a year ago:

1. There will be no moat where one company "owns" AI. China will see to that. It's simply too much in their national interest for that not to happen;

2. This is incredibly bad news for OpenAI, who have raised so much money with (comparably) so little revenue that the only way they can get a return on it is to "win" and be the company that "owns" AI; and

3. China's chipmaking will catch up with Taiwan within the next decade (with commercial EUV at scale within 5 years). I liken this to American hubris over the development of the atomic bomb, where in 1945 many American leaders and military figures thought the USSR would either never get the atomic bomb or would take 20+ years. It took 4. And the USSR's first hydrogen bomb was detonated a year after the US's.

Whereas the USSR did this with espionage, times have changed. Now all China has to do is throw a few million dollars at hiring the right people from ASML and elsewhere. China has a track record of delivering on long-term projects. Closing the lithography gap will be no different.

scottyah1 day ago
Espionage has changed wildly, and the ease of taking out key people in "accidents" has dramatically increased.
lanthissa2 days ago
not really, china has gone domestic for everything as soon as it could.

it's naive to think they would have stayed on a 'western' stack.

most of the time 'losing' isn't making a bad choice, it's being put in a situation where you have no good choices.

IncreasePosts1 day ago
Deepseek is distilled from other SOTA models. Without them, deepseek would not be possible.
AndrewKemendo1 day ago
I just wished more Chinese companies would start setting up shop outside of China so that we could all work for them

I’ve talked to the folks over at Unitree multiple times and they say “yeah we’ll be hiring overseas soon” and then they never do and they only have five openings in China

shimman1 day ago
They are; plenty of BYD factories are being built throughout South America and Southeast Asia as a condition of opening trade. The same is starting to happen in Europe too.

You just aren't going to see this much in the US or any countries fully aligned with the US, for fear of competition. It doesn't benefit anyone really. It's not like I get richer when Ford sells more vehicles or Meta makes more teenagers suicidal, so why should we care? It'll hurt the country in the long run too.

scottyah1 day ago
You had a chance with Bytedance. It didn't sound too great though, there was a very hard glass ceiling for all non-chinese according to Blind.
GorbachevyChase1 day ago
The PRC government operates extrajudicial police forces outside their borders to keep the diaspora in line. I think they disappeared Jack Ma for a while. I suspect there’s something like that that goes on in the US, but I don’t have strong evidence for that.
AndrewKemendo1 day ago
I’d take my chances with that without issue
philipallstar2 days ago
It's not a tech war. America built China's capability through outsourcing manufacturing. It's hardly a war.
rvz2 days ago
The paper is here: [0]

I was expecting the release this month [1], since everyone forgot about it and wasn't reading the papers they were releasing, and 7 days later here we have it.

One of the key points of this model is the optimization DeepSeek made to the residual design of the LLM's neural network architecture: manifold-constrained hyper-connections (mHC), from this paper [2], which make it possible to train the model efficiently, especially with its hybrid attention mechanism.

There was not much discussion here [3] about it some months ago, but again the paper is a recommended read.

I wouldn't trust the benchmarks directly, but would wait for others to try it for themselves to see if it matches the performance of frontier models.

Either way, this is why Anthropic wants to ban open weight models and I cannot wait for the quantized versions to release momentarily.

[0] https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main...

[1] https://news.ycombinator.com/item?id=47793880

[2] https://arxiv.org/abs/2512.24880

[3] https://news.ycombinator.com/item?id=46452172
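For intuition: hyper-connections (the idea that mHC constrains) replace the single residual stream h = h + f(h) with several parallel streams that each layer reads from and writes back to. Below is a toy sketch, purely illustrative and not DeepSeek's implementation; the read/write/mixing weights are fixed stand-in values for what would be learned, and the manifold constraint itself is omitted:

```python
# Toy sketch of hyper-connections, the residual-stream generalization
# that the mHC paper builds on. Illustrative reconstruction only: a
# standard residual block keeps ONE stream (h = h + f(h)); here we keep
# n parallel streams and mix them explicitly.

N_STREAMS = 2  # "expansion rate": number of parallel residual streams

def layer(x):
    # stand-in for an attention/MLP block; here just a fixed transform
    return [0.5 * v + 1.0 for v in x]

def hyper_connection_step(streams, read_w, write_w, mix):
    """One block with hyper-connections.

    streams : list of N_STREAMS vectors (the parallel residual streams)
    read_w  : how strongly the layer input draws on each stream
    write_w : how strongly the layer output is written back to each stream
    mix     : N_STREAMS x N_STREAMS matrix mixing streams with each other
    """
    dim = len(streams[0])
    # 1. form the layer input as a weighted sum over streams
    x = [sum(read_w[s] * streams[s][i] for s in range(N_STREAMS))
         for i in range(dim)]
    y = layer(x)
    # 2. mix the streams among themselves, then add the layer output back
    new_streams = []
    for s in range(N_STREAMS):
        mixed = [sum(mix[s][t] * streams[t][i] for t in range(N_STREAMS))
                 for i in range(dim)]
        new_streams.append([mixed[i] + write_w[s] * y[i] for i in range(dim)])
    return new_streams

# identity mixing + all read/write weight on stream 0 reduces to a
# plain residual connection
streams = [[1.0, 2.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
out = hyper_connection_step(streams, read_w=[1.0, 0.0],
                            write_w=[1.0, 0.0], mix=identity)
```

With identity mixing and all weight on the first stream, the step collapses to the ordinary residual h + f(h), which is the sense in which hyper-connections generalize the standard design.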

jeswin2 days ago
> this is why Anthropic wants to ban open weight models

Do you have a source?

louiereederson2 days ago
More like he wants to ban accelerator chip sales to China, which may be about “national security” or self preservation against a different model for AI development which also happens to be an existential threat to Anthropic. Maybe those alternatives are actually one and the same to him.
HarHarVeryFunny2 days ago
Anecdotal, but I saw a tweet from someone who interviewed at Anthropic and was explicitly rejected for cultural mismatch because they were not against open-weight models.

It's hard not to see Anthropic's messaging of "this tech that we're pushing on you is going to take your job and maybe kill you" as being about anything other than regulatory capture, with the goal of the government shutting down competitors.

I think OpenAI and Anthropic are both really in a tough spot - spending so much on what is becoming a commodity product for which neither seems positioned to be low cost producer. Maybe a bit like the UK-France channel tunnel project where the product itself is a success but a bloodbath for those who invested to build it.

Kuyawa1 day ago
I am using DeepSeek extensively to develop apps, three in the last month, with my own CLI coding agent [1] developed by DeepSeek itself line by line. I haven't spent $1 yet in well over 10 million tokens.

If I considered myself a 10X programmer, now I am 100X. Love DeepSeek.

[1] https://github.com/kuyawa/mecha-ai

edg5000about 21 hours ago
Have you compared it against other coding agents? What is your general workflow with DeepSeek; do you write a spec and then have it implement and test? Very interesting to hear. Because your harness is adapted to DeepSeek, you probably prompt it very differently; since it's adapted to the model, this may explain why it works well for you. Wiring up an existing harness that is not tested on DeepSeek may not yield optimal results.
lobo_tuerto2 days ago
Glad to see most of the comments here were kept on-topic and didn't deviate at all into geopolitical discussion.
zkmon2 days ago
They released the 1.6T Pro base model on Hugging Face. First time I'm seeing a "T" model here.
mzl2 days ago
Kimi K2.5 and K2.6 are both >1T
jessepcc2 days ago
At this point 'frontier model release' is a monthly cadence (Kimi 2.6, Claude 4.6, GPT 5.5); the interesting question is which evals will still be meaningful in 6 months.
mixtureoftakes2 days ago
more like weekly or almost daily, gpt 5.5 was literally 12 hours ago
dizhn2 days ago
I like deepseek. It works very well. I haven't tried v4 yet but on their web chat interface, just typing "Taiwan" causes it to give you a lecture about how Taiwan is part of China. :)
jyscao2 days ago
What a gotcha
RALaBarge2 days ago
Jingoism: Its such a rush!
kroaton2 days ago
Ask western models about Israel's genocides and mass rapes in Palestine, Lebanon, etc.
intrasight2 days ago
It's open source, so just delete those parameters. /s
coderssh2 days ago
Feels like the real story here is cost/performance tradeoff rather than raw capability. Benchmarks keep moving incrementally, but efficiency gains like this actually change who can afford to build on top.
rohanm932 days ago
This is shockingly cheap for a near frontier model. This is insane.

For context, for an agent we're working on, we're using 5-mini, which is $2/1m tokens. This is $0.30/1m tokens. And it's Opus 4.6 level - this can't be real.

I am uncomfortable about sending user data which may contain PII to their servers in China, so I won't be using this, as appealing as it sounds. I need this to come to a US-hosted environment at an equivalent price.

Hosting this on my own + renting GPUs is much more expensive than DeepSeek's quoted price, so not an option.
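For a sense of scale, the quoted prices work out as below; the per-million prices are the ones mentioned above, while the monthly token volume is a made-up illustrative number:

```python
# Back-of-envelope cost comparison using the per-million-token prices
# quoted in the comment above ($2/1M for the mini model, $0.30/1M for
# DeepSeek). The monthly token volume is hypothetical.
PRICES_PER_MTOK = {"gpt-5-mini": 2.00, "deepseek-v4": 0.30}

def monthly_cost(tokens, price_per_mtok):
    # price is quoted per million tokens
    return tokens / 1_000_000 * price_per_mtok

tokens_per_month = 500_000_000  # hypothetical 500M tokens/month
costs = {name: monthly_cost(tokens_per_month, price)
         for name, price in PRICES_PER_MTOK.items()}
```

At that volume the gap is roughly $1,000/month versus $150/month, which is the "who can afford to build on top" point in practice.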

esperent2 days ago
> I am uncomfortable about sending user data which may contain PII to their servers in China

As a European I feel deeply uncomfortable about sending data to US companies where I know for sure that the government has access to it.

I also feel uncomfortable sending it to China.

If you'd asked me ten years ago which one made me more uncomfortable. China.

But now I'm not so sure, in fact I'm starting to lean towards the US as being the major risk.

tiahura2 days ago
The chances of my bank account getting hacked due to the PLA backdoor in DeepSeek are higher than due to the CIA backdoor in OpenAI.
fractalf2 days ago
Right now I'm much more worried about sending data to the US and A.. At least there's less chance it will be misused against -me-
swiftcoder2 days ago
> For context, for an agent we're working on, we're using 5-mini, which is $2/1m tokens. This is $0.30/1m tokens. And it's Opus 4.6 level - this can't be real.

It doesn't seem all that out there compared to other Chinese model price/performance. Kimi 2.6 is even cheaper than this, and is pretty close in performance.

rohanm932 days ago
Kimi is indeed somewhat cheap for frontier-level intelligence, but still $4-5 per million tokens. DeepSeek is at least an order of magnitude cheaper.
swiftcoder2 days ago
Oh, right you are. I misread where the decimal place was in the Deepseek pricing. That is incredibly cheap
gordonhart1 day ago
Since it's open weights it'll be available on AWS Bedrock soon(ish), likely at a higher price than the official API but still coming in under those GPT-5-mini prices.
rohanm931 day ago
Interesting, thanks. I'll keep an eye out.
Havoc1 day ago
Tried running it over some code as a secondary review and so far very impressed. Will definitely keep using it for that. Seems to pick up different issues than other models.

With DS tech though the worry is generally more capacity. Haven't seen issues with v4 but in the past their combination of quality and pricing means they get overloaded.

quadruple2 days ago
In their paper, point 5.2.5 talks about their sandboxing platform (DeepSeek Elastic Compute). It seems like they have 4 different execution methods: function calls, container, microVM and fullVM.

This is a pretty interesting thing they've built in my opinion, and not something I'd expect to be buried in the model paper like this. Does anyone have any details about it? Google doesn't seem to find anything of note, and I'd love to dive a bit deeper into DSec.
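For anyone curious what an escalating-isolation setup might look like, here is a speculative sketch: the four tier names come from the paper's list as summarized above, but the selection policy is entirely invented and not from DeepSeek's paper:

```python
# Hypothetical sketch of tiered execution backends. The tiers mirror the
# four methods listed in the paper (function call, container, microVM,
# full VM); the trade-off is isolation strength vs startup cost, and the
# selection logic below is an invented illustration.
from enum import IntEnum

class Tier(IntEnum):
    FUNCTION_CALL = 0  # in-process: cheapest, least isolated
    CONTAINER = 1      # namespace isolation, shared kernel
    MICRO_VM = 2       # own guest kernel, fast boot
    FULL_VM = 3        # strongest isolation, slowest to start

def pick_tier(runs_untrusted_code: bool, needs_own_kernel: bool,
              needs_full_os: bool) -> Tier:
    # trusted tool calls can stay in-process; untrusted code escalates
    if not runs_untrusted_code:
        return Tier.FUNCTION_CALL
    if needs_full_os:
        return Tier.FULL_VM
    if needs_own_kernel:
        return Tier.MICRO_VM
    return Tier.CONTAINER
```

The appeal of having all four in one platform is that an RL or agent workload can pay for exactly as much isolation as each task needs.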

mrinterweb1 day ago
I'm too concerned with data exfiltration to use many AI services unless their terms of service state they will not use your data for training or anything else. Zero retention is what I'm looking for. I care because I frequently work on proprietary code that I do not personally own (as most employed software devs do). So if I am using an AI service with proprietary code, I want assurances that there is no retention and no training happening. From my American perspective Chinese companies don't have the best track record of not training on proprietary information. I guess LLMs in general are trained on a lot of proprietary information. I just don't want to be responsible for unintentionally exfiltrating my employer's proprietary code.
XCSme2 days ago
Something is odd with this model: their blog post shows REALLY good results, but in most third-party benchmarks people are realizing it's not really SOTA, even below Kimi K2.6 and GLM-5/5.1.

In my tests too[0], it doesn't reach the top 10. One issue, which they also mentioned in their post, is that they can't really serve the model well at the moment, so V4-Pro is heavily rate-limited and gives a lot of timeout errors when I try to test it. This shouldn't be an issue in the long run, considering the model is open-source, but it makes it hard to test accurately right now.

[0]: https://aibenchy.com/compare/deepseek-deepseek-v4-flash-high...

Oras2 days ago
I used pro via API (DeepSeek API not OpenRouter) with Claude Code, and the planning, visual solution, understanding was fantastic.

I would say I wouldn't have noticed this wasn't Opus 4.6. What I asked for was a review of a recently implemented feature and how it could be improved. It consumed 3.3 million tokens and created a much better flow.

It hit a bug related to the API when I started the implementation, though, which I suppose is something they didn't catch when making their API compatible with CC.

dannyw2 days ago
Hmm, the Flash performs significantly better than Pro in the benchmark? That's very strange; could rate limiting cause that?
XCSme2 days ago
Yes, Flash doesn't seem to have the same rate limits as Pro.

I expect once the API issues are fixed, for v4-pro to be around the same level as GLM-5.

wolttam2 days ago
Why would your test be including scores of failed responses/runs? That seems confusing.

(I am confused by the results your website is presenting)

coder5432 days ago
Your “benchmark” is invalid. Penalizing the model because the hosting environment is being DDoSed by users a few hours after launch is utter nonsense.

I see that you tried to justify this lower in the thread, but no… it completely invalidates your benchmark. You are not testing the model. You are conflating one specific model host with model performance, and then claiming you are benchmarking the model. All major models are hosted by multiple different services.

In the real world, clients will just retry if there is a server error, and that will not impact response quality at all, and the workflow the model is being used in will not fail. If a workflow is so poorly coded that it doesn’t even have retry logic, then that workflow is doomed no matter which host you use. But again, reliability of the host is separate from the model.

You can make your benchmark valid by having separate leaderboards for model quality and host reliability. I’m not saying to throw the whole thing away. But the current claim is not valid.

And you’re also making an unsourced claim that everyone else has already determined this model sucks? Nah. The first result from Artificial Analysis shows good things: https://x.com/ArtificialAnlys/status/2047547434809880611

But I am still waiting to see the results from the full suite of AA benchmarks.

BoorishBears1 day ago
Their benchmark is full of nonsense like this, and I'm amazed that the fact that most of their interactions on the site promote it hasn't gotten the account banned for spam.

They have Gemini 2.5 Flash ahead of Opus 4.6: https://aibenchy.com/compare/anthropic-claude-opus-4-6-mediu...

Absolutely worthless benchmark but every release has a comment linking to this nonsense.

embedding-shape1 day ago
> V4-Pro is heavily rate-limited and gives a lot of timeout errors when I try to test it. This shouldn't be an issue though, considering the model is open-source

Why does it matter if the model/architecture/weights are open source or not, given it's their proprietary inference hardware they're currently having issues with? Proprietary or not, the same issue would still be there on their platform.

XCSme1 day ago
It depends...

If the conclusion is: "DeepSeek v4 is this good, if you use it from DeepSeek" (which is how most people would use it anyway), then it makes sense to count API errors as failures.

But, if the conclusion must be "The DeepSeek v4 model is this good when self-hosted and ran at ideal conditions", then the model should be tested locally, and skipping all invalid calls.

I am still debating what I should do in this case, because if a leaderboard shows a model as #1 and then people try to use it from the official provider and it fails half the time, that's not a good leaderboard either.

I am considering adding a "reliability" column: retry API errors until the test completes, BUT track how many retries were needed and compute a separate reliability score. But then comes a different problem: reliability varies over time and across providers, so that's tougher to test.
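The retry-plus-reliability bookkeeping described above could be sketched in a few lines. This is an illustrative outline, not aibenchy's actual harness; all names are made up:

```python
import time

def call_with_retries(call, max_retries=5, backoff=1.0):
    """Run one benchmark API call, retrying on transient errors.

    Returns (result, attempts): score quality from `result` and
    reliability from `attempts`, keeping the two metrics separate.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return call(), attempts
        except (TimeoutError, ConnectionError):
            if attempts > max_retries:
                raise
            time.sleep(backoff * 2 ** (attempts - 1))  # exponential backoff

def reliability_score(attempt_counts):
    # Fraction of calls that succeeded on the first attempt.
    return sum(1 for a in attempt_counts if a == 1) / len(attempt_counts)
```

Quality would then be computed only over completed responses, while the reliability column reflects how hard the endpoint had to be pushed to complete them.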

embedding-shape1 day ago
Sounds like you're mixing two very different measurements into the same category. One is the model itself, evaluated under reference conditions, where there is no such thing as an "API failure". The other is the reliability and uptime of a remote API endpoint for LLM inference.

If you want to measure their API, do so, but don't place it under the same category as testing the model itself, as they're two different metrics.

simonw2 days ago
I like the pelican I got out of deepseek-v4-flash more than the one I got from deepseek-v4-pro.

https://simonwillison.net/2026/Apr/24/deepseek-v4/

Both generated using OpenRouter.

For comparison, here's what I got from DeepSeek 3.2 back in December: https://simonwillison.net/2025/Dec/1/deepseek-v32/

And DeepSeek 3.1 in August: https://simonwillison.net/2025/Aug/22/deepseek-31/

And DeepSeek v3-0324 in March last year: https://simonwillison.net/2025/Mar/24/deepseek/

JSR_FDED2 days ago
No way. The Pro pelican is fatter, has a customized front fork, and the sun is shining! He’s definitely living the best life.
chronogram2 days ago
The pro pelican is a work of art! It goes to dimensions no other LLM has gone before.
w4yai2 days ago
yeah. look at these 4 feathers (?) on his bum too.
oliver2362 days ago
a lot of dumplings
nickvec2 days ago
The Flash one is pretty impressive. Might be my favorite so far in the pelican-riding-a-bicycle series
torginus2 days ago
This is just a random thought, but have you tried doing an 'agentic' pelican?

As in have the model consider its generated SVG, and gradually refine it, using its knowledge of the relative positions and proportions of the shapes generated, and have it spin for a while, and hopefully the end result will be better than just oneshotting it.

Or maybe going even one step further - most modern models have tool use and image recognition capabilities - what if you have it generate an SVG (or parts/layers of it, as per the model's discretion) and feed it back to itself via image recognition, and then improve on the result.

I think it'd be interesting to see, as for a lot of models, one-shot coding capability is not necessarily correlated with in-harness ability, and the latter is what really matters.
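The refine-by-looking loop proposed above could be sketched like this. The client object and its two methods are hypothetical stand-ins for whatever SDK is used, and the renderer is a placeholder (in practice something like cairosvg or resvg):

```python
def render_svg_to_png(svg: str) -> bytes:
    """Placeholder renderer; swap in a real SVG rasterizer."""
    return svg.encode()

def refine_svg(client, prompt: str, rounds: int = 3) -> str:
    """One-shot an SVG, then repeatedly render it, have the model
    critique the rendered image, and regenerate from the critique."""
    svg = client.generate_text(f"Generate an SVG of: {prompt}")
    for _ in range(rounds):
        png = render_svg_to_png(svg)
        critique = client.generate_with_image(
            "Critique this render and list concrete fixes.", png)
        svg = client.generate_text(
            f"Prompt: {prompt}\nCurrent SVG:\n{svg}\n"
            f"Critique:\n{critique}\nReturn only an improved SVG.")
    return svg
```

Each round costs a full generation plus a vision call, which is part of why the one-shot version remains the popular benchmark.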

simonw2 days ago
I tried that for the GPT-5 launch - a self-improving loop that renders the SVG, looks at it and tries again - and the results were surprisingly disappointing.

I should try it again with the more recent models.

torginus2 days ago
I see, thanks. I guess most current models are not yet trained for this loop.

Could you please try with Opus 4.7? I think there's a chance of it doing well, considering the design/vision focus.

murkt2 days ago
DeepSeek pelicans are the angriest pelicans I’ve seen so far.
kristopolous2 days ago
they're just late for work.
muyuu2 days ago
They're stressed pelicans from Hangzhou.
lazycatjumping2 days ago
996 Pelican, lol
mikae12 days ago
Being a bicycle geometry nerd I always look at the bicycle first.

Let me tell you how much the Pro one sucks... It looks like a failed Pedersen[1]. The rear wheel intersects with the bottom bracket, so it wouldn't even roll. Or rather, this bike couldn't exist.

The Flash one looks surprisingly correct, with some wild fork offset and the slackest of seat tubes. It's got some lowrider[2] aspirations with the small wheels, but with longer, Rivendellish[3], chainstays. The seat post has a different angle than the seat tube, so good luck lowering that.

[1] https://en.wikipedia.org/wiki/Pedersen_bicycle

[2] https://en.wikipedia.org/wiki/Lowrider_bicycle

[3] https://www.rivbike.com/

simonw2 days ago
This is an excellent comment. Thanks for this - I've only ever thought about whether the frame is the right shape, I never thought about how different illustrations might map to different bicycle categories.
mikae12 days ago
Some other reactions:

I wonder which model will try some of the more common spoke lacing patterns. Right now there seems to be a preference for radial lacing, which is not super common (but simple to draw). The Flash and Pro ones use 16-spoke rims, which actually exist[1] but are not super common.

The Pro model fails badly at the spokes. Heck, the spokes sit on the outside of the drive side of the rim and tire. Have a nice ride riding on the spokes (instead of the tire) welded to the side of your rim.

Both bikes have the drive side on the left, which is very very uncommon. That can't exist in the training data.

[1] https://cicli-berlinetta.com/product/campagnolo-shamal-16-sp...

jojobas2 days ago
The Pedersen looks like someone failed the "draw a bicycle" test and decided to adjust the universe.
catelm2 days ago
I think the pelican on a bike is known widely enough that it ceases to be useful as a benchmark. There is even a pelican briefly appearing in the promo video of GPT-5, if I'm not mistaken: https://openai.com/gpt-5/. So the companies are apparently aware of it.
simonw2 days ago
It was a bigger deal in the Gemini 3.1 launch: https://x.com/JeffDean/status/2024525132266688757
nsoonhui2 days ago
To me this is the perfect proof that

1) LLMs are not AGI. Surely if they were, Pro would do better than Flash?

2) And because of the above, the pelican example is most likely already being benchmaxxed.

brutal_chaos_2 days ago
What was your prompt for the image? Apologies if this should be obvious.
shawn_w2 days ago
>Generate an SVG of a pelican riding a bicycle

at the top of the linked pages.

chvid2 days ago
Is it then Deepseek hosted by Deepseek?

How much does the drawing change if you ask it again?

ycui19862 days ago
I really like the pro version. The pelican is so cute.
theanonymousone2 days ago
Where is the GPT 5.5 Pelican?
culopatin2 days ago
In the 5.5 topic
lobochrome2 days ago
Why they so angry?
EnPissant2 days ago
This should not be the top comment on every model release post. It's getting tiring.
blitzar2 days ago
This should be the bottom comment on the pelican comment on every model release post.
EnPissant2 days ago
Clearly the top comment should be "Imagine a beowulf cluster of Deepseek v4!"
aquir2 days ago
It is great! I asked the question I always ask of new models ("what would Iain M. Banks think about the current state of AI") and it gave me a brilliant answer! Funnily enough, the answer contained multiple criticisms of its own creators ("Chinese state entities", "Social Credit System").
cmitsakis2 days ago
I just did some quick testing on my own benchmark that tests LLMs as customer support chatbots, and found out that deepseek-v4-flash (scored 90.2%) was better than qwen3.5-27b (89%) and qwen3.5-35b-a3b (89.1%) and roughly equal to gemini-3-flash-preview (90.5%), but deepseek-v4-flash had the lowest cost of all of them by far. Half the cost of gemini-3-flash and an order of magnitude less cost than the qwen models.

Have you noticed the deepseek-v4-pro performing worse than deepseek-v4-flash? It performed even worse than qwen3.5-27b. I found it surprising and I'm wondering if there is a bug on my software because I had to implement sending the `reasoning_content` otherwise the API failed with BadRequestError.
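For anyone hitting the same BadRequestError: the workaround described above amounts to preserving the `reasoning_content` field on assistant turns when replaying the conversation history. This is an illustrative sketch based only on the comment; exact field requirements vary by provider:

```python
def replayable(msg: dict) -> dict:
    """Build the message to send back in the conversation history,
    keeping reasoning_content on assistant turns (some endpoints
    reject histories that drop it, per the comment above)."""
    out = {"role": msg["role"], "content": msg["content"]}
    if msg["role"] == "assistant" and "reasoning_content" in msg:
        out["reasoning_content"] = msg["reasoning_content"]
    return out

# Hypothetical multi-turn support-chat history:
history = [
    {"role": "user", "content": "Where is my order?"},
    {"role": "assistant", "content": "Could you share the order ID?",
     "reasoning_content": "Need the ID before looking anything up."},
]
payload = [replayable(m) for m in history]
```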

littlestymaar2 days ago
How can a medium-sized model like Deepseek-V4-Flash be cheaper than a much smaller model like Qwen3.5-35B-A3B?

It's five times bigger in both total and active parameters!

Ancapistani1 day ago
I don’t know for sure, but I believe those larger models must be run on nVidia hardware (CUDA), while Deepseek-V4-* can be run on Huawei chips. My assumption is that there is less demand pressure on non-nVidia chips.
sixhobbits2 days ago
I know people don't like Twitter links here but the main link just goes to their main docs site generic 'getting started' page.

The website now has a link to the announcement on Twitter here https://x.com/deepseek_ai/status/2047516922263285776

Copying text of that below

DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length.

DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models.

DeepSeek-V4-Flash: 284B total / 13B active params. Your fast, efficient, and economical choice.

Try it now at http://chat.deepseek.com via Expert Mode / Instant Mode. API is updated & available today!

Tech Report: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main...

Open Weights: https://huggingface.co/collections/deepseek-ai/deepseek-v4

alpineman2 days ago
Just use xcancel by adding 'cancel' to the link

https://xcancel.com/deepseek_ai/status/2047516922263285776

gardnr2 days ago
865 GB: I am going to need a bigger GPU.
npodbielski2 days ago
Or several bigger GPUs! :)
WhereIsTheTruth2 days ago
Interesting note:

"Due to constraints in high-end compute capacity, the current service capacity for Pro is very limited. After the 950 supernodes are launched at scale in the second half of this year, the price of Pro is expected to be reduced significantly."

So it's going to be even cheaper

Aliabid942 days ago
MMLU-Pro:

Gemini-3.1-Pro at 91.0

Opus-4.6 at 89.1

GPT-5.4, Kimi2.6, and DS-V4-Pro tied at 87.5

Pretty impressive

ant6n2 days ago
Funny how Gemini is theoretically the best -- but in practice all the bugs in the interface mean I don't want to use it anymore. The worst is that it forgets context (and lies about it), and it's very unreliable at reading PDFs (and lies about that too). There's also no branching, so once the context is lost/polluted, you have to start projects over and build up the context from scratch again.
spaceman_20202 days ago
The sheer number of bugs and lack of meaningful improvements in Google products is a clear counterargument to the AI bull thesis

If AI was so good at coding, why can’t it actually make a usable Gemini/AI Studio app?

barnabee2 days ago
I think Google might just be institutionally incapable of making good UX
hodgehog112 days ago
Most of these tests are one-prompt in nature. I've also noticed issues with the PDF reader in Gemini which was very frustrating, although it is significantly better now than it was even two weeks ago. On the contrary, now GPT-5 seems to be giving me issues.

In my experience, Gemini is the most insightful model for hard problems (particularly math problems that I work on).

Alifatisk1 day ago
You know, with a bit of prompting, you can instruct Gemini to output the state of the conversation into a prompt that you can enter in a new chat and continue where you left off. But now with a fresh context window.
lazycatjumping2 days ago
I gave up on Gemini 3.1 Pro in VSCode after 2 hours. They fully refunded me.
esperent2 days ago
Yeah if I could use Gemini with pi.dev that would be my choice. But Gemini CLI is just so, so bad.
Imanari2 days ago
Just tested it via OpenRouter in the Pi coding agent and it regularly fails to use the read and write tools correctly; very disappointing. Anyone know a fix besides prompting "always use the provided tools instead of writing your own call"?
rane2 days ago
tariky1 day ago
If you have access to any other model, it can create a pi extension that fixes the problem. At least that worked for me.
Imanari1 day ago
Like a special parser? Would you mind elaborating?
tariky1 day ago
It intercepts JSON commands and turns them into tool calls.
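A toy version of such an interceptor (a hypothetical shape, not tariky's actual extension) would scan the model's free-form output for JSON objects that look like tool calls and lift them into structured calls:

```python
import json

def extract_tool_calls(text: str):
    """Find JSON objects with a 'tool' or 'name' key in free-form
    model output and return them as structured tool calls."""
    decoder = json.JSONDecoder()
    calls = []
    i = text.find("{")
    while i != -1:
        try:
            obj, end = decoder.raw_decode(text, i)
        except json.JSONDecodeError:
            i = text.find("{", i + 1)  # not JSON here; keep scanning
            continue
        if isinstance(obj, dict):
            name = obj.get("tool") or obj.get("name")
            if name:
                calls.append({"name": name,
                              "arguments": obj.get("arguments", {})})
        i = text.find("{", end)
    return calls
```

A real extension would then dispatch each extracted call to the harness's registered tools instead of letting the JSON land in the chat as plain text.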
abstracthinking2 days ago
They have just released it, give it some time, they probably haven't pretested it with Pi
Imanari2 days ago
How can they fix it after the release? They would have to retrain/finetune it further, no?
zargon2 days ago
It's only in preview right now. And anyway, yes, models regularly get updated training.

But in this case, it's more likely just to be a tooling issue.

mark33vh2 days ago
Yeah hope they fix this for PI
sergiopreira2 days ago
DeepSeek is commoditizing frontier capability... Opus 4.6-level benchmarks at a fraction of the cost also changes who can access these tools.

Stuff that was prohibitive six months ago is now up for grabs. We keep working at the infra level now, switching models whenever we run out of credits or want a different result. The question is how we build context and architecture and ensure the agent is effective and efficient... wouldn't it be good if we simply used less energy to make these AI calls?

XCSmeabout 20 hours ago
Their API issues seem to have been resolved; it now performs[0] as expected, around GLM 5 level.

[0]: https://aibenchy.com/compare/deepseek-deepseek-v4-flash-high...

mchusma1 day ago
At first I was more excited about the Flash model, but I'm now more excited about the Pro model in many ways. I feel like the Pro model, run through Unsloth and with some fine-tuning, is gonna be enough for many vertical SaaS applications.

Where previously I was wary of under-providing intelligence, I'm now excited about the idea of being able to give these pretty large, intelligent models to my application. Sub-agents we fine-tune should reasonably be expected to perform as well as Opus on a specific subtask, of which my applications have many.

In other words, we can run a general-purpose intelligent model, Sonnet or Opus, orchestrating a fleet of, let's say, 30 to 50 of these fine-tuned sub-agents. By doing that, I can get very low pricing versus what it would have cost to use Opus or Sonnet for everything.

embedding-shape1 day ago
> Sub-agents we fine-tune should reasonably be expected to perform as well as Opus on a specific subtask, of which my applications have many [...] we can run a general-purpose intelligent model, Sonnet or Opus, orchestrating a fleet of, let's say, 30 to 50 of these fine-tuned sub-agents

I've heard so many people saying this for the last year, and even tried doing it myself too, and never seen a successful application of it, nor succeeded myself either with SOTA models that are smart but slow or local models that are dumb but fast (even with beefy hardware).

What makes you believe this is possible in the first place? Every "swarm of agents" implementation I've seen only been able to produce lowest quality of code, most of the time vastly bloated, but surely you must have seen something working in practice that you could share with the rest of us?

dandaka1 day ago
I guess it depends on a task. Opus is already spawning Sonnet/Haiku for simple tasks with a good success rate.
embedding-shape1 day ago
I think "agent spawns weaker agent to do safe edit sometimes" is vastly different than the imagined "general-purpose intelligent model orchestrating a fleet of 50 sub-agents".
sergiotapia2 days ago
Using it with opencode sometimes it generates commands like:

    bash({"command":"gh pr create --title "Improve Calendar module docs and clean up idiomatic Elixir" --body "$(cat <<'EOF'
    Problem
    The Calendar modu...
It's like it generates the output but doesn't actually run the bash command, so the PR never gets created. I wonder if it's a model thing or an opencode thing.
CJefferson2 days ago
What's the current best framework to have a 'claude code' like experience with Deepseek (or in general, an open-source model), if I wanted to play?
deaux2 days ago
TranquilMarmot2 days ago
whoopdeepoo2 days ago
You can use deepseek with Claude code
esperent2 days ago
You can, but does it work well? I assume CC has all kinds of Claude specific prompts in it, wouldn't you be better with a harness designed to be model agnostic like pi.dev or OpenCode?
rane2 days ago
I've been using Kimi K2.6, GPT-5.4, and now DeepSeek V4 (though not extensively yet) in Claude Code, and I can say it works much better than you'd expect. It looks like the system prompt and tools are pulling a lot of the weight. Maybe current models are good enough that you don't need them to be trained for a specific harness.
Alifatisk2 days ago
You can use CC with other models, you aren’t forced to use Claude model.
0x1428572 days ago
claude-code-cli/opencode/codex
wolttam1 day ago
I'm impressed! I've been giving the various open-weight models a particularly gnarly (for my brain, at least) refactoring/cleanup task in my DIY coding harness[0]: essentially, de-spaghettify the main chat view's update logic, which had grown organically since early 2024.

Kimi 2.6 went hard and left me with a buggy mess. GLM 5.1 hedged and made a 25 line change (but it was an improvement). DS V4 went hard, fixed its issues along the way, and left me with a significantly nicer codebase! (...that I will now be spending some time testing before releasing to the project)

[0]: lmcli (simple, Go, nice UX, MIT licensed, works well with DS V4) https://codeberg.org/mlow/lmcli

luyu_wu2 days ago
For those who didn't check the page yet, it just links to the API docs being updated with the upcoming models, not the actual model release.
talim2 days ago
cmrdporcupine2 days ago
My submission here https://news.ycombinator.com/item?id=47885014 done at the same time was to the weights.

dang, probably the two should be merged and that be the link

culi2 days ago
there's no pinging. Someone's gotta email dang
cmrdporcupine2 days ago
beh. instead of merging they just marked mine as dupe, even tho it was submitted at same time and had (for a long time) about the same votes and a better target page
storus2 days ago
Oh well, I should have bought 2x 512GB RAM MacStudios, not just one :(
muyuu2 days ago
Unironically curious about the performance of this model on unified VRAM machines.
storus1 day ago
Probably usable for chat sessions, unusably slow for agentic coding.
Jgoauh2 days ago
So there are 4 versions (2 models with 2 modes): Flash non-thinking, Flash thinking, Pro non-thinking, Pro thinking.

Are there comparisons between Pro non-thinking and Flash thinking? I don't really get the use case for Flash thinking and Pro non-thinking.

xnx2 days ago
Such a different time now than early 2025, when people thought DeepSeek was going to kill the market for Nvidia.
antirez2 days ago
Actually, the fact that inference of a SOTA model is completely Nvidia-free is the biggest attack on Nvidia carried out so far. Even American frontier AI labs may start to buy Chinese hardware if they need to continue the AI race; they can't keep paying so much money for GPUs, especially once Huawei's training versions of their GPUs ship.
putlake1 day ago
By "completely Nvidia-free" do you mean Nvidia wasn't used for training nor inference? Because if it's only inference, we know that Opus already can run on TPUs. Not to mention Gemini.
antirez1 day ago
Yep but they don't run on Chinese hardware that is going to be available to everybody and will cost a lot less than NVIDIA stuff. So now you have a full non-US pipeline for AI, and soon they'll have the training GPUs as well.
eunos2 days ago
That's like saying Raytheon would outsource drone building to the Shahed makers (don't know who exactly).

Not gonna happen

Ifkaluva2 days ago
They might still kill the market for NVIDIA, if future releases prioritize Huawei chips
jfxia2 days ago
Is V4 still not a multi-modal model?
vitorgrs2 days ago
Not yet... Which is a shame.
gzer01 day ago
Congratulations on the release to the DeepSeek team. An interesting note on the use of CSA and HCA: CSA provides higher-resolution, query-selected memory over 4-token compressed blocks, while HCA provides very low-resolution dense global memory over 128-token blocks. That could be a plausible reason to interleave them: CSA alone risks missing information if the indexer fails, while HCA alone is too lossy for precise retrieval. Still reading through the release, as usual, always appreciate the attention to detail in the technical papers.
lifeisstillgood2 days ago
On a separate note, I am guessing that all the new models have been announced within a few days of each other because the time to train a model is the same for each AI company.

Which strikes me as odd - I would have assumed someone had an edge of at least 10% extra GPUs.

namenotrequired2 days ago
But why would they all start at the same time?
lifeisstillgood2 days ago
Because they all (if my memory serves) did this release at the same time thing last time. I have not looked into it but I am guessing that not letting one model pull ahead for a month means everyone keeps up - which implies the “stickiness” of any one model is a lot lower than we think
jdeng2 days ago
Excited that the long awaited v4 is finally out. But feel sad that it's not multimodal native.
Alifatisk1 day ago
Was that expected?
impossiblefork2 days ago
After testing this on understanding complex stories, text comprehension is definitely comparable to or better than Sonnet's, and definitely better than Microsoft's free stuff. Opus is of course very impressive, especially with how it's set up with recursive calls that let it produce rather complete things as if by magic, but the underlying model probably isn't incredibly much better than this.
clark10132 days ago
Looking forward to DeepSeek Coding Plan
Alifatisk1 day ago
If they offer something close to Z.ai's coding plan during Christmas, I'll take it!
m_abdelfattah2 days ago
I came here to say the same :) !
ls6122 days ago
How long does it usually take for folks to make smaller distills of these models? I really want to see how this will do when brought down to a size that will run on a Macbook.
simonw2 days ago
Unsloth often turn them around within a few hours, they might have gone to bed already though!

Keep an eye on https://huggingface.co/unsloth/models

Update ten minutes later: https://huggingface.co/unsloth/DeepSeek-V4-Pro just appeared but doesn't have files in yet, so they are clearly awake and pushing updates.

mohsen12 days ago
EnPissant2 days ago
Those are quants, not distills.
inventor77772 days ago
Weren't there some frameworks recently released to allow Macs to stream weights from fast SSDs and thus fit way more parameters than what would normally fit in RAM?

I have never tried one yet but I am considering trying that for a medium sized model.

simonw2 days ago
I've been calling that the "streaming experts" trick; the key idea is to take advantage of Mixture of Experts models, where only a subset of the weights is used for each round of calculations, then load those weights from SSD into RAM for each round.

As I understand it, if DeepSeek v4 Pro is 1.6T total with 49B active, you'd need just the 49B active parameters in memory: ~100GB at 16-bit or ~50GB at 8-bit quantized.

v4 Flash is 284B, 13B active so might even fit in <32GB.
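The back-of-envelope figures above are just parameters × bytes per parameter; a quick sanity check (active counts from the announcement, precision assumed here, and the replies below note V4 is natively mixed FP4/FP8, so real numbers are lower):

```python
def active_weight_gb(active_params_billions: float, bits: int) -> float:
    """GB of memory to hold just the active parameters."""
    return active_params_billions * bits / 8  # 1B params at 1 byte = 1 GB

# V4 Pro: 49B active
print(active_weight_gb(49, 16))  # 98.0 -> "~100GB at 16-bit"
print(active_weight_gb(49, 8))   # 49.0 -> "~50GB at 8-bit"
# V4 Flash: 13B active
print(active_weight_gb(13, 8))   # 13.0 -> plausibly <32GB with overhead
```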

zozbot2342 days ago
The "active" count is not very meaningful except as a broad measure of sparsity, since the experts in MoE models are chosen per layer. Once you're streaming experts from disk, there's nothing that inherently requires having 49B parameters in memory at once. Of course, the less caching memory does, the higher the performance overhead of fetching from disk.
EnPissant2 days ago
Streaming weights from RAM to GPU for prefill makes sense due to batching and pcie5 x16 is fast enough to make it worthwhile.

Streaming weights from RAM to GPU for decode makes no sense at all because batching requires multiple parallel streams.

Streaming weights from SSD _never_ makes sense because the delta between SSD and RAM is too large. There is no situation where you would not be able to fit a model in RAM and also have useful speeds from SSD.

zargon2 days ago
> ~100GB at 16 bit or ~50GB at 8bit quantized.

V4 is natively mixed FP4 and FP8, so significantly less than that. 50 GB max unquantized.

inventor77772 days ago
Ahh, that actually makes more sense now. (As you can tell, I just skimmed through the READMEs and starred "for later".)

My Mac can fit almost 70B (Q3_K_M) in memory at once, so I really need to try this out soon at maybe Q5-ish.

zozbot2342 days ago
These are more like experiments than a polished release as of yet. And the reduction in throughput is high compared to having the weights in RAM at all times, since you're bottlenecked by the SSD which even at its fastest is much slower than RAM.
the_sleaze_2 days ago
Do you have the links for those? Very interested
inventor77772 days ago
Sure!

Note: these were just two that I starred when I saw them posted here. I have not looked seriously at them yet.

https://github.com/danveloper/flash-moe

https://github.com/t8/hypura

sibellavia2 days ago
A few hours after GPT5.5 is wild. Can’t wait to try it.
yanhangyhy2 days ago
Somehow I cannot open the link, but their Chinese release article ends with a quote from Xunzi (https://en.wikipedia.org/wiki/Xunzi_(philosopher)):

"Not seduced by praise, not terrified by slander; following the Way in one's conduct, and rectifying oneself with dignity." (不诱于誉,不恐于诽,率道而行,端然正己)

(It is mainly used to express the way a Confucian gentleman conducts himself in the world. It reminds me of an interview I once watched with an American politician, who said that, at its core, China is still governed through a Confucian meritocratic elite system. It seems some things have never really changed.

In some respects, Liang Wenfeng can be compared to Linux. The political parallel here is that the advantages of rational authoritarianism are often overlooked because of the constraints imposed by modern democratic systems. )

muyuu2 days ago
Sounds a lot like taoism, but i guess there's overlap
armanj1 day ago
I have a few lightweight apps using deepseek api, and funny how the initial credit I topped up for using r1 is still left. Nothing makes the user happier than getting more for less. cc: anthropics with its fancy token-wasting claude code "features"
aeagenticabout 14 hours ago
Unlike OpenAI, where the credits just expire.
thefounder2 days ago
They still don’t support json schema or batch api. It’s like deepseek does not want to make money
kiproping2 days ago
What do you currently use for JSON and batch? I was doing some analysis and my results show that gpt-oss-120b (non-batch via OpenRouter) is the best for my use case right now, better than the gemini-flash models (batch on Google). How has your experience been?
aliljet2 days ago
How can you reasonably try to get near frontier (even at all tps) on hardware you own? Maybe under 5k in cost?
mordae2 days ago
Look at GB/s.

Strix Halo has 256 GB/s of bandwidth for $2500. The Flash model has 13 GB of active weights per token.

256 / 13 ≈ 19.7 tokens per second

Except you cannot fit it into the maximum 128 GB of RAM that Strix Halo supports. So move on.

Another option is Threadripper. That's 8 memory channels. Using older DDR4-3200 you get roughly 200 GB/s. For $2000.

200 / 13 = 15.4 tokens per second

But, a chunk of per-token weights is actually always the same and not MoE, so you would offload that to a GPU and get a decent speedup. Say 25 tokens per second total.

Then likely some expensive Mac. No idea.

Eventually you arrive at a mining rig chassis with a beefy board and multiple GPUs. That has the benefit of pipelining. You run part of the model on one GPU and move on, so another batch can start on the first one. Low (say 30-100) tps individually, but a lot more in parallel. Best get it with other people.
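The bandwidth arithmetic in this comment generalizes to a one-liner: decode speed is roughly memory bandwidth divided by the bytes of active weights read per token. It's an upper bound that ignores KV cache traffic and compute:

```python
def decode_tps_bound(bandwidth_gb_s: float, active_gb: float) -> float:
    """Upper bound on decode tokens/sec: each generated token must
    stream all active weights through memory once."""
    return bandwidth_gb_s / active_gb

# 13 GB of active weights, hardware figures from the comment above:
print(round(decode_tps_bound(256, 13), 1))  # 19.7 (Strix Halo)
print(round(decode_tps_bound(200, 13), 1))  # 15.4 (8-channel DDR4-3200)
```

The same formula explains the GPU-offload speedup mentioned above: moving the always-active (non-MoE) weights to VRAM shrinks `active_gb` on the slow memory path.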

revolvingthrow2 days ago
For flash? 4 bit quant, 2x 96GB gpu (fast and expensive) or 1x 96GB gpu + 128GB ram (still expensive but probably usable, if you’re patient).

A mac with 256 GB memory would run it but be very slow, and so would be a 256GB ram + cheapo GPU desktop, unless you leave it running overnight.

The big model? Forget it, not this decade. You can theoretically load from SSD but waiting for the reply will be a religious experience.

Realistically the biggest models you can run on local-as-in-worth-buying-as-a-person hardware are between 120B and 200B, depending on how far you’re willing to go on quantization. Even this is fairly expensive, and that’s before RAM went to the moon.

zargon2 days ago
Flash is less than 160 GB. No need to quantize to fit in 2x 96 GB. Not sure how much context fits in 30 GB, but it should be a good amount.
redrove2 days ago
It seems to be 160 GB at mixed FP4+FP8 precision, FYI. Full FP8 is 250 GB+; (B)F16 would be around double that, I assume.
awakeasleep2 days ago
The same way you fit a bucket wheel excavator in your garage
floam2 days ago
Very carefully
zozbot2342 days ago
Run on an old HEDT platform with a lot of parallel attached storage (probably PCIe 4) and fetch weights from SSD. You'd ultimately be limited by the latency of these per-layer fetches, since MoE weights are small. You could reduce the latencies further by buying cheap Optane memory on the second-hand market.
datadrivenangel2 days ago
A loaded macbook pro can get you to the frontier from 24 months ago at ~10-40tok/s, which is plenty fast enough for regular chatting.
5424582 days ago
The low end could be something like an eBay-sourced server with a truckload of DDR3 ram doing all-cpu inference - secondhand server models with a terabyte of ram can be had for about 1.5K. The TPS will be absolute garbage and it will sound like a jet engine, but it will nominally run.

The flash version here is 284B A13B, so it might perform OK with a fairly small amount of VRAM for the active params and all regular ram for the other params, but I’d have to see benchmarks. If it turns out that works alright, an eBay server plus a 3090 might be the bang-for-buck champ for about $2.5K (assuming you’re starting from zero).

jdoe1337halo2 days ago
More like 500k
namegulf2 days ago
Is there a Quantized version of this?
mordae2 days ago
They have released mixed fp8/fp4 for efficiency. It's still hundreds of gigabytes, though. Give up on local for these.
namegulf1 day ago
That's right, you need a lot of GPUs + memory. Has anybody experimented with a Mac Studio M3 Ultra for this?
yanis_t2 days ago
Is there a harness as good as Claude Code that can be used with open-weight models?
barnabee2 days ago
I prefer OpenCode over Claude Code, and it works with basically everything. Give it a try. ymmv
npodbielski2 days ago
Never used Claude myself, but there are agents that can use local models, e.g. JetBrains Junie and Mistral Vibe.
sixhobbits2 days ago
Try pi coding agent!
Numerlor2 days ago
I've liked Hermes agent, but never used Claude code so don't know how it compares
laurentiurad2 days ago
Try Opencode or Comrade. Both OSS and working great with OSS models too.
Grp12 days ago
DeepSeek’s docs say V4 has a 1M context length. Is that actually usable in practice, or just the model/API limit?

Codex shows ~258k for me and Claude Code often shows ~200k, so I’m curious how DeepSeek is exposing such a large window.

lucrbvi2 days ago
They have added a lot of optimization focused on the KV cache, so they can offer a much larger window without eating all the VRAM.

The 1M window might be usable, but it will probably underperform a smaller window, of course.
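For intuition on why KV-cache compression matters at 1M tokens, here's a generic cache-size estimate for plain grouped-query attention; the layer/head numbers below are made up for illustration and are not DeepSeek-V4's actual configuration:

```python
# KV cache size = 2 (K and V) * layers * kv_heads * head_dim * seq_len * dtype bytes

def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """Bytes of KV cache for one sequence under vanilla (uncompressed) attention."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

# Hypothetical config: 60 layers, 8 KV heads, head_dim 128, fp16 cache
gb = kv_cache_bytes(60, 8, 128, 1_000_000) / 1e9
print(f"~{gb:.0f} GB of KV cache for a single 1M-token sequence")
```

Even a modest hypothetical config lands in the hundreds of gigabytes per sequence, which is why the paper's claimed 10% KV-cache footprint versus V3.2 is what makes routine 1M-token contexts plausible.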

nba456_2 days ago
Wow, never seen a post with so many comments posted overnight like this.
mentos2 days ago
Me neither. Makes me wonder whether all of these comments are botted.
kilroy1232 days ago
Yes and the vibe seems off to me.
GuardCalf2 days ago
I like this. The more competitors there are, the more we the users benefit.
Aldipower1 day ago
Where or how can I use this model with a DPA and better privacy terms? Are there EU friendly hosters already? Would love to use it.
dryarzeg1 day ago
> better privacy terms

DeepInfra, as far as I'm aware, doesn't log your prompts and doesn't retain them in most cases, except for "debugging purposes". As per their privacy policy[1]: "We understand that the inputs you provide to our API and the outputs it generates may contain your Personal Information. We will not store, sell, or train using this data unless we have your explicit consent. We might sometimes store, for a limited period of time, the inputs and outputs to API calls for debugging purposes."

They're not EU-based, though. And I'm not sure how "private" their inference actually is. The throughput is also not the best everywhere; sometimes it can be really slow (although right now both DeepSeek-V4 models seem to be doing fine). However, their pricing is good, probably one of the best on the market.

I'm not affiliated with them in any way, but when I want to test something that is too big for my local hardware (I'm not a power user of LLMs, chatbots, or agents; I do it just out of curiosity), DeepInfra is usually my go-to provider.

[1] https://deepinfra.com/privacy

bandrami2 days ago
I don't mind that High Flyer completely ripped off Anthropic to do this so much as I mind that they very obviously waited long enough for the GAB to add several dozen xz-level easter eggs to it.
cedws2 days ago
He who is a ripper off-er cannot be ripped off.
KaoruAoiShiho2 days ago
SOTA on MRCR (or it would've been a few hours earlier... beaten by 5.5). I've long thought of this as the most important non-agentic benchmark, so this is especially impressive. It beats Opus 4.7 here.
Oxlamarr2 days ago
The speed of progress here is wild. It feels like the hard part is shifting from having access to a strong model to actually building trustworthy systems around it.
biglyburrito1 day ago
I wonder how long it will take China to respond to the release of Mythos and what their response will look like.
steveharing11 day ago
This one seems really impressive according to the benchmark scores, but for me GLM 5.1 is still on top of every other open model so far.
carrja991 day ago
Heh, my eighty-year-old neighbor uses DeepSeek. Every time we catch up she tells me about all the new uses she has for it.
ksymph1 day ago
Same with my parents! It's the only one they use. I think the simple and stable web interface goes a long way; the ChatGPT site (for example) bombards you with popups, new buttons, and opaque daily limits, while DeepSeek's is pretty consistent and straightforward.

DeepSeek also tends to follow prompts more closely IME, plus the thinking is shown, so I think it's able to register as a 'tool' more easily for the non-tech-inclined for whom that appeals.

dannyw2 days ago
Are there better providers for inferencing this right now? I know it's launch day, but openrouter showing 30tps isn't looking great.
DennisP2 days ago
No CUDA, 1.6T parameters but with 49B active...does that mean you can run it efficiently on a 64GB macbook?
segmondy2 days ago
No, you need as much RAM as the total model. But it means you can load the most important tensors onto a smaller GPU. So you could run it on a PC with, say, two 32 GB RTX 5090s and 1 TB+ of system RAM.
leodavi2 days ago
Probably not. The active parameter set may change from token to token, based on my understanding of MoE, so in the worst case (unlikely in a real scenario, but it frames the problem) you'd be streaming 49B parameters from SSD for every output token...
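A pessimistic back-of-envelope for that worst case (fp8 weights assumed; real MoE routing reuses experts across nearby tokens, so actual behavior would be considerably better):

```python
# Worst-case decode latency if all activated weights had to be streamed
# from SSD for each token: bytes to read divided by drive throughput.

def seconds_per_token(active_params_billion: float, ssd_gb_per_s: float) -> float:
    # fp8 assumed: 1 byte per parameter, so N billion params ~= N GB to read
    return active_params_billion / ssd_gb_per_s

print(f"~{seconds_per_token(49, 7):.0f} s/token on a 7 GB/s NVMe drive")
```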
reenorap2 days ago
Which version fits in a Mac Studio M3 Ultra 512 GB?
simonw2 days ago
The Flash one should - it's 160GB on Hugging Face: https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash/tree/ma...
ycui19862 days ago
So, dual RTX PRO 6000
swrrt2 days ago
Any visualised benchmark/scoreboard for comparing the latest models? DeepSeek V4 and GPT-5.5 seem to be groundbreaking.
apexalpha2 days ago
This Flash model might be affordable for OpenClaw. I run it on my Mac with 48 GB RAM now, but it's slowish.
flyingsquirrel_1 day ago
Which one is better: DeepSeek V4, GLM 5.1, Opus 4.7, or GPT 5.5?
mariopt2 days ago
Does DeepSeek have a coding plan?
jeffzys82 days ago
no
coolThingsFirst2 days ago
I got an API key without entering credit card details. I didn't know they had a free plan.
JonChesterfield2 days ago
Anyone worked out how much hardware one needs to self host this one?
nstj1 day ago
Let's also not forget SoTA models stole from us.
cztomsik2 days ago
So is this the first AI lab using Muon for their frontier model?
hodgehog112 days ago
No, Muon was developed by Moonshot; they've been using it in their Kimi models since Kimi K2 in 2025.
cztomsik2 days ago
Keller Jordan worked at Moonshot? Or am I missing something? I thought he was the original author. https://x.com/kellerjordan0/status/1842300916864844014
hodgehog111 day ago
I was wondering whether someone would bring this up :-).

Yes, you're absolutely right, and no, Keller Jordan does not work for Moonshot. He is the original author of the algorithm, so credit goes to him.

There's a lot of legwork to go from prototyping to proper development, though. The reason I said what I did is that Moonshot has the first research publication on it that I'm aware of. I could definitely have used better language though; my apologies to Keller!

casey22 days ago
Already over a billion tokens on OpenRouter in under 5 hours.
gigatexal2 days ago
Has anyone used it? How does it compare to gpt 5.5 or opus 4.7?
periodjet1 day ago
Very exciting. Amazing work. The CCP shilling on this board has reached epidemic proportions though, and is shocking to witness.
tcbrah2 days ago
Giving Meta a run for its money, especially since it was supposed to be the poster child for OSS models. DeepSeek is really overshadowing them rn.
alpineman2 days ago
Meta is totally directionless
8note1 day ago
so why is a model release just a politics thread?

is this not cool tech, available for use?

i look forward to seeing what gets made on top of deepseek 4, more than what it means for US politics.

especially with how open deepseek is with its advancements, im excited to see how they get applied into sota western models

fbrncci2 days ago
Take that Anthropic and your shenanigans.
cl082 days ago
Any way to connect this to claude code?
mordae2 days ago
It's literally in the linked docs.
kittikitti1 day ago
This is a great model from DeepSeek and I look forward to seeing the developments from this. I am also very frustrated that American states, corporations, and organizations have banned DeepSeek models or made them illegal. It considerably restricts my AI operations and the ability to conduct research and development. As someone who hosts open-source models with compute resources available to serve DeepSeek V4, it brings considerable risk just because I am in America.

I hope that DeepSeek wins the AI race, or at least gets far enough ahead that bans and regulations against it become infeasible. It's ridiculous that American legislators advocate for less regulation of AI, except when it comes to their own racist ideas about which AI should be approved or not.

ascii0eks84about 21 hours ago
Someone did a simple "Count to 10 starting from 11" and it got stuck.
cubefox2 days ago
Abstract of the technical report [1]:

> We present a preview version of DeepSeek-V4 series, including two strong Mixture-of-Experts (MoE) language models — DeepSeek-V4-Pro with 1.6T parameters (49B activated) and DeepSeek-V4-Flash with 284B parameters (13B activated) — both supporting a context length of one million tokens. DeepSeek-V4 series incorporate several key upgrades in architecture and optimization: (1) a hybrid attention architecture that combines Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA) to improve long-context efficiency; (2) Manifold-Constrained Hyper-Connections (mHC) that enhance conventional residual connections; (3) and the Muon optimizer for faster convergence and greater training stability. We pre-train both models on more than 32T diverse and high-quality tokens, followed by a comprehensive post-training pipeline that unlocks and further enhances their capabilities. DeepSeek-V4-Pro-Max, the maximum reasoning effort mode of DeepSeek-V4-Pro, redefines the state-of-the-art for open models, outperforming its predecessors in core tasks. Meanwhile, DeepSeek-V4 series are highly efficient in long-context scenarios. In the one-million-token context setting, DeepSeek-V4-Pro requires only 27% of single-token inference FLOPs and 10% of KV cache compared with DeepSeek-V3.2. This enables us to routinely support one-million-token contexts, thereby making long-horizon tasks and further test-time scaling more feasible. The model checkpoints are available at https://huggingface.co/collections/deepseek-ai/deepseek-v4.

1: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main...

tariky2 days ago
Has anyone tried making web UIs with it? How good is it? For me, Opus is only worth it because of that.
augment_me2 days ago
Amaze amaze amaze
neuroelectron1 day ago
Shouldn't there be a hyper context view model context protocol standard?
zurfer2 days ago
Lots of great stuff, but the plot in the paper is just chart crime: different shades of gray for the reference models, where sometimes you see 4 models and sometimes 3.
dackdel1 day ago
god bless deepseek
ghstinda2 days ago
so many models not enough time
luew2 days ago
We will be hosting it soon at getlilac.com!
punkpeye2 days ago
Incredible model quality to price ratio
npv7891 day ago
my current default model now, bye gpt 5.5
Rover2221 day ago
Quite jarring to see how many people think the Chinese authoritarian regime, and the tech that it allows to be created in that country, are going to be "safer" or whatever than US tech.

It's trendy to say the US govt is now authoritarian, but that's just pure naïve groupthink.

mike_hearn1 day ago
It's just the anti-Americanism that has typified the Euroleft for decades. You can find people complaining about it back in the 1800s. As can be seen by how much American product Europe consumes it's not actually an influential mode of thought, just a form of ingroup signalling, so it can largely be ignored.
Rover222about 20 hours ago
But it's now mainstream thought on the left in America.
tehjoker1 day ago
American security services can touch Americans, Chinese ones can't. That's even assuming the worst about China, which I don't think is appropriate.
hongbo_zhang2 days ago
congrats
donbreo2 days ago
Aaaand it still can't name all the states in India, or say what happened in 1989.
mordae2 days ago
Ask Claude how to overthrow a Nazi dictatorship in the US.
inspector142 days ago
easy, you buy twitter and let people speak freely again
gn_central1 day ago
How does this actually perform in real-world usage? Benchmarks look strong, but I’m curious about latency and stability.
dhruv30062 days ago
Ah now !
howmayiannoyyou1 day ago
More fawning over Chinese models without any mention of data privacy, or how this AI may someday be used to undermine US national or economic security. HN is hopelessly compromised by anti-American sentiment.
shafiemoji2 days ago
I hope the update is an improvement. Losing 3.2 would be a real loss, it's excellent.
sheeshkebab2 days ago
Ask it if there was a Tiananmen square massacre. Then decide if you really want to be part of this murderous propaganda.
segmondy2 days ago
I bet you don't use any Chinese-made products, and that everything you own was not made in China. Please reply and let us know.
raincole2 days ago
History doesn't always repeat itself.

But if it does, then in the following week we'll see DeepSeek4 floods every AI-related online space. Thousands of posts swearing how it's better than the latest models OpenAI/Anthropic/Google have but only costs pennies.

Then a few weeks later it'll be forgotten by most.

sbysb2 days ago
It's difficult because even if the underlying model is very good, not having a pre-built harness like Claude Code makes it very un-sticky for most devs. Even at equal quality, the friction (or at least perceived friction) is higher than the mainstream models.
raincole2 days ago
OpenCode? Pi?

If one finds it difficult to set up OpenCode to use whatever providers they want, I won't call them 'dev'.

The only real friction (if the model is actually as good as SOTA) is to convince your employer to pay for it. But again if it really provides the same value at a fraction of the cost, it'll eventually cease to be an issue.

throwa3562622 days ago

    "If one finds it difficult to set up OpenCode to use whatever providers they want, I won't call them 'dev'."

I feel the same way. But look at the ollama vs llama.cpp post on HN from a few days back and you will see most of the enthusiasts in this space are very non-technical people.
cmrdporcupine2 days ago
They have instructions right on their page on how to use Claude Code with it.
2ndorderthought2 days ago
You can literally run it from Claude Code. Easily, too.