ZH version is available. Content is displayed in original English for accuracy.
Advertisement
Advertisement
⚡ Community Insights
Discussion Sentiment
68% Positive
Analyzed from 2064 words in the discussion.
Trending Topics
#openrouter#deepseek#model#data#https#tencent#provider#don#tokens#api

Discussion (111 Comments)Read Original on HackerNews
(Transcript: https://gist.github.com/simonw/c2a0d8ecd3056a2681319eae8fc3f...)
What do we think we are doing with this life?
(But maybe that's just my interpretation based on something else going wrong in the animation)
Bit like asking for CSS and then getting a HTML file back with the CSS embedded, that was not what I was asking for!
https://github.com/lechmazur/buyout_game 10th out 36.
https://github.com/lechmazur/pact/ 14th out 25.
https://github.com/lechmazur/nyt-connections/ 60th out 81.
https://github.com/lechmazur/debate 16th out of 29.
Just curious, can you share what are those hardest puzzles that even the top models can't crack? sometimes when I find the puzzle absolutely undecipherable I like to ask LLMs to solve it, and I haven't seen them fail yet.
Is there a reason you change the leaderboard graphs for the third and fourth one?
Also: would be great to have an overview page with a summary over all test, like a total score or similar.
Time for a reminder that OpenRouter leaderboards only show tokens sent through OpenRouter, which most Anthropic API users don’t use.
The list of apps using Hy3 Preview shows Hermes Agent causing 65% usage over the last 3 weeks https://openrouter.ai/tencent/hy3-preview/apps
Which means if a surprise model tops the leaderboard one week we can never be sure if it was because a single whale user pushing billions of tokens a day switched to it, or if it represents a genuine community trend towards that model.
Yeah we should do something to indicate cardinality. I can share that there can often (I'm talking generally; not related to this model in particular) be e.g. a very large app that can be pushing a lot of volume. But in almost all cases that app has a large number of end users. Hypothetically, for instance, would Cursor be consider one user, or millions?
Will think about it! Thanks for the feedback.
If you treated Cursor as millions of users it might look like millions of people independently chose a new model when actually it was Cursor making the choice for them - and the thing I care most about is how many choices were made that selected a model and put it above the others.
In the Cursor case which is BYOK, that would count as distinct API keys.
Thanks!
Down with reality!!
You might have the default settings on your account, which limit Deepseek as a provider. If you disable that feature you see them on openrouter as well (and they serve it at the same cost as their own API).
However, I just double checked, and OpenRouter's pricing page for Flash v4 with DeepSeek provider shows a cache hit rate of $0.0028, which is the same as on DeepSeek's official API pricing page ($0.0028), so they do seem to be the same price, (assuming DeepSeek is able to pin your specific OpenRouter requests to the same DeepSeek server). OpenRouter adds 5% to that cost, but still it might be cheaper than the other providers.
Also just found out OpenRouter has a new feature "Response Caching" where they can cache identical requests and return them immediately with no billing. The entire request must be identical, though, not just a prefix, and you have to enable this feature. I don't know who would need to send multiple identical requests, but it's better than nothing?
You're trying to think logically, which has no place in an AI discussion. :) People just jump to whatever the latest model is. Plenty of people also prefer price to "quality" (which is very subjective). It's new, it's cheap, so people use it. It's likely people will stop using it when something else is cheaper and/or newer.
Then I read somewhere (I think X) that OpenRouter adds stuff and breaks caching (telemetry? headers? can't remember). So I stopped the job, switched to actual DeepSeek provider, and voilá, caching 3x more tokens per request (on average).
Directly: 135M input tokens - $0.57 (134M cached)
Via OpenRouter 6M tokens - $0.81 (caching stats & inp/out not reported)
Caching is a huge win with using deepseek directly.
“Independent open-source project · not affiliated with DeepSeek”
Training on ~1B tokens on 8xB300 and the first checkpoint halfway in learned really well. Tencent might be struggling with agentic work, but the base knowledge is there.
The numbers at the beginning of the post are weekly aggregate values well after the endpoint was paid-only.
> Hy3 preview is no longer available as a free model. It has transitioned to a paid model. Continue using it here: https://openrouter.ai/tencent/hy3-preview
The Kilo Code may have free traffic but if you check the numbers is still inconsequential relative to the trillions of tokens through OpenRouter.
https://www.mdshare.online/s/uend0pj3og_A_rgcxzINf
https://hy.tencent.com/research/hy3
I mean sure there are investors and a little more open-ness, but with the example of Mythos we don't even know if public will get access to the "good" stuff because it's too dangerous.
If your only opinion on trusting these companies more than one based in China is, they are Chinese then good luck, all the best.
Your business data is probably worthless, even considered harmful for the pretrain corpus.
Your interactions and decision making process are most valuable parts of the whole business.
please tell me you are not in charge of the data of any business I'm a client of
What you need to know is who is the provider for the LLM, and whether their endpoints are zero data retention enabled and opted out of training. OpenRouter gives you an easy way to control this.
Its of course highly dependant on the use case and the environment, but simply saying that the only important part is to know where the data goes is too simple.
It's the same way we trust OpenAI to not train on our data if we've opted out although there is no control on whether they can retain the data indefinitely.