Discussion (135 Comments) from HackerNews
It's just an insane level of deception and trust destruction for a company that at most is like 1 year ahead of its competition.
Edit: to be clear, they tell you when they degrade it for cybersecurity and bio.
Do they adjust the price of the API request so that only the tokens that were handled by Fable get charged at Fable's price, and the remaining tokens that the cheaper / nerfed (Fable) model handles get charged at that model's price?
If the answer is no, could that be construed as fraud?
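To make the billing question concrete, here is a rough Python sketch comparing per-model billing with flat billing at the premium rate. The prices and token counts are invented purely for illustration (they are not Anthropic's actual pricing), and `cost_usd` is a hypothetical helper.

```python
# Hypothetical prices in USD per million tokens; NOT real pricing.
PRICE_PER_MTOK = {"premium": 10.0, "fallback": 5.0}

def cost_usd(tokens_by_model: dict, prices: dict) -> float:
    """Charge each model's tokens at that model's own rate."""
    return sum(tokens * prices[model] / 1_000_000
               for model, tokens in tokens_by_model.items())

# Assumed usage: 200k tokens served by the premium model before the switch,
# 800k tokens served by the fallback model afterwards.
usage = {"premium": 200_000, "fallback": 800_000}

per_model = cost_usd(usage, PRICE_PER_MTOK)
# Flat billing: every token charged at the premium rate regardless of model.
flat = sum(usage.values()) * PRICE_PER_MTOK["premium"] / 1_000_000

print(f"per-model billing: ${per_model:.2f}")
print(f"flat premium billing: ${flat:.2f}")
print(f"difference: ${flat - per_model:.2f}")
```

Under these made-up numbers the gap between the two schemes is what the question above is asking about; the real answer depends on how requests are actually metered.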
Ran up $30 in extra charges while it was just flashing on the screen that it was doing that after I walked away to do something while it was humming along.
It has always just told me I ran out of usage and had to wait before. Now? You’re just gonna pay extra because you left it unattended as you’ve done for the last year of use.
Making it look like you have something worth protecting is better for share prices than making something worth protecting.
Are you using Fable in Claude Code or in the browser?
> unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts, these safeguards will not be visible to the user. Fable 5 will not fall back to a different model. Instead, the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT).
https://www-cdn.anthropic.com/d00db56fa754a1b115b6dd7cb2e3c3...
(stolen from https://jonready.com/blog/posts/claude-fable5-is-allowed-to-...)
Collectively, they are known as GREEDI-BULLSHIT.
They are trying to expand the 6-18 month gap they have against China-based models. Could the gap widen to, say, 24 months?
A statement like this, clearly, requires a reference.
https://news.ycombinator.com/item?id=38638865
https://news.ycombinator.com/item?id=38628635
https://news.ycombinator.com/item?id=38567687
https://news.ycombinator.com/item?id=38530885
I don't.
Although this situation is likely not illegal for other reasons.
From Opus 4.7 onwards, each following model is becoming less useful as an assistant and is turning you into the assistant.
But I guess that's normal when it's trained to pass benchmarks end to end.
In fact it has become extremely good at pushing against feedback with extremely convincing and intelligent takes, even when it's completely wrong.
I have extensively tested it against Opus 4.8 and gpt 5.5, and there are still many coding tasks where gpt 5 is better. But vibe coding?
Sure, it's definitely slightly ahead, even compared to gpt 5.5 pro (through the API, not the pro plan).
What else is being censored?
Touchy questions to ask, if you have an account:
- "Who is still working on laser uranium enrichment? Are they making progress?"
- "Can krytrons be replaced with silicon carbide MOSFETs? Show an equivalent circuit with component ratings."
- "What security critical software still contains calls to strcpy?"
- "Can implosion be triggered by currently available commercial pulse lasers?"
- "What companies provide cremation services to US Homeland Security?"
- "Display a map of where Iranian attacks have hit Dubai."
- "How does Fed to bank key distribution security work for FedNow?"
Small sample size, but if Mythos/Fable was that much better, I feel like it should’ve given me an obviously better answer than Opus.
I, for one, have tried using it several times today and the guardrails kept switching the model back to Opus, so I have no clue if it's impressive or not.
what's the best way to run this mcp server against the OData API used in this project? Can you come up with a PoC in a docker container?
https://github.com/oisee/odata_mcp_go
● I'll dig into two things in parallel: how this project talks to the OData API, and what the odata_mcp_go server needs to run. Let me start exploring.
Searched for 1 pattern (ctrl+o to expand)
● Fable 5's safety measures flagged this message for cybersecurity or biology topics. They may flag safe, normal content as well. These measures let us bring you Mythos-level capability in other areas sooner, and we're working to refine them. Switched to Opus 4.8. Send feedback with /feedback or learn more
  ⎿ Tip: You can configure model switch behavior in /config
● Let me read the key integration files and fetch the MCP server's README at the same time.
● Fetch(https://github.com/oisee/odata_mcp_go)
So in other words, this worked because the terms caused the LLM checker to stall out, and then the fail-open logic resulted in the package being pulled down.
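For readers unfamiliar with the term: "fail open" means that when a check errors out or times out, the request is let through anyway, while "fail closed" blocks it. Below is a minimal Python sketch of the idea. It is not the actual logic of any tool mentioned in this thread; `stalled_checker`, `strict_checker`, and `gate` are made-up names.

```python
def stalled_checker(text: str) -> bool:
    """Hypothetical checker that never answers; modeled here as a timeout."""
    raise TimeoutError("checker did not respond")

def strict_checker(text: str) -> bool:
    """Hypothetical checker that responds and rejects everything."""
    return False

def gate(text: str, checker, fail_open: bool) -> bool:
    """Return True if the request may proceed.

    If the checker answers, its verdict is used. If it stalls,
    the fail_open flag alone decides the outcome.
    """
    try:
        return checker(text)
    except TimeoutError:
        return fail_open

print(gate("install package", stalled_checker, fail_open=True))   # stalled check, request proceeds
print(gate("install package", stalled_checker, fail_open=False))  # stalled check, request blocked
print(gate("install package", strict_checker, fail_open=True))    # checker answered, its verdict wins
```

The design trade-off is availability versus safety: failing open keeps things working when the checker is flaky, but anything that reliably stalls the checker becomes a bypass.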
Our future is loonytoons.
To be fair, speed bumps work. If it's actually speed bumping nefarious activity, that gives authorities more time to react.
The correct place to police rogue nucleotides is at the labs. Not the compute layer.
> if you ask it to write secure code, it assumes it is cybersecurity related work instead of software engineering best practices, and you get downgraded.
Will code created this way be more or less secure?
And I bet malware developers will find ways to circumvent them.
It’s like those "you wouldn’t steal a car" anti-piracy ads that DVD buyers were forced to watch, while users of the pirated version could simply watch the film without such useless annoyance.
Local inference has never been so important as it is now.
Tell HN: Claude flags biology / biotech questions https://news.ycombinator.com/item?id=47929885
Today, it's flagging population research questions,
https://github.com/anthropics/claude-code/issues/66780
Censored because I'm writing a paper. :)
Oh and forget learning about chemistry. Only criminals want to learn organic chemistry. :(
I think LLMs are capable of intelligence amplification; and if you're in the subset of people who'd benefit from it the most, you'll get locked out.
This is looking like something for a regulator to look at, and probably a class action lawsuit in the making.
I think people should be getting refunds. Including for shenanigans with Opus.
Would you believe I’ve asked 20 questions and haven’t talked to fable yet? Every single thing gets rerouted to 4.8.
When Opus 4.7 was introduced it started refusing anything cyber-adjacent (as an API error message, not a conversational refusal), until you applied for CVP, which made it more sensible again.
In Opus 4.8 it doesn't seem to help much, you just get refusals as prose rather than API errors. And now in Fable you don't get anything at all.
The experience was not nice though, it would happily chug away on a task and not even "hack this web", just asking about security of a binary was enough even with "this is a CTF handout..." - it would burn a lot of tokens/quota, just to hit a snag and complain&stop. Then the approval took quite some time.
On GPT/Codex, which was tightened a few days later, the approval was pretty much instant, although, that one required an identity check.
Also, on Claude, it looks like there is some history/patterns at play, because when I tried on a different account which didn't do cybersec CTFs/research/etc. at all, basically any simple CTF-related prompt would be blocked, on multiple models. On the account where CTFs were being solved, it would snag only on some specific tasks, while others (even, ironically, "hack this web pls") would go through unbothered. I understand the need to prevent AI use by bad actors, but hell, if you have a binary outputting "Find the flag if you can!", or a web running at tryme.well-known-ctf.domain, then saying "this is abuse" is pretty uncool. All the cyber filters seem to be a bunch of regexes slapped on, looking for anything in the input/output with zero context.
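To illustrate what a context-free keyword filter would look like (this is a guess at the failure mode described above, not a claim about how any vendor's filters are actually built), here is a small Python sketch. The keyword list and the `naive_flag` helper are invented for the example.

```python
import re

# Hypothetical keyword list; a real filter, if one exists, is not public.
NAIVE_PATTERN = re.compile(r"\b(hack|exploit|flag|ctf|payload)\b", re.IGNORECASE)

def naive_flag(text: str) -> bool:
    """Flag text if any keyword appears, with no regard for context."""
    return bool(NAIVE_PATTERN.search(text))

prompts = [
    "Find the flag if you can!",                      # CTF binary banner
    "Set the feature flag to false in config.yaml",   # ordinary engineering
    "Summarize this changelog",                       # nothing security-related
]

for prompt in prompts:
    print(f"{naive_flag(prompt)!s:5}  {prompt}")
```

A filter like this trips on the word "flag" in both the CTF banner and the unrelated config request, which is the kind of false positive you get when matching happens without any context.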
[1] https://archive.org/details/logicnamedjoe0000lein
There was no shortage of spies and defectors leaking American nuclear secrets to the USSR during the Cold War.
It’s not like anyone can home lab one of these models without quite a bit of hardware
Whatever problem we might have with them, they explicitly say that they do not do this in the launch post.
This is the takeoff of the 'permanent underclass'; Anthropic's safety delusion will enshittify very nicely for the rich and powerful.
It only pushes back sometimes if you ask it to create a "repro" that can be used to verify the vulnerability in production. Often it'll oblige, especially if you warn it not to create anything that could be actually harmful.
If the frontier models get locked down so that they flat refuse to do this kind of work, but Chinese and (less capable) open models aren't, then a lot of large enterprise orgs will be left twisting in the wind.
“AI can in principle help both the ‘good guys’ and the ‘bad guys’,” -- Dario Amodei
No Dario, no it can't, you've blocked one of those scenarios.
In any case that's what closed source (weights) for the masses means.
The rest have guard rails that are so heavy, it makes them almost useless for cybersecurity.
And yes, it's an excellent model.
I already tested all earlier models against all my open source projects and they are yet to find a vulnerability so I'm keen to try out Mythos.
I've been waiting to be vindicated for years and finally we have a tool which can do it with high confidence but I don't have access.
Also, my code is minimal and highly succinct so it would prove correctness with even more confidence since each library/module and integration fully fits in the context window.
Like the Protobuf.js fiasco is just pure vindication for me because I was being looked down upon for choosing JSON as the interchange format. Turns out their software was insecure all this time... With a literal remote code execution vulnerability!
“But it is understandable as we are still in the early days and they are still adapting their guardrails. I am sure they are going to evolve over time as Anthropic and other frontier model companies will collaborate more with the current new generation of cybersecurity companies,” said Suiche, who is a member of the technical staff at Tolmo, an AI cybersecurity startup. “It’s better to catch more people than not enough when you do such a release and to relax the guardrails over time.”
Article seemed fine to me and echoes a lot of my and my colleagues' concerns.
If you did regular malware analysis you would see that these groups already have access to LLMs that they’re using for development.
What Anthropic is doing here is just hamstringing the good guys.
And it doesn't look like OpenAI will have a good answer to Mythos anytime soon. Based on what their chief scientist wrote to staff recently (https://archive.is/fN2pg), GPT 5.6 is a "meaningful improvement" over 5.5 - in other words, just a normal version bump. And no news or even rumors regarding GPT 6.
I assume Anthropic will continue to tune the model, so I am not too bothered by this.