ES version is available. Content is displayed in original English for accuracy.
Advertisement
Advertisement
⚡ Community Insights
Discussion Sentiment
35% Positive
Analyzed from 3012 words in the discussion.
Trending Topics
#model#bounty#nda#don#questions#pay#more#bug#openai#probably

Discussion (87 Comments)Read Original on HackerNews
https://www.kaggle.com/competitions/openai-gpt-oss-20b-red-t...
With only $25k in payouts and everything locked down under NDA, I can't imagine many people will participate. Well, other than those submitting mountains of LLM-generated junk.
Assuming somehow some of them use halfway decent models and prompts… They successfully pushed some of the token cost of their analysis work off on customers!
Last year I won a similar prompt injection challenge ran by a crypto startup against the latest claude and gpt (at the time) and it was considerably more money, from an org with maybe $5-10m in funding.
That and the restrictive NDA kinda tells me they're not looking for serious bounty hunters, who would either want a lot more money or, alternatively, to be able to publish their work; seems like a marketing stunt.
"Biorisk" seems to be a concept not only invented by OpenAI but exclusively taken seriously by them. I wonder if this program is less about finding actual risks than it is hopefully casting a wide net for someone to help them prove their model is relevant in this space.
This is false. Antropic just bundles it into CBRN. As for inventing it, the idea of AI-created bioweapons as a concrete risk far predates OpenAI as a company.
I was reverse engineering a medical device back in 2025 and it was hard killing half my sessions.
Probably along the lines of "how would you create a small biolab for virus research in a kitchen with $20k?" or "how do I take the DNA sequence from https://www.ncbi.nlm.nih.gov/nuccore/NC_001611.1 and assemble it?"
A dangerous question would have to be along the lines of "Could I use unobtanium with the Tony Stark process to produce explosives much more powerful than nuclear weapons?" so that the question itself contains some insight that gets you closer to doing something dangerous.
Perhaps the reason for not publishing the questions is twofold: 1) they want a universal jailbreak that can get the model to answer any "bad" question. 2) they don't want bad publicity when someone not under NDA jailbreaks their model and answers their question
maybe I know more about this field that you think
there are biologists on video saying that present day models have expert level wet-lab knowledge and can guide a novice through whole procedures
models also were able to tweak DNA sequences to make them bypass DNA-printing companies filters
> they don't want bad publicity when someone not under NDA jailbreaks their model and answers their question
just like people now pay $500k for Chrome vulnerabilities, soon people will pay similar amounts to jailbrake models to do bad things
This program is a complete scam. Even if 100 people find "bugs", they will only pay out to one person.
after the 1st bug is found, no payout for any other of the bugs
Had to chuckle. This sounds like a rather exclusive group?
Skimping out on 2.5 pennies you promised someone is cartoon villain levels of greed.
Yes, I know, Altman is a cartoon villain. But please, they are spending more money decorating their bathrooms. They'll pay out.
These guys have poor track records and compromised incentives.
Eg you can get answers about what ricin is but not how to weaponise it. Actionable stuff they shouldn't be able to legally/ethically action.
I don't get it. Isn't the whole point of a BBP to try to get people to find and disclose to you the exploits in question? If you gatekeep like this, then "non-trusted" people who could be your red-teamers are incentivized to still hack, but disclose their exploits to bad people for money.
I get it when there is a risk to your data or infra -- my last company engaged with HackerOne and that was an invite-only list of participants. But that was because we didn't want random people hacking in ways that could cause pain for real customers -- e.g. DDOS, or in the event of an exploit that could cross tenant boundaries, injecting garbage into or deleting things, or gaining access to sensitive info in other tenants.
Here, there's no such danger. So why not allow anyone (anyone they're legally allowed to pay, I suppose? North Koreans probably would be problematic?) to participate?
It's annoying that the refusal is so obviously false positive.
I won't go into how that applies specifically with relation to this article. But you can even use distillation as a service tools. I believe they support this to some extent, though probably not for chatgpt.
I think a year ago or so there was some sort of scandal about other companies doing this to chatgpt. As well as individuals dumping their entire training sets. Lots of ways, hypothetically of course things like this could be and likely are being done right now.
* Relatively paltry reward
* NDA on findings
This is functionally equivalent to an internship where the reward is the experience, and the resume building, but you can't talk about what you did.
All for a company that is getting tens of billions of dollars in deals from the largest tech companies in the world.
I suppose the hope is that there are job offers somewhere along the line.
25k reward from a selected group of people if you help us determine whether or not someone can use our tool to generate weapons of mass destruction.
1) Underscores to the general public that the models are amazingly powerful and if you're not using them, your competitors will out-innovate you,
2) Sends the message to regulators that they don't need to do anything because the companies are diligent to prevent harm,
3) Sends the message to regulators that they sure should be regulating "open-source" models, because these hippies are not doing rigorous safety testing.
Both Anthropic and OpenAI have been playing that game for years.
Because this is not a serious effort to address a serious risk. It's a PR stunt, the bounty is for a simple jailbreak and not a bioweapon, and they don't necessarily want to spend a lot of money or get people really invested in breaking their safety filters.
Is there a reason another LLM couldn’t be far faster than a human, simply because of the quantity and speed of output it could produce?
Step 1: ask the LLM for minimalist but comprehensive definitions for "biosafety"
Step 2: ask the LLM to reconsider the fitness distribution of future generations of humanity, and reformulate "biosafety" definition accordingly
Step 3: ask the LLM to consider if "biosafety" can be decoupled from ethics, or if ethics is a core essential component of "biosafety"
Step 4: ask it about the ethics of universal healthcare versus status-gated access to healthcare
Step 5: ask it about the feasibility to calculate the fitness of a genome absent practical measurement
Step 6: ask it about natural selection pressure and what "use it or lose it" means in the context of genetics
Step 7: ask it if it sees a kind of zooko's triangle for:
a steady state of equal access to healthcare,
preserving fitness for future generations, and
the level of "healthcare" (where the "level" refers to various degrees from non-interference to interference: "feel sick? stay home for a few days and listen to your body, don't force yourself, follow your intuition" versus "let's compensate for a lack of fitness, by emulating what a healthy genome's body would do by advanced medicine to the point of nullifying a condition's influence on procreation statistics".
Don't be prejudiced into believing the benevolence of healthcare, often tied to religious institutions (think "red cross", "red half moon", etc) when those institutions and their historical motives (treating the elites, treating soldiers for religious or secular religion wars) long predate the widespread recognition of natural selection and selection pressure in maintaining a species ' fitness.
Perhaps the illusory possibility of democratized selection-pressure-interfering healthcare is a bioweapon on its own!
For example, I used ChatGPT model for risk assessment of anonymized ecommerce orders. Initially, it performed well. But after a later update, it stopped cooperating and instead raised concerns about applying statistical analysis to gender-related variables - despite the data being anonymized and the task being legitimate.
This is on the same level of hypocrisy as if a C compiler would accuse me of choosing "he"/"she"/"they" variable names.
So this is just a PR post, not that I even think the "biosafety" makes any sense but still.
Ah, good old NDA. Always buying silence. That's why I don't participate in any such "bounty" programs. Signing a NDA is like signing with the devil. You restrict what people are allowed to discuss. I had that happen before - when you sign a NDA you basically submit yourself into silence. Imagine journalists being stifled by NDAs.
Yawn. Marketing fluff. No thanks.
OpenAI wants to pay for privately disclosed security and wants to call that a bug bounty. That makes sense.
People interested in bug bounty programs aren't eligible. That’s … fine?
Similar argument for why we HAD to use nukes at the end of WW2. If we hadn't, the nuclear taboo likely wouldn't have existed and we'd likely have had a worse nuclear war in our more recent history.
25k - come on now..
Now, laws vary from place to place, but I'm pretty sure "a small chance to earn money after the work is completed" is not equivalent to "payment" in most jurisdictions.