

Discussion (410 Comments)

qsort3 days ago
Basically the rules are that you can use AI, but you take full responsibility for your commits and code must satisfy the license.

That's... refreshingly normal? Surely something most people acting in good faith can get behind.

pibaker2 days ago
I agree this is very sane and boring. What is insane is that they have to state this in the first place.

I am not against AI coding in general. But there are too many people "contributing" AI-generated code to open source projects, even when they can't understand what's going on in their own code, just so they can say on their resumes that they contributed to a big open source project once. And when the maintainers call them out, they just blame it on the AI coding tools they are using, as if they weren't the ones opening PRs under their own names. I can't blame any open source maintainer for being at least a little sceptical when it comes to AI-generated contributions.

theptip2 days ago
I think them stating this very simple policy should also be read as them explicitly not making a more restrictive policy, as some kernel maintainers were proposing.
Applejinx2 days ago
From everything I'm seeing in the industry (I'm basically a noncoder choosing to not use AI in the stuff that I make, and privy to the private work experience of coders and creators also in that field because of human social contacts), I feel like I can shed a bit of light.

It looks to me like a more restrictive policy will be flat-out impossible.

Even people I trust are going along with this stuff, akin to CAD replacing drafting. Code is logic as language, and starting with web code and rapidly metastasizing to C++ (due to complexity and the sheer size of the extant codebase, good and bad), the AI has turned slop-coding into a 'solved problem'. If you don't mean to do the best possible thing or a new thing, there is no excuse for existing as a coder in the world of AI.

If you do expect to do a new thing or a best thing, in theory you're required to put out the novel information as AI cannot reach it until you've entered it into the corpus of existing code the AI's built on. However, if you're simply recombining existing aspects of the code language in a novel way, that might be more reachable… that's probably where 'AI escape velocity' will come from should it occur.

In practice, everybody I know is delegating the busywork of coding to AI. I don't feel social pressure to do the same, but I'm not a coder. I'm something else that produces MIT-licensed codebases for accomplishing things that aren't represented in code AS code; rather, it's for accomplishing things that are specific and experiential. I write code to make specific noises I'm not hearing elsewhere, and not hearing out of the mainstream of 'sound-making code artifacts'.

Therefore, it's impractical for Linux to take any position forbidding AI-assisted code. People will just lie and claim they did it. Is primitive tab-complete also AI? Where's the line? What about when coding tools uniformly begin to tab-complete with extensive reasoning and code prototyping? I already see this in the JetBrains Rider editor I use for Godot hacking, even though I've turned off everything I can related to AI. It'll still try to tab-complete patterns it thinks it recognizes, rarely with what I intend.

And so the choice is to enforce responsibility. I think this is appropriate because that's where the choices will matter. Additions and alterations will be the responsibility of specific human people, which won't handle everything negative that's happening but will allow for some pressures and expectations that are useful.

I don't think you can be a collaborative software project right now and not deal with this in some way. I get out of it because I'm read-only: I'm writing stuff on a codebase that lives on an antique laptop without internet access that couldn't run AI if it tried. Very likely the only web browsers it can run are similarly unable to handle 2026 web pages, though I've not checked in years. You've only got my word for that, though, and your estimation of my veracity, based on how plausible it seems (I code publicly on livestreams, and am not at all an impressive coder when I do that). Linux can't do what I do, so it's going to do what Linux does, and this seems the best option.

matheusmoreira2 days ago
On the other hand, it seriously sucks to spend time learning a big codebase and modifying it with care, only to not be given the time of day when you send the patches to the maintainers. Sometimes the reward for this human labor isn't a sincere peer review of the work and a productive back-and-forth to iron out issues before merging, it's to watch one's work languish unnoticed for a long time only for the maintainer to show up after the fact and write his own fix or implementation while giving you a shout out in the commit message if you're lucky.

Can't really blame people for reducing their level of effort. It's very easy to put in a lot of effort and end up with absolutely nothing to show for it. Before AI came along, my realization was that begging the maintainers to implement the features I wanted was the right move. They have all the context and can do it better than us in a fraction of the time it'd take us to do it. Actually cloning someone else's repository and working on it should only be attempted if one is willing to literally fork it and own the project should things go south. Now that we have AI, it's actually possible to easily understand and modify complex codebases, and I simply cannot find the will to blame people for using it to the fullest extent. Getting the AI to maintain the fork is really easy too.

jlarocco1 day ago
> I agree this is very sane and boring. What is insane is that they have to state this in the first place.

I don't think it's insane. It seems reasonable that people could disagree about how much attribution and disclosure there should be about AI assistance, or if it's even allowed, etc.

Every document in that `process` directory explains stuff that could be obvious to some people but not others.

cat_plus_plus1 day ago
That's a dim view; people also contribute to make projects work for their own needs, with the hope of sharing fixes with others. Like if I make a fix to vLLM to make a model load on particular hardware, I can verify functionality (the LLM no longer strays off topic) and local plausibility (global scales are being applied to attention layers), but I can't pretend to understand the full math of the overall process and will never have enough time to do so. So I can be upfront about the AI assist, and then the maintainer can choose to double-check; or if they don't have time, I guess I can just post a PR link on the model's huggingface page and tell others with the same hardware they can try to cherry-pick it.

What's missed is that neither contributors nor maintainers are usually paid for their effort and nobody has standing to demand that they do anything they are not doing already. Don't like a messy vibe coded PR but need functionality? Then clean it up yourself and send improved version for review. Or let it be unmerged. But don't assign work to others you don't employ.

On the other hand, companies like NVIDIA should be publicly taken to task for changing their mind about instruction set for every new GPU and then not supporting them properly in popular inference engines, they certainly have enough money to hire people who will learn vLLM inside out and ensure high quality patches.

lrvick2 days ago
It cannot be overstated how religiously opposed many in the Linux community are to even a single AI-assisted commit landing in the kernel, no matter how well reviewed.

Plenty see Torvalds as a traitor for this policy and will never contribute again if any clearly labeled AI generated code is actually allowed to merge.

cinntaile2 days ago
Some people are just against change; that's nothing new. If Linus were like them, he would never have started Linux in the first place.
sdevonoes2 days ago
Not every change is good, and sometimes we realise that too late.
goatlover2 days ago
Are they against change in general, or certain kinds of change? Remember when social media was seen as near universal good kind of progress? Not so much now.
Luker882 days ago
Just remember that "reviewed" is not enough to keep the output from being public domain.

It needs to be modified by a human. No amount of prompting counts, and you can only copyright the modified parts.

Any license on "100% vibecoded" projects can be safely ignored.

I expect litigation in a few years where people argue about how much they can steal and relicense "since it was vibecoded anyway".

shakna2 days ago
For those who might wonder how accurate this is, there is guidance from the Federal Register to this effect. [0] It's quite comprehensive, and covers pretty much every "What about...?" question that might be asked.

> In these cases, copyright will only protect the human-authored aspects of the work, which are “independent of” and do “not affect” the copyright status of the AI-generated material itself.

[0] https://www.federalregister.gov/documents/2023/03/16/2023-05...

lrvick2 days ago
Meanwhile I expect that intellectual property protections for software are completely unenforceable and effectively useless now. If something does not exist as MIT, an LLM will create it.

The playing field is level now, and corpo moats no longer exist. I happily take that trade.

VorpalWay2 days ago
> Any license on "100% vibecoded" projects can be safely ignored.

As far as I know that has only been decided in US so far, which is far from the whole world.

OtomotO2 days ago
So, how are you gonna prove I didn't write some code?

How am I gonna prove I did?

alfiedotwtf2 days ago
In what jurisdiction?!

It’s weird how people on HN state legal opinion as fact… e.g. if someone in the Philippines vibecodes an app and a person in Ecuador vibecodes a 100% copy of the source, what now?

martin-t2 days ago
I don't think "modified by a human" is enough. If you take licensed text (code or otherwise) and manually replace every word with a synonym, it does not remove the license. If you manually change every loop into a map/filter, it does not remove the license. I don't think any amount of mechanical transformation, whether done by a human or a machine, erases it.

There's a threshold where you modify it enough, it is no longer recognizable as being a modification of the original and you might get away with it, unless you confess what process you used to create it.

This is different to learning from the original and then building something equivalent from scratch using only your memory without constantly looking back and forth between your copy and the original.

This is how some companies do "clean-room reimplementations": one team looks at the original and writes a spec; another team, which has never seen the original code, implements an entirely standalone version.

And of course there are people who claim this can be automated now[0]. This one is satire (read the blog) but it is possible if the law is interpreted the way LLM companies work and there are reports the website works as advertised by people who were willing to spend money to test it.

[0]: https://malus.sh/

beAbU1 day ago
It cannot be overstated how religiously opposed many in the woodworking community are to even a single table-saw-assisted cut making its way into a piece of furniture, no matter how well designed.

Plenty see {{some_woodworker}} as a traitor for this policy and will never contribute again if any clearly labeled table saw cuts are actually allowed to be used in furniture making.

agentultra1 day ago
There's a stark difference between a table saw and an LLM that weakens this argument.

A table saw isn't a probabilistic device.

oompydoompy741 day ago
I find the strong anti AI sentiment just as annoying as the strong pro AI sentiment. I hope that the extremes can go scream in their own echo chamber soon, so that the rest of us can get back to building and talking about how to make technology useful.
Klonoar2 days ago
Reads like a “fuck you and I’ll see you tomorrow” threat.
dxdm2 days ago
Sounds dramatic, but it entirely depends on what "many" and "plenty" mean in your comment, and who exactly is included. So far, what you wrote reads as a predictable level of drama surrounding such projects.
ebbi2 days ago
True - on Mastodon there is a very vocal crowd that is against AI in general, and is identifying Linux distros that contain AI-generated code with a view to boycotting them.
lrvick2 days ago
Soon they will have to boycott all of them. Then what I wonder?
positron261 day ago
What these hardliners are standing for, I have no idea. If the code passes review, we're just arguing about hues of zeros and ones. "AI" is an attribute that type-erases entirely once an engineer pulls out the useful expressions and whips them into shape.

The worst part about all reactionary scares is that, because the behaviors are driven by emotion and feeling as opposed to any intentional course of action, the outcomes are usually counterproductive. The current AI scare is exactly what you would want if you are OpenAI. Convince OSS, not to mention "free" software people, to run around dooming and ant-milling each other about "AI bad", and pretty soon OSS is a poisonous minefield for any actual open AI, so OSS as a whole just sabotages itself and is mostly out of the fight.

I'm currently in the middle of trying to blow straight past this gatekeepy outer layer of the online discourse. What is a bit frustrating is knowing that while the seed will find the niches and begin spreading through invisible channels, in the visible channels, there's going to be all kinds of knee-jerk pushback from these anti-AI hardliners who can't distinguish between local AI and paying Anthropic for a license to use a computer. Worse, they don't care. The social psychosis of being empowered against some "others" is more important. Either that or they are bots.

And all of this is on top of what I've been saying for over a year. VRAM efficiency will kill the datacenter overspend. Local, online training will make it so that skilled users get better models over time, on their own data. Consultative AI is the future.

I have to remind myself that this entire misstep is a result of a broken information space, late-stage traditional social, filled with people (and "people") who have been programmed for years on performative clap-backs and middling ideas.

So fortunate to have some life before internet perspective to lean back on. My instinct and old-world common sense can see a way out, but it is nonetheless frustrating to watch the online discourse essentially blinding itself while doubling down on all this hand wringing to no end, accomplishing nothing more than burning a few witches and salting their own lands. You couldn't want it any better if you were busy entrenching.

abc123abc1232 days ago
Doesn't matter. Linux today is a toy of corporations and stopped being community oriented a long time ago. Community orientation, I think, these days only exists among the BSDs and some fringe Linux distributions.

The Linux Foundation itself is just one big, woke, leftist mess, with CV-stuffers from corporations in every significant position.

simonask2 days ago
The idea that something can simultaneously be "woke [and] leftist" and somehow still defined by its attachments to corporations is a baffling expression of how detached from reality the US political discourse is.

The rest of the world looks on in wonder at both sides of this.

galaxyLogic3 days ago
But then if AI output is not under GNU General Public License, how can it become so just because a Linux-developer adds it to the code-base?
jillesvangurp3 days ago
AIs are not human and therefore their output is a human authored contribution and only human authored things are covered by copyright. The work might hypothetically infringe on other people's copyright. But such an infringement does not happen until a human decides to create and distribute a work that somehow integrates that generated code or text.

The solution documented here seems very pragmatic. You as a contributor simply state that you are making the contribution and that you are not infringing on other people's work with that contribution under the GPLv2. And you document the fact that you used AI for transparency reasons.

There is a lot of legal murkiness around how training data is handled, and the output of the models. Or even the models themselves. Is something that in no way or shape resembles a copyrighted work (i.e. a model) actually distributing that work? The legal arguments here will probably take a long time to settle but it seems the fair use concept offers a way out here. You might create potentially infringing work with a model that may or may not be covered by fair use. But that would be your decision.

For small contributions to the Linux kernel it would be hard to argue that a passing resemblance of say a for loop in the contribution to some for loop in somebody else's code base would be anything else than coincidence or fair use.

heavyset_go2 days ago
The Copyright Office's interpretation of US copyright law says that AI is not human, and thus not an attributable author for copyright registration, and that output based on mere prompting is no one's IP; it can't be copyrighted[1].

AI output can be copyrighted when copyrightable human elements are expressed in it: for example, if you put copyrighted content in a prompt and it is expressed in the output, or if the output is transformed substantially by human creativity in arrangement, form, composition, etc.

[1] https://newsroom.loc.gov/news/copyright-office-releases-part...

nitwit0052 days ago
That you can't copyright the AI's output (in the US, at least), doesn't imply it doesn't contain copyrighted material. If you generate an image of a Disney character, Disney still owns the copyright to that character.
ninjagoo3 days ago
IANAL; this is what my limited understanding of the matter is. With that caveat: it is easy to forget that copyright attaches to output: verbatim or exact reproductions and derivatives of a covered work are already covered under copyright.

So if the AI outputs Starry Night or Starry Night in different color theme, that's likely infringement without permission from van Gogh, who would have recourse against someone, either the user or the AI provider.

But a starry-night style picture of an aquarium might not be infringing at all.

>For small contributions to the Linux kernel it would be hard to argue that a passing resemblance of say a for loop in the contribution to some for loop in somebody else's code base would be anything else than coincidence or fair use.

I would argue that if it was a verbatim reproduction of a copyrighted piece of software, that would likely be infringing. But if it was similar only in style, with different function names and structure, probably not infringing.

Folks will argue that some things might be too small to do any differently, for example a tiny snippet like print("hello") or 1+1=2 or a for loop in your example. In that case it's too lacking in original expression to qualify for copyright protection anyway.

friendzis2 days ago
> Is something that in no way or shape resembles a copyrighted work (i.e. a model) actually distributing that work?

Does a digitally encoded version resemble a copyrighted work in some shape or form? </snark>

Where is this hangup on models being something entirely different than an encoding coming from? Given enough prodding they can reproduce training data verbatim or close to that. Okay, given enough prodding notepad can do that too, so uncertainty is understandable.

This is one of the big reasons companies are putting effort into so-called "safety": when the legal battles are eventually fought, they would have an argument that they did their best to ensure the amount of prodding required to extract any information potentially putting them under liability is too great to matter.

Lerc2 days ago
>AIs are not human and therefore their output is a human authored contribution and only human authored things are covered by copyright.

That is a non sequitur. Also, I'm not sure if copyright applies to humans, or persons (not that I have encountered particularly creative corporations, but Taranaki Maunga has been known for large scale decorative works)

mcv2 days ago
Didn't a court in the US declare that AI generated content cannot be copyrighted? I think that could be a problem for AI generated code. Fine for projects with an MIT/BSD license I suppose, but GPL relies on copyright.

However, if the code has been slightly changed by a human, it can be copyrighted again. I think.

afro883 days ago
Same as if a regular person did the same. They are responsible for it. If you're using AI, check the code doesn't violate licenses
rzmmm3 days ago
In some legal cases, a finding of plagiarism can depend on whether the person was exposed to the copyrighted work. AI models are exposed to a very large corpus of works.
martin-t3 days ago
As opposed to an irregular person?

LLMs are not persons, not even legal ones (which itself is a massive hack causing massive issues such as using corporate finances for political gain).

A human has moral value a text model does not. A human has limitations in both time and memory available, a model of text does not. I don't see why comparisons to humans have any relevance. Just because a human can do something does not mean machines run by corporations should be able to do it en-masse.

The rules of copyright allow humans to do certain things because:

- Learning enriches the human.

- Once a human consumes information, he can't willingly forget it.

- It is impossible to prove how much a human-created intellectual work is based on others.

With LLMs:

- Training (let's not anthropomorphize: lossily-compressing input data by detecting and extracting patterns) enriches only the corporation which owns it.

- It's perfectly possible to create a model based only on content with specific licenses or only public domain.

- It's possible to trace every single output byte to quantifiable influences from every single input byte. It's just not an interesting line of inquiry for the corporations benefiting from the legal gray area.

sarchertech3 days ago
How could you do that though? You can’t guarantee that there aren’t chunks of copied code that infringes.
noosphr3 days ago
Tab complete does not produce copyrightable material either. Yet we don't require software to be written in nano.
rpdillon2 days ago
This is a nice point that I haven't seen before. It's interesting to regress AI to the simplest form and see how we treat it as a test for the more complex cases.
Tomte2 days ago
There is already lots and lots of non-GPL code in the kernel, under dozens of licenses, see https://raw.githubusercontent.com/Open-Source-Compliance/pac...

As long as everything is GPLv2-compatible it‘s okay.

panzi3 days ago
If the output is public domain it's fine as I understand it.
galaxyLogic3 days ago
Makes sense to me. But then anybody can take Public Domain code and place it under the GNU Public License (by dropping it into a Linux source-code file)?

Surely the person doing so would be responsible for doing so, but are they doing anything wrong?

martin-t3 days ago
This ruling is IMO/IANAL based on lawyers and judges not understanding how LLMs work internally, falling for the marketing campaign calling them "AI" and not understanding the full implications.

LLM creation ("training") involves detecting and compressing patterns in the input. Inference generates statistically probable output based on similarities to patterns found in the "training" input. Computers don't learn or have ideas; they always operate on representations. It's nothing more than any other mechanical transformation, and it should not erase copyright any more than synonym substitution does.
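As a toy illustration of what "statistically probable output based on patterns in the input" means (a deliberately minimal sketch; real LLMs are vastly more sophisticated than a bigram table, so this only illustrates the shape of the argument):

```python
import random
from collections import defaultdict

# Toy "language model": record which word follows which in the training
# text, then emit successors sampled in proportion to those counts.
def train(text):
    model = defaultdict(list)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        model[prev].append(nxt)
    return model

def generate(model, start, length, seed=0):
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        successors = model.get(out[-1])
        if not successors:  # dead end: no recorded continuation
            break
        out.append(rng.choice(successors))
    return " ".join(out)

# Every word this emits was lifted verbatim from the training text;
# the only "creativity" is which recorded pattern gets sampled.
model = train("the cat sat on the mat and the dog sat on the rug")
print(generate(model, "the", 5))
```

In this degenerate case the model is transparently a lossy re-encoding of its training data, which is the intuition the comment above is appealing to.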

oompydoompy741 day ago
I wish everyone could be so rational, well reasoned, and balanced on this subject.
shevy-java3 days ago
But why should AI then be attributed if it is merely a tool that is used?
lonelyasacloud2 days ago
Having an honesty-based tag could be the only way to monitor impact, or to go after a fix in codebases if things go south.

That is, at the moment:

- Nobody knows for sure what agents might add, or their long-term effects on codebases.

- It's at best unclear that AI content in a codebase can be reliably determined automatically.

- Even if it's not malicious, at least some of its contributions are likely to be deleterious and pass undetected by human review.

plmpsu3 days ago
It makes sense to keep track of which model wrote what code, to look for patterns, behaviors, etc.
hgoel2 days ago
This is a good point, but I'd take it in the opposite direction from the implication: we should document which tools were used in general; it'd be a neat indicator of what people use.
yrds962 days ago
AI tools can do the entire job, from finding the problem to implementing and testing the fix.

It's different from the regular single purpose static tools.

streetfighter643 days ago
It isn't?

> AI agents MUST NOT add Signed-off-by tags. Only humans can legally certify the Developer Certificate of Origin (DCO).

They mention an Assisted-by tag, but that also contains stuff like "clang-tidy". Surely you're not interpreting that as people "attributing" the work to the linter?

ninjagoo3 days ago

> Signed-Off ...
> The human submitter is responsible for:
> - Reviewing all AI-generated code
> - Ensuring compliance with licensing requirements
> - Adding their own Signed-off-by tag to certify the DCO
> - Taking full responsibility for the contribution
>
> Attribution: ... Contributions should include an Assisted-by tag in the following format:
Responsibility is assigned where it should lie. Expected no less from Torvalds, the progenitor of Linux and Git. No demagoguery, no b*.
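Put together, a trailer block under this policy might look something like the following (the subject and names are made up, and the exact Assisted-by string should follow the policy document rather than this sketch):

```text
mm: fix off-by-one in example page accounting

<commit message body explaining the change>

Assisted-by: <AI tool name and version>
Signed-off-by: Jane Dev <jane@example.org>
```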

I am sure that this was reviewed by attorneys before being published as policy, because of the copyright implications.

Hopefully this will set the trend and provide definitive guidance for a number of Devs that were not only seeing the utility behind ai assistance but also the acrimony from some quarters, causing some fence-sitting.

senko2 days ago
> Expected no less from Torvalds

This was written by Sasha Levin referencing a Linux maintainers’ discussion.

sourcegrift2 days ago
Of all the documents, this one needed proper attribution, with a link to the meeting minutes.
bsimpson2 days ago
Signed-off-by is already a custom/formality that is surely cargo-culted by many first-time/infrequent contributors. It has an air of "the plans were on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying 'Beware of the Leopard.'" There's no way to assert that every contributor has read a random document declaring what that line means in kernel parlance.

I recently made a kernel contribution. Another contributor took issue with my patch and used it as the impetus for a larger refactor. The refactor was primarily done by a third contributor, but the original objector was strangely insistent on getting the "author" credit. They added our names at the bottom in "Co-developed-by" and "Signed-off-by" tags. The final submission included bits I hadn't seen before. I would have polished it more if I had.

I'm not raising a stink about it because I want the feature to land - it's the whole reason I submitted the first patch. And since it's a refactor of a patch I initially submitted (and "Signed-off-by,") you can make the argument that I signed off on the parts of my code that were incorporated.

But so far as I can tell, there's nothing keeping you from adding "Co-developed-by" and "Signed-off-by Jim-Bob Someguy" to the bottom of your submission. Maybe a lawyer would eventually be mad at you if Jim-Bob said he didn't sign off.

There's no magic pixie dust that gives those incantations legal standing, and nothing that keeps LLMs from adding them unless the LLMs internalize the new AI guidance.
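To make that concrete: these trailers are nothing but "Key: value" text lines in the last paragraph of a commit message. A toy parser (a sketch; git's real trailer handling via `git interpret-trailers` is more elaborate) shows there is no verification machinery anywhere:

```python
# Minimal sketch: commit trailers are just "Key: value" lines in the final
# paragraph of a commit message. Nothing here checks identity or consent.
def parse_trailers(message):
    last_paragraph = message.rstrip().split("\n\n")[-1]
    trailers = []
    for line in last_paragraph.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # only lines that actually look like "Key: value"
            trailers.append((key, value))
    return trailers

msg = """Fix buffer sizing in example driver

Signed-off-by: Jane Dev <jane@example.org>
Co-developed-by: Jim-Bob Someguy <jimbob@example.org>"""

print(parse_trailers(msg))
```

Anyone (human or LLM) can type those lines; whatever force they carry comes from the surrounding legal and social conventions, not from the format itself.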

rwmj2 days ago
The way you describe it, the developers all did the right thing. You contributed something to the patch, and even if it wasn't in your preferred final form (and it's basically never going to be for a kernel contribution of any significance), you were correctly credited.

If you didn't want to be credited you should have said.

Signed-off-by probably has some legal weight. When you add that to code you are making a clear statement about the origins of the code and that you have legal authority to contribute it - for example, that you asked your company for permission if needed. As far as I know none of this has been tested in court, but it seems reasonable to assume it might be one day.

bsimpson2 days ago
The problem is they've got a doc that declares "when you say balacalaboozy, you're declaring that a specific set of legal conditions is met. You must say balacalaboozy to proceed."

Newcomers see everyone saying balacalaboozy, so they say it too. It doesn't mean that they have read or agreed to the doc that declared its meaning.

LLMs are the world's most sophisticated copycats. Surely they too will parrot balacalaboozy, unless their training is updated to include, understand, and consistently follow these new guidelines.

zahlman2 days ago
> You contributed something to the patch, and even if it wasn't in your preferred final form (and it's basically never going to be for a kernel contribution of any significance), you were correctly credited.

I don't see how the "signed-off-by" attestation constitutes correct credit here. It's claiming that GP saw the final result and approved of it, which is apparently false.

ipython3 days ago
Glad to see the common-sense rule that only humans can be held accountable for code generated by AI agents.
sarchertech3 days ago
This does nothing to shield Linux from responsibility for infringing code.

This is essentially like a retail store saying the supplier is responsible for eliminating all traces of THC from their hemp when they know that isn’t a reasonable request to make.

It’s a foreseeable consequence. You don’t get to grant yourself immunity from liability like this.

zarzavat2 days ago
Shield from what exactly? The Linux kernel is not a legal entity. It's a collection of contributions from various contributors. There is the Linux Foundation but they do not own Linux.

If Linux were to contain 3rd-party copyrighted code, the legal entity at risk of being sued would be... Linux users, which, given how widely deployed Linux is, is basically everyone on Earth, including all large companies.

Linux development is funded by large companies with big legal departments. It's safe to say that nobody is going to be picking this legal fight any time soon.

sarchertech1 day ago
The Linux DCO system was designed to shield Linus and the Linux Foundation from copyright and patent infringement liability, so they were certainly worried that it was a possibility.

However, there is no legal precedent that says that because contributors sign a DCO and retain copyright, the Linux Foundation is not liable. The entire concept is unproven.

Large company legal departments aren’t a shield against this kind of thing. Patent trolls routinely go after huge companies and smaller companies routinely sue much larger ones over copyright infringement.

SirHumphrey3 days ago
Quite a lot of companies use and release AI written code, are they all liable?
sarchertech3 days ago
1. Almost definitely if discovered

2. Infringement in closed source code isn’t as likely to be discovered

3. OpenAI and Anthropic enterprise agreements agree to indemnify (pay for damages essentially) companies for copyright issues.

theshrike791 day ago
What would be "discovered" exactly? You can't patent a basic CRUD application.

There has to be an analogy to music or something here - except that code is even less copyrightable than melodies.

Yes, there might be some specific algorithms that are patented, but the average programmer won't be implementing any of those from scratch, they'll use libraries anyway.

nitwit0052 days ago
Yep, and honestly it's going to come up with things other than lawsuits.

I've worked at a company that was asked as part of a merger to scan for code copied from open source. That ended up being a major issue for the merger. People had copied various C headers around in odd places, and indeed stolen an odd bit of telnet code. We had to go clean it up.

LtWorf2 days ago
Headers are normally fine. GPL license recognises that you might need them to read binary files.
lukeify2 days ago
An open-source project receiving open-source contributions from (often anonymous) volunteers is not even close to analogous to a storefront selling products with a consumer guarantee they are backing on the basis of their supply chain.
sarchertech1 day ago
Do you think that Goodwill should be able to offload all liability for everything they sell at their thrift shops to their often anonymous donors?

Linus makes $1.5 million per year from the Linux foundation. And the foundation itself pulls in $300 million a year in revenue.

They are directly benefiting from contributors and if they cause harm through their actions there’s a good chance they’ll be held liable.

lukeifyabout 10 hours ago
> Do you think that Goodwill should be able to offload all liability for everything they sell at their thrift shops to their often anonymous donors?

I don't even think this is an appropriate analogy worth answering. Goodwill are selling products to consumers in a direct exchange of money-for-goods.

No one is buying Linux.

testing223212 days ago
> This does nothing to shield Linux from responsibility for infringing code.

It’s no worse than non-AI assisted code.

I could easily copy-paste proprietary code, sign my name that it’s not and that it complies with the GPL and submit it.

At the end of the day, it just comes down to a lying human.

sarchertech1 day ago
That’s the difference. In practice a human has to commit fraud to do this.

But a human just using an LLM to generate code will do it accidentally. The difference is that regurgitation of training text is a documented failure mode of LLMs.

And there’s no way for the human using it to be aware it’s happening.

testing223211 day ago
You can not accidentally sign your name saying “this code is GPL compliant”

If you can’t be sure, don’t sign.

LtWorf1 day ago
Yes, but if you do that manually you are acting in bad faith; if you ask an AI to do it, you have no idea if you are going to be liable for something or not.
testing22321about 14 hours ago
> you have no idea if you are going to be liable for something or not

In life that is a very strong indicator you should not do <thing>

newsoftheday3 days ago
> All code must be compatible with GPL-2.0-only

How can you guarantee that will happen when AI has been trained on a world full of multiple licenses, and even closed-source material used without permission of the copyright owners... I confirmed that with several AIs just now.

philipov3 days ago
You take responsibility. That means if the AI messes up, you get punished. No pushing blame onto the stupid computer. If you're not comfortable with that, don't use the AI.
sarchertech3 days ago
There’s no reasonable way for you to use AI generated code and guarantee it doesn’t infringe.

The whole "use it, but if it doesn't behave as expected, it's your fault" is a ridiculous stance.

philipov3 days ago
If you think it's an unacceptable risk to use a tool you can't trust when your own head is on the line, you're right, and you shouldn't use it. You don't have to guarantee anything. You just have to accept punishment.
adikso3 days ago
Their position is probably that LLM technology itself does not require training on code with incompatible licenses, and they probably also tend to avoid engaging in the philosophical debate over whether LLM-generated output is a derivative copy or an original creation (like how humans produce similar code without copying after being exposed to code). I think that even if they view it as derivative, they're being pragmatic - they don't want to block LLM use across the board, since in principle you can train on properly licensed, GPL-compatible data.
SV_BubbleTime1 day ago
>There’s no reasonable way for you to use AI generated code and guarantee it doesn’t infringe.

I guess we’ll need to reevaluate what copy rights mean when derivatives grow on trees?

newsoftheday3 days ago
> That means if the AI messes up

I'm not talking about maintainability or reliability. I'm talking about legal culpability.

benatkin2 days ago
If they merge it in despite it having the model version in the commit, then they're arguably taking a position on it too - that it's fine to use code from an AI that was trained like that.
XYen0n2 days ago
Even human developers are unlikely to have only ever seen GPL-2.0-only code.
tmalsburg22 days ago
Humans will not regurgitate longer segments of code verbatim. Even if we wanted to, we couldn't, because our memory doesn't work that way. LLMs, on the other hand, can totally do that, and there's nothing you can do to prevent it.
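To make that concrete, verbatim reuse is at least mechanically detectable after the fact. Here's a toy sketch (purely illustrative, not a real compliance scanner) that flags runs of identical lines shared between generated code and a known corpus:

```python
# Toy illustration, not a real license scanner: flag verbatim overlap
# between generated code and a known corpus using 3-line shingles.
def shingles(text, n=3):
    """Return the set of all n-line runs in text (whitespace-stripped)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return {tuple(lines[i:i + n]) for i in range(len(lines) - n + 1)}

corpus = "int f(int x) {\n    return x + 1;\n}\n"            # known code
generated = "/* header */\nint f(int x) {\n    return x + 1;\n}\n"

overlap = shingles(corpus) & shingles(generated)
print(bool(overlap))  # True: a 3-line run appears verbatim in both
```

Real tooling (and the litigation around it) hinges on how long a run has to be before it counts as copying, which is exactly the unsettled part.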
johanyc2 days ago
LLMs can, but do they? Is there any evidence that they spit out a piece of code verbatim without being explicitly prompted to do so? In NYT v. OpenAI, for example, NYT intentionally prompted the model to circumvent OpenAI's guardrails in order to get it to reproduce NYT articles.
tmp104232884423 days ago
Wait for court cases I suppose - not really Linus Torvalds' job to guess how they'll rule on the copyright of mere training. Presumably having your AI actually consult codebases with incompatible licenses at runtime is more risky.
Luker882 days ago
NIT: All AI code satisfies the GPL license.

Anything generated by an AI is public domain. You can include public domain in your GPL code.

I would urge some stronger requirement with the help of a lawyer. You only need a comment like "completely coded by AI, but 100% reviewed by me" to make that code's license worthless.

The only AI-generated parts that are copyrightable are the ones modified by a human.

I am afraid that this "waters down" the actual licensed code.

...We should start opening issues on "100% vibecoded" projects for relicensing to public domain to raise some awareness to the issue.

manquer2 days ago
> Anything new generated by an AI is public domain[1]

Language models do generate character-for-character copies of existing code on which they were trained. The training corpus usually contains code that is only source-available but is not FOSS-licensed.

"Generated" does not automatically mean novel or new, which is the bar needed for IP.

[1] Even this has not been definitively ruled on in courts or codified in IP law and treaties yet.

oytis2 days ago
How is one supposed to ensure license compliance while using LLMs which do not (and cannot) attribute sources having contributed to a specific response?
Lapel27422 days ago
> How one is supposed to ensure license compliance while using LLMs which do not (and cannot) attribute sources having contributed to a specific response?

Additionally there seems to be a general problem with LLM output and copyright[1]. At least in Germany. LLM output cannot be copyrighted and the whole legal field seems under-explored.

> This immediately raises the question of who is the author of this work and who owns the rights to it. Various solutions are possible here. It could be the user of the AI alone, or it could be a joint work between the user and the AI programmer. This question will certainly keep copyright experts in the various legal systems busy for some time to come.

It seems that in the long run the kernel license might become unenforceable if LLM output is used?!

[1] https://kpmg-law.de/en/ai-and-copyright-what-is-permitted-wh...

theshrike791 day ago
Either you allow LLM generated + human reviewed code or people start hiding AI use.

...and then people start going "that's AI" on every single piece of code, seeing AI generated code left and right - like normal people claim every other picture, video or piece of text is "AI".

IMO it's a lot better to let people just openly say "this code was generated with AI assistance", but still sign off on it. Because "Your job is to deliver code you have proven to work": https://simonwillison.net/2025/Dec/18/code-proven-to-work/

sheepscreek2 days ago
This is the right way forward for open-source. Correct attribution - by tightening the connection between agents and the humans behind them, and putting the onus on the human to vet the agent output. Thank you Linus.
HarHarVeryFunny2 days ago
It's a sane policy - human is responsible for what they contribute, regardless of what tools they use in the development process.

However, the gotcha here seems to be that the developer has to say that the code is compatible with the GPL, which seems an impossible ask, since the AI models have presumably been trained on all the code they can find on the internet regardless of licensing, and we know they are capable of "regenerating" (regurgitating) stuff they were trained on with high fidelity.

theshrike791 day ago
Then we get to the Code of Theseus argument: if you take a piece of code and replace every piece of it with code that looks the same, is it still the original code?

Is an AI reimplementation a "clean room" implementation? What if the AI only generates pseudocode and a human implements the final code based on that? Etc etc ad infinitum.

Lawyers will be having fun with this philosophical question for a good decade.

HarHarVeryFunny1 day ago
> Is an AI re-implementation a "clean room" implementation

Certainly not if it had seen the code (incl. having been trained on it).

The idea of clean room (dating back to IBM PC BIOS clones) is that ideas can't be copyright, but the expression of an idea (code) can be, so you have one person (or AI) write a spec of the thing you want to copy (expression -> idea) and then have another (in the "clean room") who has never seen the original code, and only seen the spec/idea, re-implement it.

If the re-implementation was being done with AI, then for this to pass as "clean room" it would need to be proved that the AI had never seen the code, only the spec.

themafia3 days ago
> All contributions must comply with the kernel's licensing requirements:

I just don't think that's realistically achievable. Unless the models themselves can introspect on the code and detect any potential license violations.

If you get hit with a copyright violation in this scheme I'd be afraid that they're going to hammer you for negligence of this obvious issue.

Joel_Mckay2 days ago
US legal consensus has set the precedent that "AI" output can't be copyrighted. Thus, technically no one can really own or re-license prompt output.

Re-licensing public domain uncopyrightable work as GPL/LGPL is almost certainly a copyright violation, and no different than people violating GPL/LGPL in commercial works.

Linus is 100% wrong on this choice, and has introduced a serious liability into the foundation upstream code. =3

https://en.wikipedia.org/wiki/Founder%27s_syndrome

https://www.youtube.com/watch?v=X6WHBO_Qc-Q

noosphr2 days ago
>Re-licensing public domain work as GPL/LGPL is almost certainly a copyright violation

Remember kids never get your legal advice from hn comments.

Joel_Mckay2 days ago
I hire specialized IP lawyers to advise me how to mitigate risk: One can't assign licenses on something no one can legally claim right to. You should do the same unless you live in India or China.

Don't become the cautionary tale, kid, as crawlers like sriplaw.com will be DMCA striking your public repos eventually. =3

https://www.youtube.com/watch?v=xkzy_420hts

kam2 days ago
> Being in the public domain is not a license; rather, it means the material is not copyrighted and no license is needed. Practically speaking, though, if a work is in the public domain, it might as well have an all-permissive non-copyleft free software license. Public domain material is compatible with the GNU GPL.

https://www.gnu.org/licenses/license-list.html#PublicDomain

Joel_Mckay2 days ago
Yes, if it is clearly labeled as such, then GPL/LGPL licensed works may be included in such products. However, this relationship cannot make such works GPL without violating copyright, and it doesn't magically become yours to re-license isomorphic plagiarized code from an LLM.

For example, one may use NASA public domain photos as you wish, but cannot register copyright under another license you find convenient to sue people. Also, if that public domain photo includes the Nutella trademark, it doesn't protect you from getting sued for violating Ferrero trademarks/patents/copyrights in your own use-case.

Very different than slapping a new label on something you never owned. =3

dataviz10003 days ago
This is discussed in the Linus vs Linus interview, "Building the PERFECT Linux PC with Linus Torvalds". [0]

[0] https://youtu.be/mfv0V1SxbNA?si=CBnnesr4nCJLuB9D&t=2003

globular-toast2 days ago
Hardly "discussed", perhaps "mentioned". Sebastian is basically an entertainer who can plug things into sockets.
martin-t3 days ago
This feels like the OSS community is giving up.

LLMs are lossily-compressed models of code and other text (often mass-scraped despite explicit non-consent) which has licenses almost always requiring attribution and very often other conditions. Just a few weeks ago a SOTA model was shown to reproduce non-trivial amounts of licensed code[0].

The idea of intelligence being emergent from compression is nothing new[1]. The trick here is giving up on completeness and accuracy in favor of a more probabilistic output which

1) reproduces patterns and interpolates between patterns of training data while not always being verbatim copies

2) serves as a heuristic when searching the solution-space which is further guided by deterministic tools such as compilers, linters, etc. - the models themselves quite often generate complete nonsense, including making up non-existent syntax in well-known mainstream languages such as C#.

I strongly object to anthropomorphising text transformers (e.g. "Assisted-by"). It encourages magical thinking even among people who understand how the models operate, let alone the general public.

Just like stealing fractional amounts of money[3] should not be legal, violating the licenses of the training data by reusing fractional amounts from each should not be legal either.

[0]: https://news.ycombinator.com/item?id=47356000

[1]: http://prize.hutter1.net/

[2]: https://en.wikipedia.org/wiki/ELIZA_effect

[3]: https://skeptics.stackexchange.com/questions/14925/has-a-pro...

ninjagoo2 days ago
> Just like stealing fractional amounts of money[3] should not be legal, violating the licenses of the training data by reusing fractional amounts from each should not be legal either.

I think you'll find that this is not settled in the courts, depending on how the data was obtained. If the data was obtained legally, say a purchased book, courts have been finding that using it for training is fair use (Bartz v. Anthropic, Kadrey v. Meta).

Morally the case gets interesting.

Historically, there was no such thing as copyright. The English 1710 Statute of Anne establishing copyright as a public law was titled 'for the Encouragement of Learning' and the US Constitution said 'Congress may secure exclusive rights to promote the progress of science and useful arts'; so essentially public benefits driven by the grant of private benefits.

The Moral Bottomline: if you didn't have to eat, would you care about who copies your work as long as you get credited?

The more people that copy your work with attribution, the more famous you'll be. Now that's the currency of the future*. [1]

You'll do it for the kudos. [2][3]

  *Post-Scarcity Future. 
  [1] https://en.wikipedia.org/wiki/Post-scarcity
  [2] https://en.wikipedia.org/wiki/The_Quiet_War, et. al.
  [3] https://en.wikipedia.org/wiki/Accelerando
martin-t2 days ago
> The Moral Bottomline: if you didn't have to eat, would you care about who copies your work as long as you get credited?

Yes.

I have 2 issues with "post-scarcity":

- It often implicitly assumes humanity is one homogeneous group where this state applies to everyone. In reality, if post-scarcity is possible, some people will be lucky enough to have the means to live that lifestyle while others will still be dying of hunger, exposure and preventable diseases. All else being equal, I'd prefer being in the first group, and my chance for that is being economically relevant.

- It often ignores that some people are OK with having enough while others have a need to have more than others, no matter how much they already have. The second group is the largest cause of exploitation and suffering in the world. And the second group will continue existing in a post-scarcity world and will work hard to make scarcity a real thing again.

---

Back to your question:

I made the mistake of publishing most of my public code under GPL or AGPL. I regret it because even though my work has brought many people some joy, and a bit of it was perhaps even useful, it has also been used by people who actively enjoy hurting others, who have caused measurable harm and who will continue causing harm as long as they're able to - in small part enabled by my code.

Permissive licenses are socially agnostic - you can use the work and build on top of it no matter who you are and for what purpose.

A(GPL) is weakly pro-social - you can use the work no matter what but you can only build on top of it if you give back - this produces some small but non-zero social pressure (enforced by violence through governments) which benefits those who prefer cooperation instead of competition.

What I want is a strongly pro-social license - you can use or build on top of my work only if you fulfill criteria I specify such as being a net social good, not having committed any serious offenses, not taking actions to restrict other people's rights without a valid reason, etc.

There have been attempts in this direction[0] but not very successful.

In a world without LLMs, I'd be writing code using such a license but more clearly specified, even if I had to write my own. Yes, a lawyer would do a better job; that does not mean anything written by a non-lawyer is completely unenforceable.

With LLMs, I have stopped writing public code at all because, the way I see it, it just makes people much richer than me even richer, at a much faster rate than I can ever achieve myself. It just makes inequality worse. And with inequality, exploitation and oppression tend to soon follow.

[0]: https://json.org/license.html

ninjagoo2 days ago
> In reality, if post-scarcity is possible, some people will be lucky enough to have the means to live that lifestyle while others will still by dying of hunger, exposure and preventable diseases.

By definition, that's not a post-scarcity world; and that's already today's world.

> It often ignores that some people are OK with having enough while others have a need to have more than others, no matter how much they already have.

Do you think that's genetic, or environmental? Either way, maybe it will have been trained out of the kids.

> it has also been used by people who actively enjoy hurting others, who have caused measurable harm

Taxes work the same way too. "The Good Place" explores these second-order and higher-order effects in a surprisingly nuanced fashion.

Control over the actions of others, you have not. Keep you from your work, let them not.

> What I want is a strongly pro-social license - you can use or build on top of my work only if you fulfill criteria I specify such as being a net social good

These are all things necessary in a society with scarcity. Will they be needed in a post-scarcity society that has presumably solved all disorder that has its roots in scarcity?

> With LLMs, I have stopped writing public code at all because the way I see it, it just makes people much richer than me even richer at a much faster rate than I can ever achieve myself.

Yes, the futility of our actions can be infuriating, disheartening, and debilitating. Comes to mind the story about the chap that was tossing washed-ashore starfish one by one. There were thousands. When asked why do this futile task - can't throw them all back- he answered as he threw the next ones: it matters to this one, it matters to this one, ...

Hopefully, your code helped someone. That's a good enough reason to do it.

KK7NIL3 days ago
> I strongly object to anthropomorphising text transformers (e.g. "Assisted-by").

I don't think this is anthropomorphising, especially considering they also include non-LLM tools in that "Assisted-by" section.

We're well past the Turing test now, whether these things are actually sentient or not is of no pragmatic importance if we can't distinguish their output from a sentient creature, especially when it comes to programming.

davemp2 days ago
> We're well past the Turing test now

Nope, there is no “The” Turing Test. Go read his original paper before parroting pop sci nonsense.

The Turing test paper proposes an adversarial game to deduce if the interviewee is human. It’s extremely well thought out. Seriously, read it. Turing mentions that he’d wager something like 70% of unprepared humans wouldn’t be able to correctly discern in the near future. He never claims there to be a definitive test that establishes sentience.

Turing may have won that wager (impressive), but there are clear tells, similar to the "how many r's are in strawberry?" trick, that an informed interrogator could reliably exploit.

martin-t3 days ago
Would you say "assisted by vim" or "assisted by gcc"?

It should be either something like "(partially/completely) generated by" or if you want to include deterministic tools, then "Tools-used:".

The Turing test is an interesting thought experiment but we've seen it's easy for LLMs to sound human-like or make authoritative and convincing statements despite being completely wrong or full of nonsense. The Turing test is not a measure of intelligence, at least not an artificial one. (Though I find it quite amusing to think that the point at which a person chooses to refer to LLMs as intelligence is somewhat indicative of his own intelligence level.)

> whether these things are actually sentient or not is of no pragmatic importance if we can't distinguish their output from a sentient creature, especially when it comes to programming

It absolutely makes a difference: you can't own a human but you can own an LLM (or a corporation which is IMO equally wrong as owning a human).

Humans have needs which must be continually satisfied to remain alive. Humans also have a moral value (a positive one - at least for most of us) which dictates that being rendered unable to remain alive is wrong.

Now, what happens if LLMs have the same legal standing as humans and are thus able to participate in the economy in the same manner?

zbentley3 days ago
If a linter insists on a weird line of code, I’m probably commenting that line as “recommended by whatever-linter”, yes.
tmp104232884423 days ago
On https://news.ycombinator.com/item?id=47356000, it looks like the user there was intentionally asking about the implementation of the Python chardet library before asking it to write code, right? Not surprising the AI would download the library to investigate it by default, or look for any installed copies of `chardet` on the local machine.
martin-t3 days ago
The comment says "Opus 4.6 without tool use or web access"
williamcotton2 days ago
"Just a few weeks ago a SOTA model was shown to reproduce non-trivial amounts of licensed code[0]."

That LLM response is describing a specific project with full attribution.

martin-t2 days ago
And it proves the code is stored (in a compressed form) in the model.
williamcotton2 days ago
So what's the legal issue here?

How does the chardet achieve this? Explain in detail, with shortened code excerpts from the library itself if helpful to the explanation.

The prompt is explicitly requesting the source!

user342832 days ago
For [0], it was supposedly shown to do it when specifically prompted to do so.

Despite agentic tools being used by millions of developers now, I am not aware of a single real case where accidental reproduction of copyrightable code has been an issue.

Further, some model providers offer indemnity clauses.

It seems like a non-issue to me, practically.

dec0dedab0de3 days ago
All code must be compatible with GPL-2.0-only

Am I being too pedantic if I point out that it is quite possible for code to be compatible with GPL-2.0 and other licenses at the same time? Or is this a term that is well understood?

compyman3 days ago
You might be being too pedantic :)

https://spdx.org/licenses/GPL-2.0-only.html It's a specific GPL license (as opposed to GPL-2.0-or-later).

philipov3 days ago
GPL-2.0-only is the name of a license. One word. It is an alternative to GPL-2.0-or-later.
kbelder2 days ago
Right, the final hyphen changes the meaning of the sentence.

"GPL-2.0-only" "GPL-2.0 only"
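To make the distinction concrete: kernel sources carry the identifier in an SPDX header comment, and the whole hyphenated token is the license name. A toy sketch (the parsing logic here is illustrative, not any official SPDX tooling):

```python
# Illustrative only: "GPL-2.0-only" is a single SPDX identifier,
# distinct from "GPL-2.0-or-later"; the trailing "-only" is part of the name.
KNOWN_IDS = {"GPL-2.0-only", "GPL-2.0-or-later", "MIT", "BSD-3-Clause"}

def parse_spdx_line(line):
    """Extract the license expression from a kernel-style SPDX comment."""
    prefix = "SPDX-License-Identifier:"
    if prefix in line:
        return line.split(prefix, 1)[1].strip(" */\n\t")
    return None

ident = parse_spdx_line("// SPDX-License-Identifier: GPL-2.0-only")
print(ident in KNOWN_IDS, ident)  # True GPL-2.0-only
```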

MyUltiDev2 days ago
Reading this right after the Sashiko endorsement is a bit jarring. Greg KH greenlit an AI reviewer running on every patch a couple weeks back, and that direction actually seems to be helping, while here the conversation is still about whether contributors will take responsibility for AI code they submit. That feels like the harder side to police. The bugs that land kernel teams in trouble are race conditions, locking, lifetimes, the things models are most confidently wrong about. I have seen agents produce code that compiles cleanly, reads fine on a Friday review, then deadlocks under contention three weeks later. Is this contributor policy supposed to be the long term answer, or a placeholder until something Sashiko-shaped does the heavy filtering on the maintainer side too?
agentultra1 day ago
How do the reviewers feel about this? Hopefully it won't result in them being overwhelmed with PRs. There used to be a kind of "natural limit" to error rates in our code given how much we could produce at once and our risk tolerance for approving changes. Given empirical studies on informal code review which demonstrate how ineffective it is at preventing errors... it seems like we're gearing up to aim a fire-hose of code at people who are ill-prepared to review code at these new volumes.

How long until people get exhausted with the new volume of code review and start "trusting" the LLMs more without sufficient review, I wonder?

I don't envy Linus in his position... hopefully this approach will work out well for the team.

KaiLetov2 days ago
The policy makes sense as a liability shield, but it doesn't address the actual problem, which is review bandwidth. A human signs off on AI-generated code they don't fully understand, the patch looks fine, it gets merged. Six months later someone finds a subtle bug in an edge case no reviewer would've caught because the code was "too clean."
ugh1232 days ago
> they don't fully understand, the patch looks fine

I don't get this part. Why is the reviewer signing off on it? AI code should be fully documented (probably more so than a human could) and require new tests. Code review gates should not change

altmanaltman2 days ago
I mean the same can happen with human-written code no? Reviewer signs off on it and subtle bug in edge case no one saw?

Or you mean the velocity of commits will be so much that reviewers will start making more mistakes?

KronisLV2 days ago
This is actually a pretty nice idea:

  Assisted-by: AGENT_NAME:MODEL_VERSION [TOOL1] [TOOL2]
I feel like a lot of people will have an ideological opposition to AI, but banning it would lead to people sometimes submitting AI generated code with no attribution and just lying about it.

At the same time, I feel bad for all the people that have to deal with low quality AI slop submissions, in any project out there.

The rules for projects that allow AI submissions might as well state: "You need to spend at least ~10 iterations of model X review agents and 10 USD of tokens on reviewing AI changes before they are allowed to be considered for inclusion."

(I realize that sounds insane, but in my experience iterated review, even by the same Opus model, can help catch bugs in the code. I feel like next-token prediction is quite error prone in and of itself; in other words, even Opus "writes" code with bugs that its own review iterations catch.)
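For what it's worth, the trailer is just an ordinary git commit trailer alongside Signed-off-by. A quick illustrative sketch of pulling trailers out of a commit message (the agent and model names below are invented, not real values):

```python
# Sketch only: extract hyphenated "Key: value" trailers from a commit
# message. The Assisted-by values here are invented for illustration.
commit_msg = """\
mm: fix example off-by-one

Signed-off-by: Jane Developer <jane@example.com>
Assisted-by: ExampleAgent:model-1.0 [compiler-check] [linter]
"""

def trailers(msg):
    """Collect 'Key: value' trailer lines (e.g. Signed-off-by)."""
    found = {}
    for line in msg.splitlines():
        key, sep, value = line.partition(": ")
        # Trailer keys are single hyphenated tokens, unlike subject prefixes.
        if sep and "-" in key and " " not in key:
            found[key] = value
    return found

t = trailers(commit_msg)
print(t["Assisted-by"])  # ExampleAgent:model-1.0 [compiler-check] [linter]
```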

WhyNotHugo2 days ago
Weird that they're co-opting the "Assisted-by:" trailer to tag software and model being used. This trailer was previously used to tag someone else who has assisted in the commit in some way. Now it has two distinct usages.

The typical trailer for this is "AI-assistant:".

rao-v1 day ago
A phenomenon I cannot explain is that this simple, clean statement of a fairly obvious approach to AI assistance somehow took this long, and Linus himself, to state so cleanly.

Are there other popular repos with effectively this policy stated as neatly that I’ve missed?

bonzini1 day ago
The wording might be more or less lawyerly but the idea is fairly common, e.g. https://openinfra.org/legal/ai-policy (OpenStack).
aprentic2 days ago
I like this. It's an inversion of the old adage, "a poor craftsman blames his tools", and the corollary, "use the right tool for the job" (because a good craftsman chooses the appropriate tool).

You don't get to bang on a screw and blame the hammer.

feverzsj2 days ago
Linux is funded by all these big companies. Linus couldn't block AI pushes from them forever.
becquerel2 days ago
He's been vibecoding some stuff himself personally, on one of his scuba projects. You could take people as actually believing in the things they do and say.
paganel2 days ago
Correct, in the end big money talks.
simianwords1 day ago
This is some ridiculous cope.
LtWorf1 day ago
True, Linus never believed in free software to begin with.
KhayaliY2 days ago
We've seen in the past, for instance in the world of compliance, that if companies/governments want something done or make a mistake, they just have a designated person act as scapegoat.

So what's preventing lawyers/companies having a batch of people they use as scapegoats, should something go wrong?

lowsong3 days ago
At least it'll make it easy to audit and replace it all in a few years.
zxexz2 days ago
I like this. It's just saying you have responsibility for the tools you wield. It's concise.

Side note, I'm not sure why I feel weird about having the string "Assisted-by: AGENT_NAME:MODEL_VERSION [TOOL1] [TOOL2]" in the kernel docs source :D. Mostly joking. But if the Linux kernel has it now, I guess it's the inflection point for...something.

gorgoiler1 day ago
Having the competence to put together a good patch used to be a proxy that you were motivated to stick around and fix any regressions you caused and that you were worth investing in, as a community member.

Or, to put it another way, in the old days in order to be a 3k-LoC PR wielding psychopath intent on making your colleagues miserable with churny aggro diffs from hell you at least had to be good at coding.

Nowadays, you only need to do the psychopath art — Claude will happily fill in the PR for you.

bharat10102 days ago
Honestly kind of surprised they went this route -- just 'you own it, you're responsible for it' is such a clean answer to what feels like an endlessly complicated debate.
shevy-java3 days ago
Fork the kernel!

Humans for humans!

Don't let skynet win!!!

aruametello3 days ago
> Fork the kernel!

pre "clanker-linux".

I am more intrigued by the inevitable Linux distro that will refuse any code that has AI contributions in it.

pawelmurias2 days ago
Tardux Linux
baggy_trough3 days ago
Sounds sensible.
deadbabe2 days ago
How can we automate the disclosure of what AI agent was used in a PR and the extent of code? Would be nice to also have an audit of prompts used, as that could also be considered “code”.
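One way to automate at least the disclosure half would be a CI gate on the commit message trailer; a hypothetical sketch (the `Assisted-by:` tag mirrors the kernel docs, everything else here is invented for illustration):

```shell
# Hypothetical pre-merge gate: succeed only if the commit message
# carries an Assisted-by: trailer. Reads the full message on stdin.
check_assisted_by() {
  grep -qE '^Assisted-by: [^ ]+' -
}

# Demo run against a made-up commit message.
if printf 'fix: foo\n\nAssisted-by: some-agent:v2\n' | check_assisted_by; then
  echo "trailer present"
else
  echo "trailer missing"
fi
```

Auditing the prompts themselves is harder, since they never enter the repository unless contributors paste them into the commit message or cover letter.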
spwa43 days ago
Why does this file have an extension of .rst? What does that even mean for the file format?
jdreaver3 days ago
https://en.wikipedia.org/wiki/ReStructuredText

This format really took off in the Python community in the 2000s for documentation. The Linux kernel has used it for documentation as well for a while now.
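For anyone who hasn't seen it, a few lines of reST look like this (a made-up fragment, not from the kernel tree):

```rst
Example heading
===============

reST marks up inline literals with ``double backquotes`` and links like
`ReStructuredText <https://en.wikipedia.org/wiki/ReStructuredText>`_.

.. note::

   The kernel builds its Documentation/ tree into HTML with Sphinx,
   which consumes these ``.rst`` sources.
```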

adikso3 days ago
reStructuredText. Just like you have .md files everywhere.
SV_BubbleTime2 days ago
Everyone missed a great opportunity to lie to you and tell you that the Linux kernel now requires you to program in rust.
NetOpWibby2 days ago
inb4 people rage against Linux
SV_BubbleTime2 days ago
Scroll down, some nerds have no chill.
NetOpWibby2 days ago
Good grief
gnarlouse2 days ago
I wonder if this is happening because Mythos
gh2k1 day ago
this was my initial reaction. It feels like we would rather merge the Mythos/Glasswing fixes before the model becomes widespread. The rumours/vibe are that there are security issues that are going to need responsible patching before the 0-day exploits arrive. If this means implementing a broader AI contribution policy now, it seems practical.
bitwize3 days ago
Good. The BSDs should follow suit. It is unreasonable to expect any developer not to use AI in 2026.
vips7L2 days ago
It’s perfectly reasonable. We’ve been doing it for decades. It’s completely unreasonable to expect every developer to use “ai”, especially when it comes at such a heavy monetary cost.
rwmj2 days ago
Interesting that coccinelle, sparse, smatch & clang-tidy are included, at least as examples. Those aren't AI coding tools in the normal sense, just regular, deterministic static analysis / code generation tools. But fine, I guess.

We've been using Co-Developed-By: <email> for our AI annotations.
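For readers unfamiliar with coccinelle: it applies "semantic patches" that look like unified diffs with declared metavariables. An illustrative (hypothetical) rule, in the style of the classic kfree examples:

```cocci
// Hypothetical semantic patch: drop a redundant NULL check,
// since kfree(NULL) is a no-op. "expression E" declares a
// metavariable matching any C expression at the call site.
@@
expression E;
@@
- if (E != NULL)
-         kfree(E);
+ kfree(E);
```

Rules like this are fully deterministic pattern rewrites, which is what makes listing them next to LLM-based tools feel slightly odd.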