Discussion (61 comments)
> and since nobody stepped up to help us deal with the influx of the AI-generated bug reports we need to move it out of tree to protect our sanity.
This thread from the linux-hams mailing list [2] has more insight into this decision. I guess the silver lining is that more modern protocols (in userspace), written in modern languages, will become the norm for ham radio on Linux now.
[1] : <https://lwn.net/ml/all/20260421021824.1293976-1-kuba@kernel....>
[2] : <https://lore.kernel.org/linux-hams/CAEoi9W5su6bssb9hELQkfAs7...>
That's really it. The list of things that "need" to be in the kernel is shrinking steadily, and the downsides of having C code running in elevated privilege levels are increasing. None of that is about LLMs at all, except to the extent that it's a notable inflection point in a decades-scale curve.
The future, as we basically all agree, puts complexities like protocol handling and state into daemons and leaves only hardware, process, and I/O management in the kernel.
Basically, Tanenbaum was right about the design but wrong about the schedule and the path to get there.
Why do they even need to be in the kernel repo, rather than brought in at or after install time?
I wrote and maintained 10GbE drivers for a small company in the 2000s, and just the shim file for our Linux driver, which smoothed over API differences, was well over 1,000 lines. I think it was close to the same size as the entire driver for one of the BSDs.
People have been asking this question since Linux was first invented…
Xbox/PS controllers, for example. I believe some old RAID controller and WiFi drivers were removed too. Whatever they don't want to support.
The only problem here, if any, is the false sense of confidence given by LLMs to people who have no business touching kernel code.
In terms of quality ("are there bugs that professional humans can't see at any budget but LLMs can?") - it's not very clear, because Opus is still worse than a human specialist, but Mythos might be comparable. We'll just have to wait and see what results Project Glasswing gets.
Either way, cybersecurity is going to get real weird real soon, because even slightly-dumb models can have a large effect if they are cheap and fast enough.
EDIT: Mozilla thinks "no" to the second question, by the way: "Encouragingly, we also haven’t seen any bugs that couldn’t have been found by an elite human researcher.", when talking about the 271 vulnerabilities recently found by Mythos. https://blog.mozilla.org/en/firefox/ai-security-zero-day-vul...
Being flooded with these kind of reports can make the actual real problems harder to see.
The plural of "Opus" is "Opera". Might be a tad confusing tho :)
Of course some people don't do that, and send all the reports anyway... and then scream from the hilltops about how incredible LLMs are when by sheer luck one happens to be right. Not only is that blatant p-hacking, it's incredibly antisocial.
It's disingenuous marketing speak to say LLMs are "finding" any security holes at all: they find a thousand hypotheticals of which one or two might be real. A broken clock is right twice a day.
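To put rough numbers on the broken-clock point: if a tool emits a thousand candidate reports of which only a couple are real, precision is near zero and the triage cost still lands on a human. A back-of-the-envelope sketch (the counts and review time are illustrative, not from any cited study):

```python
# Illustrative base-rate arithmetic: a scanner that "finds" 1,000
# candidate vulnerabilities of which only 2 are real has near-zero
# precision, and every report still costs human review time.
def triage_cost(reports: int, true_positives: int, minutes_per_review: int) -> tuple[float, float]:
    """Return (precision, total review hours) for a batch of reports."""
    precision = true_positives / reports
    hours = reports * minutes_per_review / 60
    return precision, hours

precision, hours = triage_cost(reports=1000, true_positives=2, minutes_per_review=15)
print(f"precision: {precision:.1%}")      # 0.2%
print(f"review time: {hours:.0f} hours")  # 250 hours
```

Even at a generous 15 minutes per report, that's weeks of maintainer time to surface a handful of real bugs, which is exactly the economics the "finding" framing hides.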
Yes, what we see coming out of the bottom of the funnel now is a little better. But it's sort of like reading day trading blogs: nobody shares their negative results, which in my direct experience are so bad they almost negate any investigative benefit. I also think part of this is that a small set of very prolific spammers were sufficiently discouraged to stop.
> "Remarkably few of them are complete false positives."
Modern LLMs with a reasonable prompt and some form of test harness are, in my experience, excellent at taking a big list of potential vulnerabilities and figuring out which ones might be real. They're also pretty good, depending on the class of vuln and the guardrails in the model, at developing a known-reachable vulnerability into real exploit tooling, which is also a big win. This does require the _slightest_ bit of work (ie - don't prompt the LLM with "find possible use after free issues in this code," or it will give you a lot of slop; prompt the LLM with "determine whether the memory safety issues in this file could present a security risk" and you get somewhere), but not some kind of elaborate setup or prompt hacking, just a little common sense.
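As a concrete sketch of the prompting difference described above (the file path, finding text, and `build_triage_prompt` helper are made up for illustration; the framing of the prompt string follows the advice in the comment, and you'd feed the result to whatever model API you use):

```python
# Sketch: frame the triage question narrowly, per the comment above.
# Asking "is this specific issue a real security risk?" gives the model
# something it can check against the code, unlike "find possible issues".
def build_triage_prompt(file_path: str, finding: str) -> str:
    return (
        f"In {file_path}, a static pass flagged: {finding}\n"
        "Determine whether this memory safety issue could present a "
        "security risk. Trace how the affected buffer is allocated, "
        "sized, and freed before answering."
    )

# Hypothetical example inputs.
prompt = build_triage_prompt("net/ax25/af_ax25.c", "possible use-after-free of sk")
print(prompt.splitlines()[0])
```

The point is just that the prompt names one finding and asks for a reachability judgment, rather than inviting the model to brainstorm a list of speculative issues.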
At the same time, a lot of these bugs were in places people weren't looking because the code isn't actually important. This kernel code had already been a longstanding problem in terms of low-effort bot-driven security reports, and nobody had any interest in maintaining it. So this was more LLM-assisted technical management than LLM-assisted security: it finally made the situation uncomfortable enough for the team to do something about it.
Another example: Mythos found a real bug in FreeBSD that occurs when running an NFS server exposed to a public connection. But... who on earth is doing that? I would guess 99.9% of FreeBSD NFS installations are on home LANs. More importantly, Anthropic spent $20,000 to find this bug. Think of it as paying a full-time FreeBSD dev for a month, and that's all they find: I'd say "OK, looks like FreeBSD has a pretty secure codebase; let's fix that stupid bug, stop wasting our money, and get you on a more exciting project."
I do think anyone who has a legacy open-source C/C++ codebase owes it to their users to run it by Claude/Codex, check your pointers and arrays, make sure everything looks ok. I just wish people were able to discuss it in proper context about other native debugging tools!
No.
Like everything else an LLM touches, it is prone to slop and hallucinations.
You still need someone who knows what they are doing to review (and preferably manually validate) the findings.
What all this recent hype carefully glosses over is the volume of false-positives. I guarantee you it is > 0 and most likely a fairly large number.
And like most things LLM, the bigger the codebase the more likely the false-positives due to self-imposed context window constraints.
It's all very well for these blog posts to say "LLM found this serious bug in Firefox," but that's only because a security analyst filtered out all the junk (and knew what to ask the LLM in the prompt in the first place).
Another way to see this: you mentioned "LLM found this serious bug in Firefox," but the actual number in that Mozilla report [2] was 14 high-severity bugs and 90 minor ones. However you look at it, that's an impressive result for a security audit, and I doubt the Anthropic team had to manually filter out hundreds to thousands of false positives to produce it.
They did have to manually write minimal exploits for each bug, because Opus was bad at it[3]. This is a problem that Mythos doesn't have. With access to Mythos, to repeat the same audit, you'd likely just need to make the model itself write all the exploits, which incidentally would also filter out a lot of the false positives. I think the hype is mostly justified.
[1] https://lwn.net/Articles/1065620/
[2] https://blog.mozilla.org/en/firefox/hardening-firefox-anthro...
[3] https://www.anthropic.com/news/mozilla-firefox-security
To be clear, I'm not saying 0% false-positive because that will always be impossible with any LLM.
However, to greatly over-simplify what I already said ...
The presence of >0 false-positives means you still need someone who knows what they are doing behind the keyboard.
The presence of an LLM, no matter how good, will never remove the need for a human with domain expertise in security analysis.
You cannot blindly fix stuff just because the LLM says it needs fixing.
You cannot report stuff just because the LLM says it needs reporting.
There may well be scope for LLM-assisted workflows, but WHO is being assisted is a critical part of the equation.
That is the fundamental point I am making.
> As part of our continued collaboration with Anthropic, we had the opportunity to apply an early version of Claude Mythos Preview to Firefox. This week’s release of Firefox 150 includes fixes for 271 vulnerabilities identified during this initial evaluation.
What commenters don't seem to understand is that especially CVE spam / bug bounty type vulnerability research has always been an exercise in sifting through useless findings and hallucinations, and LLMs, used well, are great at reducing this burden.
Previously, a lot of "baseline" / bottom-tier research consisted of "run fuzzers or pentest tools against a product; if you're a bottom feeder, just stuff these vulns all into the submission box; if you're more legit, tediously try to figure out which ones are reachable." LLMs with a test harness do an _amazing_ job at reducing this tedium; in the memory safety space, "read across 50 files to figure out if this UAF might be reachable," or in the web space, "follow this unsanitized string variable to see if it can be accessed by the user," are tasks that LLMs with a harness are awesome at. The current models are also about 50% there at "make a chain for this CVE," depending on the shape of the CVE (they usually get close given a good test harness).
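A toy illustration of the "follow this unsanitized string" task (this is a deliberately naive line-by-line scan with made-up variable and sink names, nothing like the cross-file dataflow a real harness does; it just shows the shape of the question being asked):

```python
# Naive sketch: follow a tainted variable through a snippet and report
# whether it reaches a dangerous sink without first passing through a
# sanitizer. Real tools track dataflow across files; this only handles
# simple "lhs = ...tainted..." assignments in order.
def reaches_sink(lines: list[str], var: str, sink: str, sanitizer: str) -> bool:
    tainted = {var}
    sanitized = False
    for line in lines:
        if sanitizer in line and any(t in line for t in tainted):
            sanitized = True
            continue
        # Propagate taint through simple assignments.
        if "=" in line and any(t in line.split("=", 1)[1] for t in tainted):
            tainted.add(line.split("=", 1)[0].strip())
        if sink in line and any(t in line for t in tainted):
            return not sanitized
    return False

snippet = [
    'user_input = request.args["q"]',
    'query = "SELECT * FROM t WHERE c = " + user_input',
    "cursor.execute(query)",
]
print(reaches_sink(snippet, "user_input", "execute", "escape"))  # True: no sanitizer on the path
```

The LLM-with-harness version of this is the model reading the real code and answering the same reachability question, with the harness there to let it actually exercise the path.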
It seems the concern with the unreleased models is that things have advanced once again from where they are today (where you need smart prompting and a good harness) to the LLM giving you exploit chains in exchange for "giv 0day pl0x." Based on my experience, while this has an element of puffery and classic capitalist goofiness to it ("the model is SO DANGEROUS only our RICHEST CUSTOMERS can have it!"), it is just a small incremental step and entirely plausible.
To summarize: "more efficient than all but the best" comes with too many qualifiers, but "are LLMs meaningfully useful in exercising vulnerabilities in OS kernel code," or "is it possible to accelerate vulnerability research and development with LLMs" - 100% absolutely.
And you don't have to believe one random professional (me); this opinion is fairly widespread across the community:
https://sockpuppet.org/blog/2026/03/30/vulnerability-researc...
https://lwn.net/Articles/1065620/
etc.
Yes, I don't see the point of maintaining technical debt just for the sake of it.
The security environment in 2026 is such that legacy unmaintained code is a very real security risk for obscure zero-days to exploit to gain a foot in the door.
Reading through the list I don't see it being an issue for the overwhelming majority of Linux users.
Who, for example, still uses ISDN in 2026? Most telcos have stopped all new sales, and existing ISDN circuits will be forcefully disconnected within 3–5 years as the telcos complete their FTTP build-outs and the copper network is subsequently decommissioned.
Most TV and radio stations.
I doubt it. And as I said, telcos have ceased new sales of ISDN and will be shutting down copper networks within 3–5 years.
Therefore, if there are TV and radio stations still using it, they will be forced to stop by circumstance: their ISDN will simply cease working once the telco shuts down the kit in the exchange.
- Nobody is familiar with the code
- Almost all of the recent fixes are from static analysis
- Nobody is even sure if anyone uses the code
This feels a lot like CPython culling stdlib modules and making them pypi packages. The people who rely on those things have a little bit of extra work if they want a recent kernel version, and everyone else benefits (directly or indirectly) by way of there being less stuff that needs attention.
The overlap of bugs being found, nobody caring enough to bother reading the reports or fixing the code, and nobody caring that the modules are pushed out of main seems good.
In general, drivers make up the largest attack surface in the kernel and many of them are just along for the ride rather than being actively maintained and reviewed by researchers.
[0] not trivially if you want to validate if it works
Be real with yourself: do you know anyone using ISA or PCI in 2026? Everything is built on PCIe except in specific industrial settings or on ancient hardware that's only relevant for retrocomputing. Is anyone using the ATM network protocol anymore? MPLS and Metro Ethernet mostly replaced ATM, and now MPLS is being largely supplanted by SD-WAN technologies and ordinary Internet connections. I have been doing networking in some capacity for nearly my entire career: the last time I touched X.25 or Frame Relay was in the early 2000s, the last time I touched ATM was in the mid-2000s, and the last time I touched ISDN was in the mid-2010s, and that was an IDSL setup, which is itself a dead technology. The last laptop I owned with a PCMCIA card slot was manufactured in 2008.
I don't want to see these capabilities completely disappear, but there's no reason they should ship in the mainline kernel in 2026. They should be separate kernel modules in their own tree.
Can't wait for the AI-braindead folks to get cut down to size, for everyone's good.