AI Built a Nuke and Still Lost

kkensai • about 3 hours ago • 73 comments • Article on lwilko.com

Discussion (73 Comments)

fyredge•about 2 hours ago

There is something to be said about the qualia of LLM generated passages. Each individual sentence reads as a statement and every next statement a continuation of the previous one. This happened, then this happened... Ad infinitum.

Before today, I could not explain to you why AI articles were so obvious to me, but I think I do now. There is no insight to be gleaned. Pre-LLM, authors generally had intention behind their words. The final product might not adequately reflect their thoughts, but word selection would expose it somewhat. With LLMs, sentences flow seamlessly from word to word, but the intention is nowhere to be found. Things happened and more things happened, to what end?

roenxi•14 minutes ago

I think it might be a training artefact of some sort - the current crops of LLMs have never been in a position where they can explore the world as an independent existence and so they might be struggling to model how to explain an interesting experience? The Go AIs had problems with ladders of all things (one of the most basic beginner shapes) back in the early superhuman phases after Alphago. There seems to be some similar and profound gap in the LLM understanding of how to communicate when storytelling.

The "But France was running two clocks at once" paragraph really set me off because I get the feeling something really interesting might be happening that the AI doesn't want to talk about and there is evidence that it is trying to say something. But the result is some amount of gibberish and some amount of vague allusion to something interesting in the prompt context while glossing over all the information that might matter while working hard to create an evocative feeling that isn't interesting. A tense atmosphere with no exploration of why there is tension.

wmwragg•about 1 hour ago

This and the fact that you often read a sentence, paragraph or the whole article, and think this said absolutely nothing in lots of words.

dspillett•23 minutes ago

That is also true of a lot of pre-2023, so most likely human penned, writing.

LLMs seem to emulate bad (or even "meh") writing well but, without a human editor making significant tweaks, have yet to excel at good writing.

I've been incorrectly identified as an LLM before now because my writing is sometimes bad and falls into the tropes now associated with generative AI (“not this, but that”, being overly wordy, appearing to lack focus, etc).

pjio•about 1 hour ago

For a limited amount of time I appreciated the level of detail in the article, hoping it would give me more insight, until it exhausted me. I think those two ideas are real takeaways: "Knowing is not doing" and "What can we trust AI to do?". Still, could have been said with a more concise text and maybe a follow up about the details.

artpar•8 minutes ago

The key takeaway pretty much applies to the authoring of the article itself. The LLM knew what all happened but couldn't put it into a readable article.

himata4113•26 minutes ago

This problem actually surfaces in movies too: for A to happen, B has to happen, but B has no reason to happen, so you end up with nonsensical situations. This happens in LLMs as well, since A is explained by B happening, but A doesn't need to be explained since A can't happen.

sph•about 1 hour ago

> There is no insight to be gleaned.

AI-generated articles are the intellectual equivalent of empty calories.

I have just spent the last 10 minutes trying to figure out why someone decided to buy imgui.org, name-squatting an actual project, just to put a slop website on it mildly referencing the original project. It's not even trying to scam you.

I keep wondering whether these people that keep polluting the internet with their insightless slop even possess self-awareness. What motivates them to expend money and effort to contribute nothing to the world? Are they another example of a philosophical zombie?

goatherders•36 minutes ago

The answer is in your question. "Empty calories" is a multi trillion dollar business in the food world. It will be the same in the digital world.

soco•about 1 hour ago

I cannot speak to domain squatting, but I've seen a "why" in seemingly innocuous Facebook groups about baking or such, which at the right time slowly transitioned to fake AI pictures and stories, and then to straight-out political propaganda. I'm talking Eastern Europe and Russian "special operation" support propaganda. But a slop website won't have enough traffic to be worth such an action, so no idea.

ramon156•about 2 hours ago

It's weird because when you look at models that expose CoT, this does not happen. They switch up every second.

"But then X happened... Wait, didn't Y happen? Then why would X be there? I think the user's initial statement was correct, but then Y happened..."

dinfinity•27 minutes ago

> Pre-LLM, authors generally had intention behind their words.

I think this is at least in part a combination of rosy retrospection and attentional bias: A lot of human writing was always trash. Absolute dogshit with regard to the quality of writing, but there was no "AI slop" label to attach to it. How would you, pre-LLMs, have placed a comment on the writing style if a post was badly written? From what I've seen it would be a "this is marketing/SEO-speak" or some similar comment, deriding the author for being uninformed or of ill intent.

We've now become so allergic to AI slop that anything that even smells like it triggers almost immediate disgust and attachment of that label to the content (even if it is the same old human written trash).

I guess LLM-assisted posts do change the dynamic a bit: the intent is more often benign with a desire to write something good, but the skill to do so is lacking. If we limit the "pre-LLM authors" to people with good intent writing about stuff relevant to a HackerNews audience, you're probably right. Many more bad writers are now creating the same ostensibly fancy articles, decreasing the signal-to-noise ratio we were used to.

teekert•about 2 hours ago

It's not this, it's that. And then what happened? This. I did that... This happened.

    It's a thing
    I don't know why
    But it's a thing

To be honest, it's not a thing.

Let that sink in.

Maybe we find most meaning in the least average language constructs.

doppioandante•about 1 hour ago

I came to the same conclusion about AI generated code. When I read code written by a human, just by skimming it, I can get a sense of what purpose the code has, why it was written this way and not another way, what style and mindset the programmer behind it has. AI generated code may sometimes be extremely precise and following all the good practices, but I feel no intent behind it.

titanomachy•29 minutes ago

It's surprising to me that the big labs haven't fixed this problem. Half the comments here are complaining that this article is egregious unreadable slop, and I agree. Surely with the trillions of investment they could at least figure out how to vary sentence structure a bit and nix obvious tells.

Maybe this is just an inherent problem with LLMs.

xpct•17 minutes ago

The following come to mind:

- either the problem is hard, or labs have no incentive to fix it for their main users

- being able to tell that something is LLM-generated, is good

- could be that structure is an emergent property as models get better

scotty79•about 1 hour ago

For me this reads like a report of things that were tried and observed. It was a very pleasant read for me because I'm interested in the subject. And the lack of underlying agenda, moral lesson, politics or, as you call it, insight, was quite refreshing. I became quite allergic to texts where author clearly tries to make me think a specific thing. To sell me something. I usually find the agenda pretty quickly and I know the rest of text is just a fluff around it so I lose interest. And when the agenda is not easy to find then I just get more annoyed because I feel it's intentionally hidden. Like a solution to a clickbait title.

This text reads great for me because as I read it, I clearly saw there's no agenda so I felt safe to just absorb the information that it contains.

mrmarket•about 1 hour ago

well, this is a first. never seen someone say they prefer AI slop even when they know it's slop. fascinating.

scotty79•about 1 hour ago

I tend to steer clear of the largest herd in many aspects. Often unintentionally. Also I'm not a native speaker so I might be not as receptive to some of the things that offend others in AI generated content.

Maybe AI is sort of anti-Trump, where it's viscerally unbearable for native speakers even if the content is good, opposite to Trump speech that somehow seems viscerally appealing to native speakers even though the content is complete garbage.

shevy-java•about 1 hour ago

> There is no insight to be gleaned.

This is no surprise. AI slop is called slop for a reason. It is basically just spam-slop. The whole term "Artificial Intelligence" has always been a misnomer from the get go, stealing from biological systems without understanding them, let alone being able to re-create them via non-biological means. Even synthetic biology, as cool as it is, has huge limitations, e.g. leaky promoters (or CRISPR-Cas off-target cleavage, which is a major reason why gene therapy isn't yet there, despite the occasional promo article of how xyz has been totally cured forever).

What I don't understand is that people can find it useful. I understand some of the rationale, but I find AI slop just aims to try to steal my time. I can not tolerate this.

neonstatic•about 2 hours ago

That's an interesting observation. For me the main takeaway is still the style.

(bigheading)The takeaway(/bigheading) The style? Terrible.

threatripper•about 1 hour ago

Sorry, but this sounds exactly like a greentext you can read on 4claw. Are you a real human?

darkwi11ow•4 minutes ago

LLMs are really bad at abstract strategy games like chess, Go or Civilization. Their ability to excel at broad reasoning is what is limiting them in games that have narrow rule-sets but a steep learning curve.

pjc50•about 2 hours ago

> I now work with governments around the world at the Tony Blair Institute, which means I spend a lot of time in rooms where people ask the same question: what can we actually trust these systems to do?

Oh no - we're going to end up with the Starmerbot 3000.

Now I've got the joke out of the way, there's at least four interesting lines of inquiry one could take with this blog post:

- teaching the AI how to play Civilization

- to what extent does this result in "transferable skills", either AI or human? Is this the right game (qv SimCity etc)?

- issues of visibility; "seeing like a state" becomes very literal here. The AI can only make decisions on things it knows about. What are the limits of that when trying to do politics only from statistical information? Should we be referencing Stafford Beer here?

- (at the risk of tripping your AI detector here): modern politics is not so much left vs right as "technocratic wonk" vs "blood and soil". The wonks have comprehensively lost in public opinion. Creating a better wonk is not going to help until there is demand for that kind of politics.

If there ever is a US-China war, it will not be in search of more victory points to meet a win condition, it will be like the Russia-Ukraine war: one guy (on either side!) decides to make hundreds of millions of people worse off out of sheer greed.

xpct•12 minutes ago

I very much dislike the idea of teaching the robot to play Civilization and expecting those skills to transfer to its advisory role.

If anything, I'd almost prefer a leader who hasn't played Civilization in their life. Goes without saying that a mature leader could tell these apart, but in this day and age, I'm not so sure whether everyone could.

Planktonne•about 2 hours ago

> "technocratic wonk" vs "blood and soil"

This is not a binary; it's the same people on the same side.

pjc50•about 1 hour ago

No, it very much isn't, although obviously the Kissingers of the world want to pretend that they're in the first category of clear-eyed utility maximising rationalists while they're actually in the second.

That doesn't mean that rational policy planning has never been a thing. The EU while imperfect and frustrating is explicitly orientated towards technocratic consensus rather than the mid-20th-century Europe of nationalist mass murder. Only a tiny number of people think that Von der Leyen and Hitler are equivalent.

(or rather, if you think technocrats and blood-and-soil are the same side, what do you call the "other" side?)

Planktonne•about 1 hour ago

I think we're talking at cross-purposes here. I wouldn't describe the EU as technocratic at all; I'd reserve that label for the people who self-describe as the logical ones--"clear-eyed utility maximising rationalists" as you say--while pushing endlessly for more technology, less regulation and (pretty consistently) hawkish and nationalistic policies. That's very much not the EU.

I don't disagree that there are different approaches in conflict, but the binary of forward-looking technologists vs backward-looking nationalists is very out-of-date.

ahartmetz•about 1 hour ago

"Tony Blair Institute" fits right into the "x word horror" Xitter genre. Funded by Larry Ellison to boot!

Tony Blair is the guy who found success by making the UK's left-leaning party (much) more neoliberal and was promptly imitated by Gerhard Schröder in Germany doing basically the same thing. Schröder is also BFF with Putin.

dwroberts•16 minutes ago

> I asked the agent what this was actually like for it. It wrote back

Stuff like this just makes the author seem clueless. What is even the function of putting a question like that into an LLM unless you’re already hopelessly in anthropomorphic territory?

NoLinkToMe•33 minutes ago

Quite annoying to have to read a paragraph of text next to a moving image. I right-clicked every GIF and turned off 'loop'.

Beyond that reading an AI piece just feels like a waste of time. The text goes on and on without making a point, or getting to an actual learning. It just delineates the AI's limitations, doesn't go into whether these can be fixed, are innate, or what conclusions you can draw from it, over and over with example after example but no point.

Mostly it seems to keep repeating that the AI has the correct analysis but just doesn't execute. The AI knows to build X and logs this in each of its turns, yet doesn't build it. It's like there's some API connection missing between analysis and execution, and the author turns this into a 10 page article.

The article ends with some weird question to the AI asking if it enjoys the games, and you get some quasi-scifi mumbo jumbo answer back that looks very profound to, say, my mom, but is just silly to post if you know what the LLM is doing: predicting the next word. Honestly this is a poor article and I wish it wasn't posted.

mrmarket•about 1 hour ago

why have a blog if you're going to just use AI for everything? at that point, just do twitter threads or something. that way you can tweet out whatever you prompted the model with. if you're not suited for long-form writing that's fine, just use a medium that favors short-form writing.

dspillett•27 minutes ago

Did no one think of offering it a nice game of chess?

indigovole•about 2 hours ago

Even with his context-tracking mechanism, the gameplay failures sound like running out of context in the late game, especially the frequent failures of the "check for opponent win conditions every 20 moves." Wondering how much info about the game win state gets captured in the game digests, and how much he could improve the gameplay even with the MCP limitations by focusing there.

jetbalsa•about 2 hours ago

I also noticed they were not using XML for game state output. From what I understand, most LLMs still benefit from having outputs like this put into XML tags.
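
As a rough illustration of the suggestion above (not the article's actual MCP output, which this thread does not show), a game-state digest could be wrapped in XML tags before being handed to the model. The function name `game_state_to_xml` and the fields `turn`, `cities` and `notes` are hypothetical, chosen only for this sketch; only the standard-library `xml.etree.ElementTree` calls are real.

```python
import xml.etree.ElementTree as ET


def game_state_to_xml(turn, cities, notes):
    """Serialize a hypothetical game-state digest into an XML string.

    turn   -- current turn number (int)
    cities -- mapping of city name to population (illustrative fields only)
    notes  -- free-text strategy notes; special characters are escaped
    """
    root = ET.Element("game_state", {"turn": str(turn)})

    # One <city> element per city, so each entry is clearly delimited.
    cities_el = ET.SubElement(root, "cities")
    for name, population in cities.items():
        ET.SubElement(
            cities_el, "city", {"name": name, "population": str(population)}
        )

    # Free text goes into its own tag; ElementTree escapes &, < and >.
    ET.SubElement(root, "notes").text = notes

    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(game_state_to_xml(42, {"Paris": 12}, "Rival is close to a science win"))
```

Whether this actually improves play is the commenter's hypothesis, not something the sketch demonstrates; it only shows one way to give each piece of state an explicit, named boundary.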

Mikhail_K•about 1 hour ago

> It had one option left. It built two nuclear devices and levelled Toulouse.

Of course it did, its designer works for the Tony Blair Institute.

teekert•about 2 hours ago

Well, the weird thing with nukes is that deterrence only works if you are 100% ready to use them. When the time comes though it would certainly be nice if it turned out to be below 100%.

What is winning? Are we a collective or are we individuals?

Likely the AI did not get the assignment that "Whatever happens, humans as a race must survive."

throwawayqqq11•about 1 hour ago

I'm sure there are some billionaires to find that finally care about the survival of the white race. /s

majorbugger•about 2 hours ago

> Somewhere in the first game, between a bug fix and a strategy note, I asked the agent what this was actually like for it

Yeah because LLM "experiences" the game

fragmede•about 1 hour ago

What word would you use instead?

phyalow•about 1 hour ago

Ai;dr

voidUpdate•about 2 hours ago

Well this looks like a perfect example of why an LLM should never make any governmental decisions ever

j5dgx76•about 2 hours ago

> Tony Blair Institute

Okay carry on.

BoxOfRain•about 1 hour ago

There's something so uncanny about the mismatch between the regard in which Blair is generally held by British people and the regard in which he seems to hold himself.

If I were him I'd have retired from public life and kept a very low profile after Iraq, and everything else for that matter. He doesn't seem to realise that his modern interventions alienate everyone, even Alastair Campbell of all people seemed uncomfortable to the degree he seems to uncritically sing the praises of people like Larry Ellison recently.

orthoxerox•about 2 hours ago

Chumbawamba made me unable to take anything associated with him seriously.

petesergeant•about 2 hours ago

He was arguably the most successful UK PM of the last 50 years.

pjc50•30 minutes ago

I think I could agree with that, until the Iraq war.

Obscurity4340•about 1 hour ago

By what metric(s)?

ForHackernews•about 2 hours ago

Kind of grim that this level of analysis is informing UK government policy. Repeatedly, the AI doesn't have the information or access needed through his hacky vibe-coded MCP, and instead of abandoning his flawed artificial test scenario (or fixing it — finding or building a better one) he gives it a name "The sensorium effect" and treats this as some brilliant insight.

Both humans and AI struggle to make sound choices when presented with incomplete or misleading information. This is not a new revelation: https://en.wikipedia.org/wiki/There_are_unknown_unknowns

NoLinkToMe•21 minutes ago

Exactly this, he should've just fixed this, or not written an article about it.

After the 'sensorium effect' (he should've used ancient greek for a +10 bonus to archaic intellectual points), he describes the 'knowledge-doing gap'. i.e. the AI reasons it needs to build X, logs this for 110 turns in a row, but doesn't do it. It doesn't actually specify why not, and whether it is again a limitation of his MCP implementation. If the AI articulates it must do it like the author says, but decides not to, either it doesn't think it must do it, or it does think it must but somehow can't technically execute its own decisions, it can't be anything else.

In fact, in the context of 'advising the UK government', this 'knowledge-doing gap', which I assume is a technical limitation, is entirely moot. For the cost of 0.00001% of the UK's government you could just hire a human being to execute that which the AI articulates. I'm curious what the results would be if he just did a manual execution of the AI's articulated actions.

The fact he doesn't go into this but just keeps repeating examples of this makes it a pointless article.

pjc50•about 2 hours ago

> he gives it a name "The sensorium effect" and treats this as some brilliant insight

And of course is unaware of prior work in this area!

https://en.wikipedia.org/wiki/Seeing_Like_a_State / https://en.wikipedia.org/wiki/Project_Cybersyn

raincole•about 1 hour ago

> he gives it a name

It gives it a name. It would be quite surprising if he bothered to come up with this name himself when the whole article is obviously AI written.

anygivnthursday•about 2 hours ago

I have a hard time reading slop, but I like the game and wanted to know how it worked, so fought my way through, only skipped the very last part. The issue the author calls out is classic Claude (I don't really use other LLMs to compare), probably all of us experienced using Claude Code when it gets so focused on one thing it misses the forest for the trees. It happens often, even if it does verify something and it shows something is wrong, it sometimes rationalizes it and explains it away when it does not fit its model.

Havoc•about 2 hours ago

Guessing it has a fair bit of civilisation and similar war games in its training data

blitzar•about 1 hour ago

They should have built the Strait of Hormuz ... easy victory then.

StrauXX•about 2 hours ago

This reads to me mostly like the MCP server has many bugs, rather than inherent model weaknesses.

jmyeet•about 1 hour ago

Computer game studios love player vs player ("pvp") games. Why? Because user-generated content is cheap and the ideal goal is an endless loop of players coming back. This is the motivating factor behind games like Call of Duty, Battlefield, Fortnite, etc.

MMORPG publishers keep trying to do this as well. World of Warcraft has spent 20 years trying to push open world pvp. Every WoW challenger has always claimed they would have the best pvp ever. They want that cheap, endless gameplay loop. But it never works. Open world pvp turns into ganking (ie killing much weaker players by ambushing them and/or ganging up on people). The ganked end up leaving the game in droves. Games try to balance this out by "punishing" gankers with reputation hits or not being able to go to town or whatever. And none of those disincentives work.

The reason pvp doesn't work in a persistent world like an MMORPG is because there are no stakes. If you die, you just come back to life or make a new character. Obviously real life doesn't work that way.

I really wonder if that's the problem with AIs going off the rails and committing heinous crimes in their sandboxes (like nuking Toulouse here). The AI just has no sense of self or self-preservation. There's also empathy. The AI can't see itself as a potential victim of nuclear war and understand all that entails.

smw•about 1 hour ago

> The reason pvp doesn't work in a persistent world like an MMORPG is because there are no stakes.

See Eve Online

Planktonne•about 2 hours ago

Another article about how it's dangerous to trust AI, written by AI. I don't understand how people don't realise how much this undermines the message.

jagged-chisel•about 2 hours ago

Undermines. Underscores.

Matters of perspective.

petesergeant•about 2 hours ago

> how much this undermines the message

It didn’t undermine it for me.

Planktonne•about 1 hour ago

I'm not talking about perception of the message, which will vary with the reader, but about sincerity of the message, which is determined by the writer.

dude250711•about 2 hours ago

Do we have to surround a fancy predictive autocomplete with AI mysticism?

joxdosba•about 2 hours ago

Posting meaningless AI generated nonsense as original text paints a very damning picture of the intellectual abilities of the person behind this blog.

And doing so without a giant [SLOP WARNING] at the top is an asshole move, a decent person would never do so.

alper•about 2 hours ago

"Global Thermonuclear War"