Discussion (107 Comments)
I don't disagree, but I've been thinking about this a bit: a lot of _human_ written code was/is less-than-fine. And a lot of human devs didn't understand the context when they wrote it.
I'm not advocating that we fire devs, or evangelizing that LLMs are awesome. But I do wish there was a slightly more honest take on the pre-LLM world: it's not just about cost reduction, it's about solving some long-term structural deficiencies of the industry.
At least at my work place this is often the case whenever we bring in contractors who will never be tasked with maintaining their code.
Middle management rarely gives us the time to fix this broken mess, but when they do, the code actually ends up in a neat, maintainable state.
We recently took 700 MB of compiled code and reduced it to under 40 MB just by getting rid of useless libraries, removing unused code, and consolidating all the build objects into one object. This code ran significantly faster too.
The result was that the release team was pissed and made a big stink about how this was causing, or was going to cause, all sorts of issues. We surmised that it was because it reduced their OT (overtime).
It's been over a year and none of the issues have ever occurred.
My biggest issue with LLM-written code is that it's often submitted by non-programmers, and it often fails the first test: "does this even work, and does it solve the issue?" That's a test I normally wouldn't even do, since no one on my team would submit code that doesn't compile or doesn't fix the problem.
You know, I hate that this is a world where I have to ask myself if this is LLM written because it is one of those patterns.
But that is beside the point of what I wanted to say anyway. Those deficiencies aren't going to be solved by LLMs, I reckon. In fact, they will likely make things worse. As you said, a lot of human devs didn't understand the context when they wrote code previously. True, but LLMs are even worse at context in many areas and still need human prompting for input.
The only thing I really see happening is that the blast radius of people not fully grasping the context and still producing something is going to be larger. More specifically, it is already larger. Previously, incompetence limited the damage people could do; now that is less of a factor.
Lol, nope, I just sound that way. :-)
I don't think anyone truly disputes this, but I don't think it's a good argument.
If we already had a sort of poor quality output when humans were at the helm, then what can we expect from AI steered by the sort of humans that produced poor quality code?
My guess would be even more poor quality code, at much higher volumes than before. It is impossible for me to see this as an improvement over "some people always made poor quality code"
Great point, and I think that's my argument: above-average engineers can now produce more above-average code. We don't need as many (any?) below-average developers moving forward.
So how do you think that will play out in reality?
And then I make a decision based on that.
I guess I'm wondering if the article is missing half the picture. Yes - AI is wrong some of the time (and that % varies based on a host of variables). But it can read code as well as just write it. And that does matter as it changes the trade-offs this article is weighing up.
Once I even asked him to create a manim video about it for me.
This is why things like SAST are topical. They miss all kinds of exploits because they don't understand the program. The more in depth you try to scan, the more the memory requirements explode.
Now LLMs are much better at this, but between context windows and costs you can bankrupt yourself pretty quick putting code bases in context memory.
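A rough back-of-envelope sketch of why that is (every number below is an illustrative assumption, not any vendor's actual pricing):

```python
# Back-of-envelope cost of repeatedly stuffing a large codebase into context.
# All figures here are illustrative assumptions, not real vendor pricing.

LINES_OF_CODE = 500_000             # assumed size of a "gigantic" codebase
TOKENS_PER_LINE = 10                # rough average for source code
USD_PER_MILLION_INPUT_TOKENS = 3.0  # assumed input price

tokens = LINES_OF_CODE * TOKENS_PER_LINE
cost_per_full_pass = tokens / 1_000_000 * USD_PER_MILLION_INPUT_TOKENS

print(f"~{tokens:,} tokens per full read, ~${cost_per_full_pass:.2f} per pass")
# Many questions a day, each re-reading the whole tree, multiplies this
# quickly -- which is why targeted search beats brute-force context.
```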
Several projects I work on call into or interact with gigantic codebases. A couple of years ago I would sometimes have to allocate 30 minutes to either trace through the code base or set up and attach a debugger to step through the code until I found the part I needed to know about.
Now I send an LLM to go find it and it comes back with a list of files and line numbers in minutes.
It’s still not perfect. I had a codebase walk this morning where even GPT-5.5 extra high failed to find something I knew was in there on the first pass. It got it with some more directed prompting. If you delegate trust 100% to an LLM you will get bit eventually.
Claude is even better at helping you understand the code base than it is at writing code! It can look at a bunch of files and give you an accurate rundown in ten minutes.
With a new code base, my first goal is to understand how it CRUDs state. What structures and what operations?
I'm less concerned at the outset whether it's sorting carrots or processing orders for shirts.
AI seems to work way better for me when I tell it to ignore the use case and focus on surfacing runtime and mathematical operations embedded in the code.
Constraints on new abstraction and telling it to stick to math terms and types and objects also seems to help avoid hallucinations and layers of useless indirection.
Claude in its default configuration has untapped potential for explaining and documenting code because it defaults to writing no comments. I added this to my global CLAUDE.md and so far so good:
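(The snippet itself isn't reproduced in the thread; a hypothetical instruction in that spirit might read: "Preserve existing comments that are still accurate, and add a brief comment explaining the why whenever a decision isn't obvious from the code itself.")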
> When code production gets cheap, the cost doesn't disappear. It migrates.
> It was true then. It is unavoidably true now.
I find the constant critique of punchy style a bit tiring. It would be more productive for the grandparent to think about the content and state an opinion.
Sometimes a "punch in the face" is useful. Makes you re-evaluate things. Being punched in the face all day, every day, is just tiring.
Any good human-written text that provides useful information is likely to highlight, in this way or in equivalent ways, the contrast between what the reader is expected to incorrectly believe and the reality.
When the reader already knows what the text has to say, that text is superfluous.
Therefore a text that provides new and unexpected information, and is thus a useful text, must use some means to explain to its readers the errors of their ways.
It may use simple juxtaposition like "it is not ... it is ..." or it may be more verbose and add "but", "however", "nonetheless", etc.
I believe that it is counterproductive to use this kind of pattern as a method for detecting AI-written texts, because it is normal for it to exist in useful human-written texts.
What should be commented on is whether that claim is true, i.e. whether the second part with "it is ..." is indeed true, or whether the whole pattern is superfluous because all of the expected readers are already aware that the first part with "it is not ..." is true.
Edit: In my observation, opinions that don't agree with yours seem to get labeled as "AI Generated" more often than opinions that do.
I'm surprised people aren't taking the time to edit this very specific kind of phrasing out of their writing. It's such a common AI tell now that, even when writing by hand, I'd just avoid it entirely.
Then again, I hated that LLMs co-opted the em-dash, and I refuse to stop using it, so I suppose I get it.
Even without touching moral/ethical/normative reasons, it's impractical. LLMs will continue to incorporate the most popular phrasings or grammars, and touchy readers will simply pivot to a new "telltale" du-jour.
Eventually any personal or organic writing will be gone, as one twists themselves into an artificial form of "the inverse of the LLM."
> Michael Bolton: "No way, why should I change? He's the one who sucks."
It's impractical to edit your AI-generated writing to put it in your own voice? People have been writing unique stuff for millennia.
And I'm saying this as somebody who is strongly against LLM-generated content of this form.
But I do have somewhat of a problem with unedited text. Personally, I even take the time to edit my HN comments.
And, for the same reason I'd have a problem watching the same episode of the same show every day, I have a problem with reading text that feels like a super derivative clone of tons of other writing. Which is usually what you get when you don't edit your AI-generated text.
Pendulums always swing back and forth between extremes, but oh boy did this one swing fast into witch hunt territory.
That's kind of similar to written content being posted and linked. There's an expectation that you are asking someone to take time to read it, and with LLMs now the cost to generate things to be read is a lot lower but our attention and capacity to read them remains the same.
One giant PR versus dozens of smaller ones, what's the difference? LLMs are going to send it your way whether you like it or not. No one is going to argue that usage of LLMs is going to lead to less code that has to be reviewed than normal, are they? It's by design since you're able to produce more code now, remember?
> There's an expectation that you are asking someone to take time to read it, and with LLMs now the cost to generate things to be read is a lot lower but our attention and capacity to read them remains the same.
I could understand this argument if this had been a 500 word blog post expanded out to 50K words, but it's not. And who's to say the author didn't write most of it and just had an LLM do a little polishing?
there are many apps with ai-generated ideas, specs, and functionality. nobody uses them because of the contempt.
in either case, the part that's user-facing is ai-resistant
The user interacts with the code, and if it's sloppy AI generated code, it's going to impact the user somehow. Be it through poor performance, bugs, security holes, you name it.
Maybe I was naive in thinking the bar was higher than "as long as I can't tell an LLM wrote it that's good enough for me."
The right place for this is usually in the design doc or commit message, and robust engineering organizations will ensure that commits are cross-referenced back to design and requirements docs so you can trace decisions from git blame back to the actual rationale.
The same process also works pretty well with LLMs. Google, for example, is internally championing a process where the engineer has a dialog with the LLM to generate a design doc, oftentimes with an adversarial LLM to poke holes in the design. Once the design is fully specified, the last step is to ask the LLM to turn the design doc into code. This creates a human-readable artifact that traces the decisions that the human and AI collaboratively made, which then can be traced back from the code.
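A minimal sketch of what that loop could look like if you scripted it yourself (the `call_llm` helper and the prompts below are assumptions for illustration, not the actual internal tooling):

```python
# Hypothetical sketch of a design-doc-first workflow with an adversarial
# review pass. `call_llm` is a stand-in for whatever model API you use.

def call_llm(system: str, prompt: str) -> str:
    raise NotImplementedError("wire this up to your model provider")

def design_then_implement(requirements: str, review_rounds: int = 2) -> tuple[str, str]:
    # 1. Dialog with the model to turn requirements into a design doc.
    design = call_llm("You are a software architect. Write a design doc.", requirements)
    # 2. An adversarial pass pokes holes; the design is revised to address them.
    for _ in range(review_rounds):
        critique = call_llm("You are an adversarial reviewer. Find flaws in this design.", design)
        design = call_llm("Revise the design doc to address the critique.",
                          f"DESIGN:\n{design}\n\nCRITIQUE:\n{critique}")
    # 3. Only once the design is settled does the model turn it into code.
    code = call_llm("Implement exactly what this design doc specifies.", design)
    return design, code  # the design doc remains the human-readable artifact
```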
Smart. They'll be able to regenerate code in the future, with better LLMs. It also lets them redo the architecture combined with other parts of the system as context grows.
It must be in order to have an effective AI-assisted software engineering workflow, but it's not something that's magically done for you (usually). Knowledge management of project context is everything. Even the little music app I'm building has >100 ADRs (https://en.wikipedia.org/wiki/Architectural_decision), and that's just one form of project memory.
Or if it's more product level decisions than technical, those conversations tend to be documented in tickets. Which, again, that's how this already worked.
Where are you keeping your design? If it's a WYSIWYG doc like word, what would it take for you to switch to a more text driven world?
Intent is something that is already sorely missing in most software projects.
At best you might get good commit messages, or messages that link back to good issue/PR discussion.
But something I realize is that /implementation/ intent is also important, not just higher level "why are we here in the first place" intent.
We tend to cargo cult the idea that you shouldn't have to explain what you're changing because it can be derived from the change. e.g. Commit messages should only explain the why since the what is redundant.
But the what is not redundant. Most bugs are a desync from the what, since you obviously didn't mean to implement the bug. And you didn't mean to use UUID v1 instead of UUID v4 for that session token.
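To make the UUID example concrete, a tiny Python illustration of the difference being pointed at (not the commenter's actual code):

```python
import uuid

# UUID v1 encodes a timestamp plus the host's MAC address, so values are
# predictable and leak information -- not what you intended for a session token.
leaky = uuid.uuid1()

# UUID v4 is built from random bits, which is what an opaque session
# identifier actually calls for.
token = uuid.uuid4()

print(leaky)
print(token)
```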
Now that my workflow is centered around generating plan files instead of doing the implementation myself, the goal is specifying the plan with enough granularity that anyone could implement the intended change when given the plan.
It's data I wish we always had in a software project, but it would involve way too much work and energy to do this level of planning before every change until the LLM age.
Now that implementation is automated, we live at this specification step. And it's really the specification/plan that I want to see first rather than the code.
I think a great way to approach this is to change the PR / code review step to center around revising a markdown plan file until you can get the intent and specification right. That makes a lot more sense than only reviewing the code and then deriving the why and the intended what.
e.g. For my own projects, I'd rather someone submit a plan file as a PR that we can revise together, not code. The code is a derivation of the plan. And when code is derived from a plan, then the plan must contain all of the knowledge that is applied to (or missing from) the code since it's the source of truth.
Compare that to the status quo where you start with a PR code change and then kinda work backwards to synthesize the why and intended what, then see if the real what synchronizes with intention.
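For illustration, a plan file in that kind of workflow doesn't need to be elaborate: a short "Why" section linking the motivating issue, an "Intended change" section listing the files and behaviors to touch with enough granularity that anyone could implement it, and a "Non-goals / risks" section so reviewers can judge the blast radius before any code exists. (The section names are made up; the point is that the plan, not the diff, is what gets reviewed first.)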
If you get them involved in the design process, they feel heard. Feeling heard is one surefire way to have a person feel involved. Feeling involved fosters a sense of ownership and pride which in turn helps keep a person engaged.
Literally just a few minutes ago I had a senior architect send out a clearly Claude-generated .docx project proposal filled to the brim with stuff that is outright not permitted in our organization due to legal constraints and IT policy decisions. This guy has been with the organization for a few years so none of our limitations should have been a surprise. I got to page two before I replied that it was unprofessional to send me AI slop and expect me to be the first to read it.
Some people seem really great with AI as a force multiplier. For others their brain just shuts off entirely. Sorting out how it will affect each person is the real challenge now. We don't even mandate AI usage; it's just there if you want to use it, which makes it even more frustrating. As for my approach? A one-on-one discussion reiterating my expectations that they remain thoughtfully engaged, followed if needed by formal verbal warnings and then PIPs.
I know nothing about AI code generation (or about AI in general), but I wonder if you could include in your prompt a request that the AI describe the reasons for its choices and actually include those reasons as comments in the code.
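One illustrative way to phrase that request in a prompt: "For every non-obvious choice of algorithm, library, or data structure, add a one-line comment explaining why you chose it over the alternative you rejected."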
The model will always be completing the story you start with it. There's no opinion to uncover because there's no experience that occurred. It's impossible to know where your influence ends and the model's factual basis begins.
This is not the case any more. I never thought LLMs write bad solutions, but when you let it think for you, you lose something important: understanding. And when something breaks, some people consider this proof that the LLM is the problem. And in a sense it is. But you are also to blame. And that's exactly the issue with this whole industry: in order to move faster you don't need to "type" faster. A person bashing his hands on the keyboard randomly at top velocity can get a pretty high WPM. In order to move faster you need to _think_ faster, be _snappier_ and _sharper_, and most people aren't.
I like how one of my colleagues phrased it in a company meeting. He asked one of the C-levels: "For LLMs to 10x me, I need to let them make all decisions and dictate intentions. By percentage, how much of the code do you want me to _know_?"
---
Me: I've noticed that Claude is deleting my comments and tends not to insert many of its own. I don't want to go too crazy, but can we modify CLAUDE.md to dial up the comments a little? Like, try not to delete comments that are still relevant, and add a few explanatory comments.
Claude: [... proposal for modifying CLAUDE.md ...]
Claude: Want me to apply this, tweak the wording, or place it differently? Note that your global ~/.claude/CLAUDE.md-style instructions (visible to me as system guidance) currently say "default to writing no comments" — the project-level rule above will override that for this repo, which is what you want.
Me: Where is this rule that says "default to writing no comments"? That seems crazy?
Claude: That text is part of the system prompt I'm given by Claude Code itself — it's not in any file in your project or in your ~/.claude/ config. It's baked into the harness's default instructions to the model.
Claude: Specifically, two lines from my system prompt:
Claude: Under "Doing tasks": "Default to writing no comments. Only add one when the WHY is non-obvious..."
Claude: Under "Text output": "In code: default to writing no comments. Never write multi-paragraph docstrings or multi-line comment blocks — one short line max."
Really? I find that Claude really likes to write "why nots" in comments when iterating on implementations and fixing bugs, to the extent that the comments grow into spot-logs of overly-specific documentation of what was tried and why it was scrapped.
It always wants to include the specific thing it is fixing in the API docs for methods it touches. No, this is not the place for a changelog of why this method is implemented the way it is! Describe the interface and the contract, these details are non sequiturs and they are a distraction.
Writing a skill / set of rules around what makes a good commit message would encourage the LLM to record its reasoning (however much we truly consider it to be "reasoning").
I think Claude is just behaving like all those programmers who make a VERY BIG DEAL about how much they HATE HATE HATE comments that might reiterate what the code does so will go to extreme measures to avoid them.
This is harder when I'm reviewing other peoples' work. It has definitely gotten to the point where I spend more time reviewing than implementing and find it to be the much more difficult part of the job. Which honestly sucks.
After using AI for months (Claude, Gemini, ChatGPT) it is extremely rare for their code to work 'as is' first shot and almost always requires several iterations and cleaning up edge-cases.
When it does work 'first shot' it's usually when it's transferring existing working code to a new project which is slightly different.
I believe that increases the chances of one-shot code working, though it's also possible that it did that against Opus 4.5 and isn't necessary against Opus 4.7 but I haven't spotted the difference yet.
Very simple things like: "Write tests and make sure they pass." "Run lint after each change." "Write API docs in XYZ format."
In my experience, they are very good at fixing things they've done wrong after discovering them during those kinds of steps.
It's unbelievable: it will do the iterations for you, and it will easily work 12 hours straight until the output is good.