Discussion (58 Comments)
The result might be more faulty code getting merged, but if you already have outages and can't review every PR, is there currently a meaningful benefit to the PR workflow?
We know some people are using LLMs to evaluate PRs; the only question is who, and how strong the incentive is for them to give up.
LLM-generated code feels the same. Reviewing LLM-generated code in the context of a monolith is more taxing than reviewing it in the context of a microservice; the blast radius is larger and the risk is greater. With microservices you can make decisions about how important each service actually is for system-wide stability: you can effectively not care about some services, and go back and iterate on or rewrite them several times over. But more importantly, the organizational structures needed to support microservice-like architectures effectively also feel like the organizational structures needed to support LLM-generated codebases effectively: more silo-ing, more ownership, more contract- and spec-based communication between teams, etc. Teams might become one person and an agent in that org structure. But communication and responsibilities feel like they require something similar to what is needed to support microservices... just that the services are probably closer in size to what many companies end up building when they try to build microservices.
And then there are majestic monoliths: very well curated monoliths that feel like a monorepo of services with clear design and architecture. If they've been well managed, these are also likely to work well for agents, but they still suffer the same cognitive overhead when reviewing their work, because organizationally the people working on or reviewing code for these projects are often still responsible for more than just a narrow slice, with a lot of overlap with other devs, requiring more eyes and buy-in for each change as a result.
The organizational structures we have in place today might be forced to adapt over time, to silo in ways where ownership and responsibility narrow to fit what we can juggle mentally. Or they'll be forced to slow down and accept the limitations of the organizational structure. Personal projects have been the area where people have had a lot of success with LLMs, and that feels closer to small siloed teams. Open-source collaboration with LLM PRs feels like it falls apart for the same cognitive-overhead reasons as existing team structures that adopt AI.
My current workplace is going through a major "realignment" exercise to replace as many testers with agents as humanly possible, which is proving to be a challenge when the existing process is not well documented.
Edit: to clarify, I know these models have gotten significantly better. The output is pretty incredible sometimes, but trusting it end to end like that just seems super risky still.
LLMs can't be responsible for deciding what code you use because they have no skin in the game. They don't even have skin.
If you type fast, well then it takes just as long to code it yourself as to review it. Plus you actually get flow time when you're coding.
For heaven's sake, people: have the robot write your unit tests and dashboards, not your production code. Otherwise delete yourself.
It's starting to just feel a little like an excuse to call everyone on deck for "a few weeks trying 9-9-6". But even then the lack of traction isn't between the eyeballs and the deployment. You'll still be spinning wheels in that slippery stuff between what a customer is thinking and what the iron they bought is doing.
There’s no way velocity will decrease now that upper management is obsessed with AI.
I highly doubt there are any managers or executives who care precisely how AI is used as long as there are positive results. I would argue that this is indeed an engineering problem, not an upper-management one.
What's missing is a realistic discussion about this problem online. We instead see insanely reckless people bragging about how fast they drove their pile of shit startup directly into the ground, or people in denial loudly banging drums to resist all forms of AI.
You don’t even have to code the linters yourself. The agent can write a Python script that walks the AST of the code, or uses regexes, or tries to run or compile it. Give it a non-zero exit code and a line number, and the agent will fix the problem, rerun the linter, and loop until it passes.
Lint your architecture: block any commit which directly imports the database from a route handler (a sketch follows below). Whatever the coding agent thinks, ask it for recommendations on an approach!
Get out of the business of low-level code review. That stuff is automatable and codifiable, and it's not where you are best poised to add value, dear human.
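For concreteness, here's a minimal sketch of the kind of architecture linter described above, assuming route handlers live under a routes/ directory and the database module is named db (both names are illustrative, not from the thread). It walks the AST, prints file:line for every direct database import, and exits non-zero so an agent can loop on it:

```python
import ast
import pathlib
import sys

FORBIDDEN = "db"       # module that route handlers must not import directly
ROUTES_DIR = "routes"  # where the route handlers live

violations = []
for path in pathlib.Path(ROUTES_DIR).rglob("*.py"):
    tree = ast.parse(path.read_text(), filename=str(path))
    for node in ast.walk(tree):
        # Catch both `import db` and `from db.session import Session`.
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        if any(n == FORBIDDEN or n.startswith(FORBIDDEN + ".") for n in names):
            violations.append(
                f"{path}:{node.lineno}: direct import of {FORBIDDEN!r} from a route handler"
            )

for v in violations:
    print(v)
sys.exit(1 if violations else 0)  # non-zero exit + line numbers, as described
```

Wire it into a pre-commit hook or CI step and the agent gets exactly the feedback loop described: a failing exit code and the line numbers to fix.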
I do see “task expansion” happening often though. If I can do the full feature rather than doing baby steps I’ll often do that now, because wrangling code is easier.
(decimate had specific literal intent. Now it's just a force modifier like bigly)
It feels euphemistic compared to the original "colloquial" usage I have for it.
> The killing of one in ten, chosen by lots, from a rebellious city or a mutinous army was a punishment sometimes used by the Romans. The word has been used (loosely and unetymologically, to the irritation of pedants) since 1660s for "destroy a large but indefinite number of." [0]
[0] https://www.etymonline.com/word/decimate
Working more as a pair, or essentially doing code review as you go, in small chunks, is significantly better.
I personally don't have the setup or the tokens to spend to say "go build this entire thing" and then review 15k LOC. I also find even Opus is poor at coming up with tests that justify the business logic it's meant to be implementing.
Using Codex 5.2
We tend to use Opus 4.6 High and GPT 5.4 High.
There’s no secret to how people are getting “10x”, or at least claiming to; they’re just working more.
In my scientific computing environment, the majority of my vibe coded output goes to one-off scripts, stuff that is not worth committing (correcting outputs, one-off visualizations, consistency checks), and anything worth committing gets further refined to an extent that it pretty much can't be considered vibe coded anymore. It's simply too risky, any bugs would propagate down to decision making for designing new, expensive instruments.
I imagine that the cost and trust risks in enterprise environments are similar, so this seems very reckless.
AI agents have upped my productivity, but that's specifically because I can focus on the science and delegate the auxiliary things to AI. I also believe I get this productivity out of them because my supervisor really drove home how hard I need to go on consistency checks, and because of years of having my visualizations nitpicked (so I am able to do the same to AI and recognize when results are suspicious).
You then continue to vibe code as instructed by management. No burnout because you are not responsible anymore.
IMHO you just need two stacks: systems where you can play fast and loose and 10x your output, and systems where quality matters, where you can perhaps get 1.5x or 2x. That is still a lot of output.
Just the past weekend, I was talking with a very senior engineer (~distinguished engineer at a very large tech co) who basically said he's working 8-8-6 (8 am - 8 pm, 6 days/week), "writing code" (more like supervising 8-15 agents) for a product demo in 2 weeks, which otherwise would have taken at least 1 quarter's worth of time with a small team. He's zonked out, fwiw. There are no junior engineers in the team ¯\_(ツ)_/¯, most having been laid off a few months ago.
The toll it takes, and the expectations of AI-driven productivity, have only increased dramatically. At some point, the reality will hit the remaining engineering team. Not sure if the company or its leadership realizes it, but so far, it's all-AI, all-the-time, human cost of productivity be damned.
How do they do it? (My own record is 5 agents, but it is not typical). Do they use gastown or something?
Gotta have really good test harnesses so they can largely fix themselves.
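One way to picture that harness: a loop that keeps rerunning the suite and hands failures back to the agent until everything is green. A rough sketch; ask_agent_to_fix is a hypothetical placeholder for whatever agent you drive, and the pytest flags are just one sensible choice:

```python
import subprocess

MAX_ROUNDS = 5

def ask_agent_to_fix(failure_output: str) -> None:
    """Hypothetical hook: forward failing-test output to your coding agent."""
    raise NotImplementedError("wire this up to your agent of choice")

for _ in range(MAX_ROUNDS):
    result = subprocess.run(
        ["pytest", "-x", "--tb=short"],  # stop at first failure, short tracebacks
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        print("suite green; the agent's change stands")
        break
    ask_agent_to_fix(result.stdout + result.stderr)
else:
    print(f"still failing after {MAX_ROUNDS} rounds; escalate to a human")
```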
He's one (small) step from distinguished engineer, with 20+ patents to his name, and is an embedded programmer (largely C/C++) with 30+ years of experience in the field; I've known him for nearly as long, so I put a lot of credence in his words.
But we don't usually talk work; he's the guitarist in our band :) [I'm the bass] So we mainly chill over music + beer. And lately, it's been less chill ¯\_(ツ)_/¯
The question is whether you can tolerate the number of PRs thrown at you per day, on top of reviewing the exponentially growing mess of code that continues to double every hour, while being paid less for it.
Just learn to say no and leave. Why do you tolerate the increasing comprehension debt that is loaded onto you?
You will never get that time back. Just give it to someone else who thinks it is worth maintaining that slop for less.
Not everybody pushes themselves like that, nor should they; it's anything but healthy and sustainable. In my experience it takes... rather obsessed people, OCD or similar traits, maybe 2-out-of-10 intensity of their disease. Highly functional, smart, yet unbalanced.
LLMs just allow this spiral to go further, while human limits remain the same. Each of us creates our own path; don't mess it up just because you can. Your employer doesn't care much about you in the end, you're just another cog in the machine, but health, once damaged, may not bounce back, ever.
70%+ saying that AI has increased their workload AND that they are burning out because of it.
When your manager and your company regulate your pace for you with the understood threat that not using AI will risk your job, you don't really have much of an option.