Last year, we set out to answer the question “If AI can write code 100x faster, then why aren’t you shipping 100x faster?” What we learned shocked us: even fairly nontechnical people and solo founders told us they were spending more than half of their development time reading the AI-written code. And much of the rest of the time was spent either de-slopping it, or wishing they had done so.
As luck would have it, our last two products were a tool that quickly onboards people to large codebases (https://x.com/0xjimmyk/status/1873357324229984677) and trainings that taught deep concepts of code quality to CEOs, YC founders, and engineers at top companies (mirdin.com), so we were extremely well-positioned to solve these problems.
Command Center is an agentic coding environment focused on quality. With a few keypresses, you can start building 3 features at once and soon have 3 diffs ready, each consisting of 2000 changed lines across 50 files….
This is normally the point where you think “Crap, what now?”
With Command Center, at this point you simply click “Refactor,” and watch the vibed slop turn into readable robustness. Then you click “Generate Walkthrough,” and suddenly, to read a 2000-line diff, instead of scrolling up and down trying to make sense of it, you just press the right arrow key 200 times. See something you don’t like? Click on line 37, type “Do this and all other network fetches in the background,” press Cmd+Enter, and you have a few more agents getting your code into final shape. Click or type “Commit,” “Push,” “Create PR,” and you have just shipped a high-quality, non-slop feature.
We’re striving to be the best at every step of the pipeline, but you can just try Command Center in pieces wherever you feel your current workflow is weakest. We have users who do all their coding in Zed or the Codex app, and then jump over to Command Center for a walkthrough when it finishes running. There’s even a skill that will pop open a Command Center walkthrough from the environment of your choice. Or you can just keep Command Center running while you do your work elsewhere, and if your AI deletes anything, you have Command Center’s snapshots to the rescue.
We launched quietly last year and have been refining since. The quality and usability have kept going up, and Command Center is now ready for a lot more attention.
Since our quiet launch, we’ve seen at least a dozen other agentic coding environments appear. Approximately all of them have the same feature set, focused on the part that is already easy (generating the first version of the code), with at best a shoddy answer to the hard part (everything that comes after). Command Center’s focus is making the hard parts easy.
Here’s what our users have to say:
“[The refactorings] give your LLM taste. I’ve never seen an LLM write code this good before.” — Doug Slater, Staff Engineer, Climavision
“With Command Center walkthroughs, I can get through a 400-line diff in less than half the time.” — Prateek Kumar, Platform Engineer, Sumo Logic
This product is not for everyone. If you’re someone who preaches “the prompt is the source, the code is the compiler output,” then you probably won’t enjoy Command Center.
But if you want to uphold traditional engineering discipline while also shipping 20 PRs a day, then this is the environment for you.

Discussion (29 Comments)
On the one hand, it is true that the website code was pushed 0 minutes before this announcement went up.
On the other hand, I tested just now on two different phones and didn't see any issues. Can you say in more detail what you expected vs. what actually happened?
There was an occlusion issue on some smaller screens, but it's been fixed now.
I guess that’s OK, but I was skateboarding at 19.
Can you even kick flip?
It's extremely hard to convince myself to use a product for the huge variety of often sensitive agent tasks when it's not open source. I understand the business reasons for that, but it's unusual in this space at the moment.
Instead: Can you post any independent security assessments perhaps? Fundamental things like SOC2?
The basic answer is that it runs locally. If you turn telemetry off and don't use our free Gemini credits, it's trivial to verify that no traffic goes to our servers other than a tiny subscription check. For our enterprise customers, we offer a version that doesn't even do that. Everything stays between you and your model providers (and we support custom and local models).
SOC2 is still a work in progress. I'm a former security researcher with work featured in the New York Times, and I know that doing it right (and not going through Delve) takes time. I can tell you that we have passed a compliance check for a company in a highly-regulated space.
I didn't find your contact info, but I'm available at jimmy@cc.dev, and happy to discuss your needs.
Doug has not signed up for it.
I can tell you that we do extensive testing, and that we figured out how to objectively measure code quality on certain benchmark problems. Empirically, it's extremely helpful nearly all the time.
But in the general case: it is not actually possible to guarantee this.
That's because whether a change improves the code often depends on information which is literally not present in the codebase.
Some of these are more trite. E.g.: whether a comment is helpful or redundant slop depends on the audience.
Some are deeper. E.g.: whether a piece of duplication is good or bad depends on the intent, and that is often impossible to recover from the source. https://www.pathsensitive.com/2018/01/the-design-of-software...
A simpler example: There's a function that's never called. Should it be deleted?
There are a number of factors outside the codebase that determine the answer, including the obvious one: "Not if your next prompt is going to start using it."
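To make the "function that's never called" example concrete, here is a minimal sketch using Python's standard `ast` module. The helper name `uncalled_functions` and the `SAMPLE` source are made up for illustration and are not anything Command Center does internally. The sketch shows that finding unreferenced functions is mechanical, while deciding whether to delete them is not:

```python
import ast


def uncalled_functions(source: str) -> list[str]:
    """Return names of functions defined in `source` that are never referenced by name.

    Hypothetical helper for illustration only. It is deliberately naive:
    it ignores dynamic dispatch, imports from other modules, and entry points.
    """
    tree = ast.parse(source)

    # Every function definition in the module, at any nesting depth.
    defined = {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }

    # Any bare name or attribute access counts as a possible use.
    referenced = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            referenced.add(node.id)
        elif isinstance(node, ast.Attribute):
            referenced.add(node.attr)

    return sorted(defined - referenced)


# Hypothetical sample module: `retry` is unused, and `main` is an entry point.
SAMPLE = '''
def fetch(url):
    return url

def retry(fn):
    return fn()

def main():
    return fetch("https://example.com")
'''

if __name__ == "__main__":
    print(uncalled_functions(SAMPLE))  # ['main', 'retry']
```

The analysis flags both `main` and `retry`. Nothing in the source says that `main` is an entry point that must stay, or that `retry` is about to be used by the next prompt. That missing information is the point.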
It's pretty expensive to measure even for small programs. It's also more of a relative than an absolute measure, i.e.: it scores two variants of the same codebase, but the raw scores aren't very meaningful on their own. So our goal had been to use this in the benchmark set we're working on when we release a standalone refactoring product.
But the more I think about this suggestion, the more I think: "Hmmm, why not?"
It seems like an interesting tool, curious about trying it out once it's been out for a while. But who in holy hell, with AI assistance or not, could possibly "ship" (merged?) 20 PRs a day and still know what they're doing?
You talk a lot about quality and making sure to avoid slop, but there is no way in heaven you can ship 20 PRs and still ship quality design/architecture/code and avoiding slop.
I'd be curious to see some of those PRs if you're saying you've essentially solved the holy paradox of "ship fast = shit code" or "ship slow = good code".
Three things made that possible.
The first, obviously, is having Command Center.
The second is that a lot of those were fixes or UX improvements under 100 lines.
The third is, no joke, not sleeping. I've had quite a few 20+ hour days in the last 6 months. Some of that is work pressure, but also I've considered getting evaluated for a broken circadian rhythm.
> I'd be curious to see some of those PRs if you're saying you've essentially solved the holy paradox of "ship fast = shit code" or "ship slow = good code".
If you're serious, I'll be happy to get on a call and show you.
> but also I've considered getting evaluated for a broken circadian rhythm.
Heh, personally I fixed this by just adopting the sleep cycle my body wants: going to bed at 04:00/05:00 and getting up at 11:00/12:00. Life is much better now that I just accept it. One approach if your life can allow it :)
> If you're serious, I'll be happy to get on a call and show you.
Very much so, obviously prefer something async if possible, just a .patch file could suffice I suppose, but could do a call to have a look if that's the only way :) Reach out to my email from my profile and we can coordinate :)
That was my life in my mid-late 20's.
But as I've gotten older, my sleep schedule has only gotten more messed up. Now I consider it a victory if I manage to go to sleep before the dawn.
> Very much so, obviously prefer something async if possible, just a .patch file could suffice I suppose, but could do a call to have a look if that's the only way :) Reach out to my email from my profile and we can coordinate :)
Cool, let's chat async then. Contacting you now.
The most difficult code in the 1.0 release is some gymnastics to avoid the appearance of a concurrency conflict with a user running their own jj commands, made at the request of the person who introduced me to jj.
The final moments before this launch announcement consisted of me twiddling my thumbs while waiting for our designer to upload any version he could get ready in time that was better than the previous version of our website. So we knew we'd be launching with a lot of imperfections in the visuals. We did test on mobile, but not on iPad.
But yes, that is indeed what happened. Multiple times, I'd talk to someone that I'd expect to not be reading the code at all (solo founder, mostly nontechnical), then I'd interview him in detail about his workflow and think "Huh, there was absolutely no point in there where he was reading stuff," and then I'd ask "So how much of your time is reading code?" "60, maybe 70%"