Earlier this year we started seeing agent traffic in our logs and it looked like coding agents were calling our CLI. But that CLI wasn't designed with coding agents in mind. We went down a philosophical rabbit hole to see if a CLI is even needed anymore given that Claude, Copilot et al. already follow best practices. Ultimately we decided to create a new CLI from the ground up with coding agents in mind for two reasons:
1. We optimized the CLI for agent callers and cut Claude's output token usage by up to 79% and API cost by up to 67% versus a bare-Claude baseline. We wrote a blog post documenting our lessons on optimizing user token usage when designing a CLI, e.g. using predicate flags so the agent doesn't compose jq | python | wc pipelines, and using an output format that strips JSON's redundant field names. The blog is here: https://www.infracost.io/resources/blog/we-cut-claude-s-toke...
2. With cloud costs, precision matters. Telling a coding agent "make this Terraform cost-optimized" can be expensive and lossy. You burn tokens loading code and policy context into every conversation. Your agent could make up a price and you wouldn't know because it's difficult to verify that across the ~10M price points that AWS, Azure and Google have. The CLI runs static analysis on the code, uses the latest prices from cloud vendors, and passes that context to the coding agent.
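As a rough illustration of the output-format point in (1), here is a minimal sketch and not the CLI's actual format. The records, field names, cost figures and the `filter_rows` helper are all made up for the example. It shows why emitting field names once, instead of once per record as JSON does, shrinks the output, and how a filter applied inside the tool replaces a pipeline the agent would otherwise have to compose. Character count is only a loose proxy for tokens.

```python
import json

# Hypothetical records; the costs are placeholders, not real prices.
resources = [
    {"name": "aws_instance.web", "type": "aws_instance", "monthly_cost": 120.0},
    {"name": "aws_db_instance.main", "type": "aws_db_instance", "monthly_cost": 300.0},
    {"name": "aws_s3_bucket.logs", "type": "aws_s3_bucket", "monthly_cost": 5.0},
]


def to_json(rows):
    """Standard JSON: every record repeats every field name."""
    return json.dumps(rows)


def to_table(rows):
    """Emit the field names once as a header, then one line of values per record."""
    fields = list(rows[0])
    lines = ["\t".join(fields)]
    for row in rows:
        lines.append("\t".join(str(row[f]) for f in fields))
    return "\n".join(lines)


def filter_rows(rows, min_monthly_cost):
    """Stand-in for a predicate flag: the tool filters, so the caller needs no pipeline."""
    return [r for r in rows if r["monthly_cost"] >= min_monthly_cost]


if __name__ == "__main__":
    print(len(to_json(resources)), "chars as JSON")
    print(len(to_table(resources)), "chars as header + rows")
    print(to_table(filter_rows(resources, 100.0)))
```

The saving grows with the number of records, since the header cost is paid once while JSON pays it on every row.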
So that's what we're launching today - Cost.dev: https://cost.dev/.
- It runs locally. Your code never leaves your machine, you get a fast feedback loop, and you're not burning API calls per character when you want to fetch prices.
- The CLI does the deterministic work: fetching price points, scanning the code, validating fixes. The coding agent does the natural-language part. You don't have to trust the LLM to remember the rules, and you can verify that it called the right CLI command.
- It provides a consistent rule layer across every tool you use. Get cost estimates in your IDE and your coding agent with a single install. We support Claude Code, GitHub Copilot, Cursor, Windsurf, OpenAI Codex, Gemini CLI, as well as IDEs like VS Code and JetBrains.
Before we keep building more in that direction, I want to sanity-check with HN: is "agents writing IaC in prod" actually a thing yet, or am I betting on a future that's still a year out? I know software developers are using coding agents heavily, but are platform/infra folks doing that for prod too? Also, if you have any feedback on Cost.dev, I'd love to hear it!

Discussion (23 Comments)
I can see why YC is interested in this issue, as I'm sure lots of startups are trying to stretch that runway.
Each of them is making a lot of decisions on the infra, and that, combined with the crazy pricing models from the cloud providers, was saving companies a lot of money.
Then, we saw how much time is saved when you catch it at this point vs after the fact. Basically avoiding a bunch of tech debt.
What can definitely happen though is you get one that is inappropriate in a given context. An example here might be a recommendation from an m5.2xlarge to an m6g.2xlarge instance. Same vCPUs and memory, lower cost, but... also a switch from Intel -> ARM architectures. For a lot of companies their build pipelines make it easy enough to make that change. For others there may be some specific dependency on Intel for that workload which means changing the architecture isn't viable. In that case you can simply dismiss the recommendation and we'll stop suggesting it.
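The dismiss flow described here could be modelled roughly as follows. This is a sketch under assumptions: the `Recommendation` type, the `active` helper, the resource name and the architecture labels are hypothetical and say nothing about how Cost.dev stores recommendations internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    resource: str        # hypothetical Terraform address
    current: str         # current instance type
    suggested: str       # cheaper instance type being proposed
    current_arch: str
    suggested_arch: str

    @property
    def changes_architecture(self) -> bool:
        """True when accepting the suggestion also switches CPU architecture."""
        return self.current_arch != self.suggested_arch


def active(recs, dismissed):
    """Return only the recommendations the user has not dismissed."""
    return [r for r in recs if (r.resource, r.suggested) not in dismissed]


rec = Recommendation("aws_instance.api", "m5.2xlarge", "m6g.2xlarge", "x86_64", "arm64")
dismissed = set()

# The suggestion is surfaced, flagged as an architecture change.
assert active([rec], dismissed) == [rec] and rec.changes_architecture

# Once dismissed, it is no longer suggested.
dismissed.add((rec.resource, rec.suggested))
assert active([rec], dismissed) == []
```

Keying the dismissal on the resource and the suggested type, rather than on the resource alone, means a different suggestion for the same resource would still be shown.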
We've found even more improvements since that post so those will be shipping soon too.
I don't know if we'll keep dissecting every incremental improvement we make, as (so far) the general approach is the same as documented in the existing blog post: document common use cases -> benchmark them -> identify bottlenecks/expensive hot spots -> fix them -> repeat.
The main thing changing right now is that we're observing new, more frequent use cases (either because we're adding new capabilities, or users are doing things we didn't entirely predict) and adding them to the test cases.