The problem isn't just secret sprawl. It's that there's no lineage of access. You don't know which developer launched which agent, what it accessed, or whether it should have been allowed to. The moment you hand raw credentials to a process, you've lost the ability to enforce policy, audit access, or rotate without pain. The credential is the authorization, and that's fundamentally broken when autonomous agents are making hundreds of API calls per session.
Kontext takes a different approach. You declare what credentials a project needs in a .env.kontext file:
GITHUB_TOKEN={{kontext:github}}
STRIPE_KEY={{kontext:stripe}}
LINEAR_TOKEN={{kontext:linear}}
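As a rough illustration of what "declaring" credentials means here, this is a minimal sketch of parsing such a file, assuming only the `{{kontext:<provider>}}` syntax shown above. The function name `parsePlaceholders` and the exact grammar in the regular expression are hypothetical, not taken from the Kontext CLI.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// placeholderRe matches lines such as GITHUB_TOKEN={{kontext:github}}.
// The grammar accepted by the real CLI is an assumption here.
var placeholderRe = regexp.MustCompile(`^([A-Za-z_][A-Za-z0-9_]*)=\{\{kontext:([a-z0-9_-]+)\}\}$`)

// parsePlaceholders maps each environment variable name to the provider
// it references. Blank lines, comments, and ordinary KEY=value lines are skipped.
func parsePlaceholders(content string) map[string]string {
	out := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if m := placeholderRe.FindStringSubmatch(line); m != nil {
			out[m[1]] = m[2]
		}
	}
	return out
}

func main() {
	env := "GITHUB_TOKEN={{kontext:github}}\n" +
		"STRIPE_KEY={{kontext:stripe}}\n" +
		"LINEAR_TOKEN={{kontext:linear}}\n"
	for name, provider := range parsePlaceholders(env) {
		fmt.Printf("%s -> %s\n", name, provider)
	}
}
```

The point of the format is that the file contains no secret material, so it can presumably be committed alongside the project.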
Then run `kontext start --agent claude`. The CLI authenticates you via OIDC, and for each placeholder: if the service supports OAuth, it exchanges the placeholder for a short-lived access token via RFC 8693 token exchange; for static API keys, the backend injects the credential directly into the agent's runtime environment. Either way, secrets exist only in memory during the session and are never written to disk on your machine. Every tool call is streamed for audit as the agent runs.
The closest analogy is a Security Token Service (STS): you authenticate once, and the backend mints short-lived, scoped credentials on the fly. Unlike a classical STS, we hold the upstream secrets, so nothing long-lived ever reaches the agent. The backend holds your OAuth refresh tokens and API keys; the CLI never sees them. It gets back short-lived access tokens scoped to the session.
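For readers unfamiliar with RFC 8693, here is a minimal sketch of the request parameters a token exchange uses. The parameter names and URNs come from the RFC; using the OIDC ID token as the subject token and the provider name as the audience are assumptions for illustration, as is the helper name `buildExchangeForm`. How Kontext actually transports the request is not stated in this post.

```go
package main

import (
	"fmt"
	"net/url"
)

// Identifiers defined by RFC 8693.
const (
	grantTypeTokenExchange = "urn:ietf:params:oauth:grant-type:token-exchange"
	tokenTypeIDToken       = "urn:ietf:params:oauth:token-type:id_token"
	tokenTypeAccessToken   = "urn:ietf:params:oauth:token-type:access_token"
)

// buildExchangeForm returns the form body of an RFC 8693 token exchange request.
// Assumption: the user's OIDC ID token is the subject token and the provider
// name is the audience. This is illustrative, not Kontext's documented behavior.
func buildExchangeForm(idToken, provider string) url.Values {
	form := url.Values{}
	form.Set("grant_type", grantTypeTokenExchange)
	form.Set("subject_token", idToken)
	form.Set("subject_token_type", tokenTypeIDToken)
	form.Set("requested_token_type", tokenTypeAccessToken)
	form.Set("audience", provider)
	return form
}

func main() {
	// "example-id-token" stands in for a real OIDC ID token.
	form := buildExchangeForm("example-id-token", "github")
	// In a plain OAuth deployment this body would be POSTed as
	// application/x-www-form-urlencoded to the token endpoint.
	fmt.Println(form.Encode())
}
```

Per the RFC, a successful response carries `access_token`, `issued_token_type`, and `token_type`, and usually `expires_in`, which is what makes the short-lived, session-scoped model possible.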
What the CLI captures for every tool call: what the agent tried to do, what happened, whether it was allowed, and who did it — attributed to a user, session, and org.
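To make those four dimensions concrete, an audit record might look roughly like the following. This is a hypothetical shape: the type `ToolCallEvent`, its field names, and the JSON encoding are all assumptions, not Kontext's actual wire format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// ToolCallEvent is a hypothetical audit record for one tool call.
// Field names are illustrative only.
type ToolCallEvent struct {
	Time      time.Time `json:"time"`
	OrgID     string    `json:"org_id"`     // who: organization
	UserID    string    `json:"user_id"`    // who: developer who launched the agent
	SessionID string    `json:"session_id"` // who: agent session
	Tool      string    `json:"tool"`       // what the agent tried to do
	Input     string    `json:"input"`      // arguments of the attempt
	Outcome   string    `json:"outcome"`    // what happened
	Allowed   bool      `json:"allowed"`    // whether it was allowed
}

// encodeEvent serializes an event for streaming to a backend.
func encodeEvent(e ToolCallEvent) ([]byte, error) {
	return json.Marshal(e)
}

// decodeEvent parses an event on the receiving side.
func decodeEvent(b []byte) (ToolCallEvent, error) {
	var e ToolCallEvent
	err := json.Unmarshal(b, &e)
	return e, err
}

func main() {
	b, err := encodeEvent(ToolCallEvent{
		Time:      time.Now().UTC(),
		OrgID:     "org-1",
		UserID:    "alice",
		SessionID: "sess-1",
		Tool:      "Bash",
		Input:     "git status",
		Outcome:   "exit 0",
		Allowed:   true,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```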
Install with one command: `brew install kontext-dev/tap/kontext`
The CLI is written in Go (~5ms hook overhead per tool call), uses ConnectRPC for backend communication, and stores auth in the system keyring. It works with Claude Code today; Codex support is coming soon.
We're working on server-side policy enforcement next — the infrastructure for allow/deny decisions on every tool call is already wired, we just need to close the loop so tool calls can also be rejected.
We'd love feedback on the approach. Especially curious: how are teams handling credential management for AI agents today? Are you just pasting env vars into the agent chat, or have you found something better?
GitHub: https://github.com/kontext-dev/kontext-cli Site: https://kontext.security

Discussion (24 Comments)
> Kontext holds secrets server-side and mints short-lived tokens per session.
That probably makes this thing DOA for most people (certainly for me and everyone I know).
We'll think about how best to accommodate full self-hosting in the future!
What prevents the agent from persisting or leaking the API key, or reading it from the environment?
We need this also for normal usage like development environments. Or when invoking a command on a remote server.
Are you going to add support for services that don't support OIDC, or is this going to be a known limitation?
[1]: https://github.com/onecli/onecli
Workflow: OneCLI runs as a self-hosted Docker gateway: you route agent traffic through localhost:10255. Kontext doesn't change how you use Claude Code at all; you just run `kontext start --agent claude`.
Visibility layer: OneCLI intercepts outbound HTTP requests. Kontext hooks into Claude's PreToolUse/PostToolUse events, so you see bash commands, file ops, and API calls and not just network traffic.
Trust model tradeoff worth naming: OneCLI is fully self-hosted. Kontext holds secrets server-side and mints short-lived tokens per session. We do this via token exchange (RFC 8693) and build natively on OAuth so that only short-lived tokens are handed over. You don't need to capture refresh tokens for external tool calls at all.
So if Claude Code invokes Bash and runs curl ..., we see that tool invocation. If it invokes Bash and runs python script.py, and that script makes HTTP requests internally, we still see the Bash invocation.
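A minimal sketch of what a tool-level hook can see, assuming the JSON payload that Claude Code passes to hooks carries `session_id`, `hook_event_name`, `tool_name`, and `tool_input` (with a `command` field for Bash). The helper `describe` is hypothetical, and the field names should be checked against the current Claude Code hooks documentation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hookInput models the subset of the hook payload used here.
// Field names are assumed from Claude Code's documented hook input.
type hookInput struct {
	SessionID string          `json:"session_id"`
	Event     string          `json:"hook_event_name"`
	ToolName  string          `json:"tool_name"`
	ToolInput json.RawMessage `json:"tool_input"`
}

// bashInput is the assumed shape of tool_input for the Bash tool.
type bashInput struct {
	Command string `json:"command"`
}

// describe returns a one-line summary of a hook payload.
// For Bash it includes the command string, which is all the hook observes:
// anything the command does internally is not part of the payload.
func describe(payload []byte) (string, error) {
	var in hookInput
	if err := json.Unmarshal(payload, &in); err != nil {
		return "", err
	}
	if in.ToolName == "Bash" {
		var b bashInput
		if err := json.Unmarshal(in.ToolInput, &b); err != nil {
			return "", err
		}
		return fmt.Sprintf("%s %s: %s", in.Event, in.ToolName, b.Command), nil
	}
	return fmt.Sprintf("%s %s", in.Event, in.ToolName), nil
}

func main() {
	payload := []byte(`{"session_id":"s1","hook_event_name":"PreToolUse","tool_name":"Bash","tool_input":{"command":"python script.py"}}`)
	summary, err := describe(payload)
	if err != nil {
		panic(err)
	}
	fmt.Println(summary) // prints "PreToolUse Bash: python script.py"
}
```

The summary contains the `python script.py` invocation but nothing about HTTP requests the script makes, which is the visibility boundary described above.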
[1] https://tailscale.com/blog/aperture-self-serve
Aperture solves “make multiple coding agents talk to the right LLM backend through an Aperture proxy.” We solve “launch a governed agent session with identity, short-lived third-party credentials, and tool-level auditability.” They overlap at the launcher layer, but the security goals are different.
I was actually just about to get started writing this but in Rust....
Nice work