Discussion (38 Comments)
Manus rebuilt its harness five times in six months. The model stayed the same, but the architecture changed five times.
LangChain re-architected Deep Research four times in one year.
Anthropic also ripped out Claude Code’s agent harness whenever the model improved.
Ever since Mitchell Hashimoto mentioned the harness in February, people have been trying to claim the concept. Eventually, someone will probably sell a book called Harness Engineering. I will buy it, of course. Then I will write a blog post about it that nobody reads, with a link that gets buried under showdead as soon as I submit it to HN.
And by that point, IT companies will start asking:
“You’re a new grad, right? You know harness engineering, don’t you?”
Having said that, some components need to live outside the sandbox (otherwise, who creates the sandbox?). Longer term, I see this as a dedicated security layer, not part of the harness. It has probably yet to fully emerge, but I picture it as a hypervisor-type layer that sits outside everything, authorises access based on context, the human user, and so on, and can apply policy, including mediating human intervention at decision points when needed.
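A minimal sketch of what such a layer might look like, assuming a simple request/decision model (the names and policy rules here are hypothetical, just to make the idea concrete):

    from dataclasses import dataclass

    @dataclass
    class ToolRequest:
        agent_id: str
        tool: str    # e.g. "bash", "http", "fs_write"
        target: str  # command host, URL host, or path
        user: str    # the human on whose behalf the agent acts

    # Hypothetical policy table; a real layer would evaluate richer context.
    ALLOWED = {
        ("bash", "dev-sandbox"),   # shell only inside the sandbox
        ("http", "api.internal"),  # network only to an allowlisted host
    }

    def authorize(req: ToolRequest) -> str:
        """Return 'allow', 'deny', or 'escalate' (ask the human)."""
        if (req.tool, req.target) in ALLOWED:
            return "allow"
        if req.tool == "fs_write":
            return "escalate"  # decision point: mediate human intervention
        return "deny"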
A lot of this post presents false dichotomies. It assumes the existence of a sandbox that is by definition ephemeral or "cattle-like". Why? There are reasons to do that and reasons not to do that. You can have a durable computer with a network identity and full connectivity, and you can have that computer spin down and stop billing when not in use.
There are a zillion different shapes for addressing these problems, and I'm twitchy because I think people are super path-dependent right now, and it's causing them to miss a lot of valuable options.
[1]: https://fly.io/blog/tokenized-tokens/ (I work at Fly.io but the thing this post talks about is open source).
I've heard many claims that because LLMs are tuned to specific harnesses, we should expect worse performance with novel architectures. That seems to make people reluctant to put effort into inventing them.
I don’t get it. Calling an API requires a sandbox in most cases. The others could be abused in service of an un-sandboxed agent with API access.
If the harness is outside the sandbox then it’s just an ambiguous and confusing security model and boundary.
I'm not following why this would be the case. The purpose of calling the API is to get data or effect a state transition on some remote service, but I don't follow why the originating machine matters.
Or is your objection about auth?
I think the confusion is that “agent” is used for two very different things:
- building an agent
- an “agent” product/runtime (Claude Code, etc)
In the first case, the model never executes anything. It just outputs something like “call this API”. Your code is the one doing it, with whatever validation you want. There’s no need for a sandbox there because there’s no arbitrary execution.
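Concretely, that first case might look something like this minimal sketch (the registry, message schema, and function names are invented for illustration):

    import json

    # Hypothetical function registry: the only things the model may ask for.
    HANDLERS = {
        "get_weather": lambda args: {"temp_c": 21, "city": args["city"]},
    }

    def handle_model_output(raw: str):
        """The model only *emits text* like '{"call": "get_weather", ...}'.
        Nothing executes until our code validates and dispatches it."""
        msg = json.loads(raw)
        name, args = msg["call"], msg.get("args", {})
        if name not in HANDLERS:
            raise ValueError(f"model requested unknown function: {name}")
        return HANDLERS[name](args)  # our code runs it, with our validation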
On exe.dev the agent (Shelley) runs in a Linux VM, which is the security boundary. All the conversations are saved to a sqlite database, and it knows how to read it, so you can refer to a previous conversation in the database. It's also handy for asking the AI to do random sysadmin stuff, since it can use sudo.
A downside is that there's nowhere in the VM where secrets are safe from possibly getting exfiltrated via an injection attack. But they have "integrations" where you can put secrets into an http proxy server instead of having them locally.
Also, you don't need to use AI at all. You can use the VM as a VM.
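The proxy pattern can be sketched roughly like this; it's a generic illustration of the idea, not exe.dev's actual integration code:

    import os
    import urllib.request

    PLACEHOLDER = "SECRET:github"  # all the VM ever holds
    # The real token lives only on the proxy host, never inside the VM.
    REAL_TOKENS = {PLACEHOLDER: os.environ.get("GITHUB_TOKEN", "")}

    def forward(url: str, auth_value: str) -> bytes:
        """Runs on the proxy, outside the VM: swap the placeholder for
        the real token, then forward the request upstream."""
        token = REAL_TOKENS.get(auth_value, auth_value)
        req = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {token}"})
        with urllib.request.urlopen(req) as resp:
            return resp.read()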
Arguably this is a feature not a bug. Conflict resolution forces the need for a process to come to agreement on a common source of truth - one of the reasons why most Git repos don’t allow users to push to main directly. Writing directly to a shared memory database seems like it would result in chaos and a host of side effects once the number of users scales.
- What remains unsolved is what an agent should reasonably have access to, in what context, and for how long (etc.).
- Probabilistic code that can run far faster than human-driven code: we don't have a great model for that yet. We should all spend our energy there…
- Separating / putting controls on the FS resource is no different than putting the agent behind a firewall / allow-deny list.
That doesn’t invalidate running a sandbox inside a sandbox for better security.
But shouldn't there really be another sandbox where the agentic tool calls execute? This is to contain the damage of the tool execution when it goes wrong.
And, the agent harness itself should either implement or be contained in a third sandbox, which should contain the damage of the agent. There should be a firewall layer to limit what tool requests the agent can even make. This is to contain the damage of the agent when it formulates inappropriate requests.
The agent also should not possess credentials, so it cannot leak them to the LLM and allow them to be transformed into other content that might leak out via covert channels.
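A rough sketch of those last two layers, assuming a simple tool allowlist and example credential patterns (none of this is from a real harness):

    import re

    SAFE_TOOLS = {"read_file", "run_tests"}  # hypothetical allowlist
    # Redact anything credential-shaped (example patterns only).
    TOKEN_RE = re.compile(r"ghp_[A-Za-z0-9]{36}|AKIA[0-9A-Z]{16}")

    def firewall(tool: str, payload: str) -> str:
        """Sits between the agent and the tool layer: deny requests
        outside the allowlist before they can do any damage."""
        if tool not in SAFE_TOOLS:
            raise PermissionError(f"tool {tool!r} blocked by policy")
        return payload

    def scrub(tool_output: str) -> str:
        """Applied to results before they re-enter the LLM context, so
        credentials never reach the model in the first place."""
        return TOKEN_RE.sub("[REDACTED]", tool_output)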
At the end of the day, it’s a “simple” loop that calls an external API (LLM) and receives requests to execute stuff on its behalf.
It’s not the agent running bash commands: you (the harness author) are, and you’re in full control of where and how those commands get executed.
In the article’s case, bash commands are forwarded to a sandbox, nothing ever runs on the harness itself (it physically can’t, local execution is not even implemented in the harness).
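As a sketch, that loop looks roughly like the following, where llm.complete and sandbox.exec stand in for whatever API and remote-execution calls a real harness makes:

    def agent_loop(llm, sandbox, task: str) -> str:
        """The 'simple' loop: call the LLM, execute what it asks for,
        feed results back, repeat until it stops asking."""
        messages = [{"role": "user", "content": task}]
        while True:
            reply = llm.complete(messages)  # call the external API
            if reply.tool_call is None:
                return reply.text           # model is done
            # *We* run the command, and we choose where: in the sandbox,
            # never on the harness host itself.
            result = sandbox.exec(reply.tool_call.command)
            messages.append({"role": "tool", "content": result})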
Tools, memories, sandboxing, steering, etc.
The harness is the part that makes the API calls, interacts with the user, makes the function calls, and keeps track of the conversation memory.
You can also use the LLM to summarize the conversation into a single shorter message so you get compaction. And instead of statically defining which functions are available to the LLM you can create an MCP server which allows the LLM to auto-discover functions it can call and what they do.
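Compaction, sketched under the same assumptions (llm.complete stands in for one more LLM call, and the threshold is arbitrary):

    MAX_MESSAGES = 40  # arbitrary threshold for the example

    def compact(messages, llm):
        """Replace a long transcript with a model-written summary."""
        if len(messages) <= MAX_MESSAGES:
            return messages
        prompt = "Summarize this conversation so far:\n" + repr(messages)
        summary = llm.complete([{"role": "user", "content": prompt}]).text
        return [{"role": "system",
                 "content": f"Summary of earlier turns: {summary}"}]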
That’s the whole magic of something like Claude Code. The rest is details.
Personally, for me it embodies a level of autonomy. I define that as an AI model with the potential to interact with something external to itself based on its output, where "external" includes its own future behavior.
1) It's still assuming agents have CLIs. This is a very developer-centric concept of agents, and doesn't map well to either consumer or enterprise agents that aren't primarily working with files. Skills, plans, TODO lists, and memory are good, but don't have to be modeled as raw file access. Many harnesses have tools for them.
2) It's talking about a singular sandbox. That's not good enough for prompt injection prevention, secure credential management, and limiting the blast radius of attacks.
Another benefit of moving the harness outside the sandbox is you get to avoid accidentally creating a massive distributed system and you therefore don't have to think so much about events/communication between your main API and your sandboxes.
- Easy single command CLI agent spawning with templates
- Automatic context transfer (i.e. a bit like git worktrees)
- Fully containerised, but remote (a bit like pods)
- Central, MITM-proxy zero-trust authn/authz management (no keys or credentials inside the agents; instead, credential enrichment happens in the hypervisor/encapsulation layer)
- Multi agent follow-up functionalities
- Fully self hosted/FOSS
Basically a very dev-friendly, secure, "kubernetes"-like solution for running remote agents.
Does anyone have an idea of how to achieve this, or potential technologies?
The reason agents work is because they have access to stuff by default. The whole world is context engineering at this point, and this proposal is to intermediate the context with a bespoke access layer. I put the bare minimum into getting my dev instance into a state where I can develop, because doing stuff (and these days: getting my agent to do stuff) is the goal.
This makes slightly more sense if you're building a SaaS and trying to get others to give you access to their code, their documents, and the rest so you can run agents against it. But the easiest, most powerful way is to just hook the agents up to the place that's already set up.
This problem is quite common and not limited to memories. For instance, Claude Code will block write attempts and steer the agent to perform a read first (because the file might have been modified in the meantime by the user or another agent).
Same principle here: rather than trying to deterministically “merge” concurrent writes, you fail the last write and let the agent read again and try another write.
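A minimal sketch of that fail-the-last-write idea, using a version counter for optimistic concurrency (illustrative names, not Claude Code's actual mechanism):

    class StaleWrite(Exception):
        pass

    store = {}  # key -> (version, content)

    def read(key):
        return store.get(key, (0, ""))  # (version, content)

    def write(key, content, expected_version):
        version, _ = store.get(key, (0, ""))
        if version != expected_version:
            # Someone else wrote in the meantime: fail this write and
            # make the agent re-read and retry, instead of merging blindly.
            raise StaleWrite(
                f"{key}: store has v{version}, caller had v{expected_version}")
        store[key] = (version + 1, content)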
Anyway. General advice: treat harnesses like any other (third-party) software that you run on your server. Modern harnesses (the ones from big companies that you need to subscribe to) are black boxes. Would you run a random binary you fetched from the internet on your server? Claude Code, Codex, etc. are exactly this.