Discussion (56 Comments)
Compared to how humans make a mess of things in the real world, how high does the bar really need to be for trusting AI agents? Even far shy of perfect, AI could still be a step-function improvement over trusting ourselves.
The issue with this is that you want to impugn, in the grand scheme of things, a small few individuals. And so you want to institute an AI system, which is controlled by the same individuals (or at least the same class of individuals, with the reach to abuse such a system).
I'll hear you out if AI becomes truly decentralized. Until then, no, this line of rhetoric is just justification for the surveillance state that's to come (to be fair, the surveillance state would pick yet another justification, regardless).
The SOTA labs are working on making models more capable and then adding guardrails for safety. It would be better to work on baking in incentive alignment, which probably means eliciting more incentive details from the LLM user. That's what I'd be working on at Apple, where the user might be induced to share a level of local-only details that could align the AI agents.
You can hold a human responsible for what they do; you can reward them, fire them, sue them, etc.
You cannot do any of those things with an LLM. The threat of termination means nothing to an LLM.
Like, there isn’t enough hype in the world to make people replace all knives, hammers, and screwdrivers with sawzalls. They have awesome utility for certain things and they’re a bad fit for other things.
Maybe we’ll get there with LLMs someday.
How many people say something like, “if I recall correctly”? The statement signals that we think we know, but we’re adding that disclaimer to protect ourselves from cancel culture.
People call that “Hallucination” when talking about an AI. It’s not hallucination, it’s beautiful imperfection.
Here is a fresh example from today of what a junior employee does when given unlimited agentic power: https://www.reddit.com/r/ClaudeAI/comments/1sv7fvc/im_a_nurs...
I think you will find it very hard to hold a Jr dev at a corp responsible.
I actually think you will find that it is easier to work with agents at a higher quality and lower legal risk than using Jr developers.
And this is only going to be amplified when it becomes common knowledge that AI poses less risk to projects than Jr staff.
But in my opinion, it is not even remotely close to the reliability of an educated human, communication wise.
If you gave a research task to a less experienced person, you wouldn’t expect them to convincingly lie about details.
It is useful as a review tool or boilerplate generator, but it does not fill the same role you would use a human for.
The same applies to an AI model.
And, since the same model would be deployed by many teams, unexpected behavior from that model even for a small subset of those teams means that it can't be promoted.
https://en.wikipedia.org/wiki/Four_stages_of_competence
I feel like people may be viewing the past with rose-colored glasses. Computing in the 90s meant hitting Ctrl-S every 5 seconds because you never knew when the application you were using was going to crash. Most things didn't "just work", but required extensive tweaking to get your RAM, sound card, etc. to work at all.
The tower of abstractions we're building has reached a height that actually makes everything more fragile, even if the individual pieces are more robust.
That was in the Windows world. Maybe in the Mac world too?
Not so much in the *nix world.
Windows seems to have improved its (crash) reliability since then though, which I suppose is nice. :)
Have people outgrown this unnecessary habit? Haha
Muscle memory is a bitch!
This is the issue; agents introduce more unexpected behavior, at least for now.
My gut is that always-on "agents who can do things unexpectedly" are a dead end, but what AI can do is get you to a nice AND predictable "workflow" more easily.
e.g. for now I don't like AI for dealing with my info, but I love AI helping me make more and better bash scripts that deal with my info.
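As a sketch of that pattern (the task and paths here are made up, and it's Python rather than bash just for concreteness): have the AI write the workflow once, review it, and let plain deterministic code be the only thing that touches your info.

```python
# Sketch of the "AI writes the workflow, the workflow touches the
# data" pattern. The task (archiving old notes) and paths are
# hypothetical; the point is that this script is reviewable and does
# the same thing on every run, unlike a live, always-on agent.
import shutil
from pathlib import Path

def archive_notes(src: Path, dst: Path, suffix: str = ".txt") -> list[Path]:
    """Move every *suffix* file from src into dst; report what moved."""
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for f in sorted(src.glob(f"*{suffix}")):
        target = dst / f.name
        shutil.move(str(f), str(target))
        moved.append(target)
    return moved

if __name__ == "__main__":
    for path in archive_notes(Path("notes"), Path("notes/archive")):
        print(f"archived {path}")
```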
I had occasional crashes, sure, but unless you had some very dodgy computers, it seems like you're overcorrecting for those supposed rose-colored glasses.
I never knew anyone in the '90s who was constantly living in fear of their programs crashing and losing their work.
I used computers back then and many things just worked fine. I found Windows XP way more predictable and stable than any of its successors.
THIS.
I lost so much work in the 90s and 00s. I was a kid, so I had patience and it didn't cost me any money. I can't imagine people losing actual work presentations or projects.
Every piece of software was like this. It was either the app crashing or Windows crashing. I lost Flash projects, websites, PHP code.
Sometimes software would write a blank buffer to file too, so you needed copies.
Version control was one of my favorite discoveries. I clung to SVN for the few years after I found it.
My final major loss was when Open Office on Ubuntu deleted my 30 page undergrad biochem thesis I'd spent a month on. I've never used it since.
If you merely block a specific action, they will find another way to do what they're trying to do. Agent security requires controlling the agent's intent.
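To make that concrete, here is a minimal sketch (the tool names are hypothetical) of how a per-action blocklist leaves the agent's intent untouched:

```python
# Minimal sketch of an action-level gate. Tool names (delete_file,
# run_shell) are hypothetical stand-ins for whatever the agent exposes.

BLOCKED = {"delete_file"}

def gate(tool_name: str, args: dict) -> bool:
    """Allow a tool call unless its name is on the blocklist."""
    return tool_name not in BLOCKED

# The gate stops the direct route...
assert gate("delete_file", {"path": "notes.txt"}) is False

# ...but not an equivalent action routed through a different tool, so
# the intent (get rid of the file) still succeeds. Controlling intent,
# not individual actions, is the hard part.
assert gate("run_shell", {"cmd": "rm notes.txt"}) is True
```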
So I'm sympathetic to the criticism, especially since composition of formal methods & analyzing their effects is still very much a hard problem (and not just computationally; philosophically, often, for the reason I listed above).
That being said, I don't know a better solution. Begging the agent with prompts doesn't work. Are you suggesting some kind of mechanistic interpretability, maybe?
MCP gives you open standards on the tool layer, but the harness (Claude Code, Cursor) is still proprietary. Your product is one Anthropic decision away from breaking.
The user-agent role the post calls for needs open harnesses, not just open standards. Otherwise we end up rebuilding mobile under a new name.
[1] https://github.com/mistralai/mistral-vibe
[2] https://goose-docs.ai/
- https://github.com/badlogic/pi-mono/
- https://github.com/anomalyco/opencode
If you've actually migrated an existing Claude Code setup to one of them, I'm curious how the portability story worked. That's the part I'd been worried about.
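For a sense of what the open tool layer does buy you, here is a minimal sketch of an MCP server, assuming the official `mcp` Python SDK and its FastMCP helper (the tool itself is a toy). The same server definition can be attached to Claude Code, Cursor, or an open harness; it's the harness side that stays proprietary.

```python
# Sketch of MCP's portability story. Assumes the official Python SDK
# (pip install mcp) and its FastMCP helper; word_count is a toy tool.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("notes-tools")

@mcp.tool()
def word_count(text: str) -> int:
    """Count the words in a snippet of text."""
    return len(text.split())

if __name__ == "__main__":
    # Serves over stdio; any MCP-speaking harness can attach to it,
    # which is what survives when a proprietary harness changes.
    mcp.run()
```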
OpenCode is another one to consider looking at: https://opencode.ai/ Not sure I'd recommend it, but it's worthy of consideration, as is Pi.
Also, consider that you can build your own. I've got Claude Code in the background working on improvements to my own harness (just for myself) at the moment. My intention, though, is a mini API-only Claude Code that I can use on retro machines that don't support it, so I don't need the full Claude Code feature set.
The problem is that the agent itself is the attack surface. An adversary who controls the communication channel can manipulate what the agent believes about who it's talking to, which means anything it holds, its list of authorized actions, a shared secret you gave it, whatever, can be exfiltrated in ways the agent can't detect because the manipulation happens below the layer where it can reason about trust.
Open harnesses and open standards help but they don't close this gap, because the thing you need to trust, the agent's own judgment about its principal, is exactly what gets compromised. The trust chain has to go below software entirely: hardware attestation, signed commands with keys the agent can verify but never access. That's really an OS problem dressed up as an agent architecture problem.
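A minimal sketch of the "verify but never access" half of that, assuming the `cryptography` package. In a real deployment the signing key would live in a secure element; it is generated in-process here only so the example runs end to end:

```python
# The agent holds only a public key; commands are signed by a private
# key it never sees. Assumes the `cryptography` package (Ed25519).
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# In practice this key lives in hardware; generated here for the demo.
principal_key = Ed25519PrivateKey.generate()
agent_pubkey = principal_key.public_key()

command = b"fetch:read-only-report"
signature = principal_key.sign(command)

def agent_accepts(cmd: bytes, sig: bytes) -> bool:
    """Verify provenance before acting, without holding the signing key."""
    try:
        agent_pubkey.verify(sig, cmd)
        return True
    except InvalidSignature:
        return False

assert agent_accepts(command, signature)
# A command injected through the communication channel fails closed:
# the signature doesn't match, however convincing the text around it is.
assert not agent_accepts(b"fetch:everything", signature)
```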
AI agents are the destination; there is no return click to bargain with. That's why Cloudflare just went default-block + 402 Payment Required instead of waiting on a standards body.
Open standards on the agent side are the easy half. Getting sites to show up is the part W3C can't fix alone.
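From the agent's side, default-block + 402 looks roughly like this. Only the status code is standard HTTP; any payment negotiation layered on top of it is an assumption:

```python
# Sketch of an agent-side fetch hitting a default-block site.
# 402 Payment Required is standard HTTP; how payment would actually
# be negotiated is left open (and is exactly the unsettled part).
import requests

def agent_fetch(url: str) -> str | None:
    resp = requests.get(url, headers={"User-Agent": "example-agent/0.1"})
    if resp.status_code == 402:
        # The site demands payment before serving bot traffic; with no
        # agreed payment rail, the agent simply gets nothing back.
        print(f"{url}: 402 Payment Required, no content")
        return None
    resp.raise_for_status()
    return resp.text
```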
Second half: specious claims about AI mostly based on a vague "we don't know what they can do, so maybe they can do anything?" rhetorical maneuver.
There is no legitimate intermediate position; the skew will go one way or the other.
Such a thing can’t be enforced and it can be flipped on a dime.
You should play around with local LLMs and system prompts to experience it.
If you don't recognize the technical limitations that produced agents, you're wearing rose-tinted glasses. LLMs aren't approaching the singularity; they're topping out in power, and agents are an attempt to extend useful context.
The sigmoid approacheth, and anyone of merit should be figuring out how the harness spits out agents, intelligently prunes context, then returns the best operational bits, alongside building the garden of tools.
It's like the agents are the muscles, the harness is the bones, and the root parent is the brain.
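As a sketch of that skeleton, with `run_subagent` as a hypothetical stand-in for a model call and a deliberately naive pruning policy:

```python
# Sketch of the bones/muscles split: the harness spawns scoped agents
# and keeps only their distilled output in the parent context.
# run_subagent() is a hypothetical stand-in for a model call.

def run_subagent(task: str, context: str) -> str:
    """Stand-in: a real harness would send task + context to an LLM
    and collect the full working transcript back."""
    return f"...long transcript while working on {task!r}...\nRESULT: done"

def prune(transcript: str, max_chars: int = 200) -> str:
    """Naive pruning: keep the tail, where conclusions tend to land.
    A real harness would summarize or score spans instead."""
    return transcript[-max_chars:]

def harness(tasks: list[str]) -> str:
    parent_context = ""
    for task in tasks:
        transcript = run_subagent(task, parent_context)
        # Only the best operational bits flow back up to the root.
        parent_context += "\n" + prune(transcript)
    return parent_context

print(harness(["survey the codebase", "draft the migration plan"]))
```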