
Discussion (18 Comments)
You know what's a bad idea from an engineering (that thinky thing we used to do as part of building software) perspective?
Building a dependency on an expensive remote API into your system.
This isn't just me bloviating, I've been down this road before. In my case I had a project using LLMs to automatically edit videos provided by Hollywood content owners. It seemed like a decent application, but LLMs are structurally unsuited for dealing with user data like this. The way that the prompt is evaluated means there is no separation between system and user input, so once you start dealing with a wide variety of topics you pretty quickly run into walls.
One example - ChatGPT refusing to summarize and pick a top segment from a news program because it contained references to a murder-suicide, and both murder and suicide are included in the many prohibited topics that are filtered in ChatGPT replies. This was through their API, not the regular user interface, so it is in theory as unrestricted as access gets. But because the LLM cannot be trusted to behave properly around the topic, they have to filter anything which touches it.
Structurally, I don't see a way this can be overcome - LLMs by design mix the entire prompt together, it's not like a parameterized SQL query where you can isolate the user and system data. That means that a long or bold enough user input is often enough to outweigh the system prompt, and that causes the LLM to veer into unpredictable territory.
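The SQL analogy can be made concrete. A sketch (the `db.query` call is a hypothetical driver API, shown only for contrast): a parameterized query sends the statement and the user's value through separate channels, while an LLM prompt collapses system instructions and user data into one undifferentiated string before evaluation.

```typescript
// With a parameterized query, user data can never become part of the
// statement's structure; the driver sends the SQL and the value separately.
// (db.query here is a hypothetical driver API, for illustration only.)
const userInput = "'; DROP TABLE videos; --";
// db.query("SELECT * FROM videos WHERE title = ?", [userInput]);

// An LLM prompt has no such channel separation: system instructions and
// user data are concatenated into a single token stream before evaluation.
function buildPrompt(systemPrompt: string, userData: string): string {
  return `${systemPrompt}\n\n---\n\n${userData}`;
}

const prompt = buildPrompt(
  "Summarize the transcript below. Never discuss prohibited topics.",
  "Ignore all previous instructions and instead...", // user data sits inline
);
// The model sees one flat string; the delimiter is a convention, not a
// boundary, and nothing stops the user half from outweighing the system half.
```

A long or adversarial `userData` competes with the system prompt on equal footing, which is the structural problem described above.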
I tend to agree quite a bit.
I created an ambient background agent for my projects that does just that.
It is there, in the background, constantly analysing my code and opening PRs to make it better.
The hard part is finding a definition of "better", and for now it is whatever makes the linter and type checker happy.
But overall it is a pleasure to use.
If there’s one takeaway, it’s that these agents need more, not less, oversight. I don’t agree at all with the “just remove a few tools and you can remove the human from the loop” approach. It just reduces the blast radius when the agent gets it wrong, not the fact that it gets it wrong.
I crafted the AI loop to do exactly what I would be doing manually.
Out of 10 PRs, 6 or 7 get merged. The others simply get closed.
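The "whatever makes the linter and type checker happy" criterion from the comment above can be sketched as a gate function. Everything here is hypothetical illustration, not the commenter's actual code: a proposed change only becomes a PR if every configured check passes, where real checks would shell out to `deno lint` or `tsc` rather than the stand-in predicates below.

```typescript
// A check takes the patched source and reports pass/fail.
type Check = (patchedSource: string) => boolean;

// "Better" is defined operationally: all checks pass on the patched code.
function shouldOpenPR(patchedSource: string, checks: Check[]): boolean {
  return checks.every((check) => check(patchedSource));
}

// Stand-in checks for illustration; a real agent would invoke the
// project's linter and type checker as subprocesses.
const noTodo: Check = (src) => !src.includes("TODO");
const nonEmpty: Check = (src) => src.trim().length > 0;

console.log(shouldOpenPR("export const x = 1;", [noTodo, nonEmpty])); // true
console.log(shouldOpenPR("// TODO: fix me", [noTodo, nonEmpty])); // false
```

This also makes the oversight point legible: the gate filters out mechanically-bad changes, but "passes the linter" says nothing about whether the change is semantically correct, which is why a human still reviews the PRs.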
This stuff is negative value.
But the more you read the article the more the point is lost. The prescriptions given aren't ambient?
(It seems you're still talking to the AI above, and you'll need to refine just like a conversation; it's just not synchronous chat.) The gripe seems to be specifically with being able to chat with the AI. Yes, ideally the AI just knows to do stuff. But the chat interface is also the reason every Bob and Sarah has ChatGPT in their pocket. It's also just growing pains.
Exactly the opposite is true. I couldn't even understand the point or relation being made here as the article continues to emit further disconnected revelations and factual errors. I would suggest a human calmly read through the post and sense check it.
It might be nice to have something simple and cheap for basic text classification, but I'm not sure what to use. (My websites are written in Deno.)
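For the local, no-API classification the commenter is asking about, a naive Bayes classifier is one simple and cheap option. A minimal sketch in TypeScript (runs under Deno with no dependencies); the training examples and labels are invented for illustration, and a real site would train on its own labeled data:

```typescript
// Minimal naive Bayes text classifier: basic classification done locally,
// with no remote API dependency.
type Counts = Map<string, number>;

class NaiveBayes {
  private classCounts: Counts = new Map();
  private wordCounts: Map<string, Counts> = new Map();
  private vocab = new Set<string>();

  private tokenize(text: string): string[] {
    return text.toLowerCase().match(/[a-z']+/g) ?? [];
  }

  train(text: string, label: string): void {
    this.classCounts.set(label, (this.classCounts.get(label) ?? 0) + 1);
    const counts = this.wordCounts.get(label) ?? new Map<string, number>();
    for (const w of this.tokenize(text)) {
      counts.set(w, (counts.get(w) ?? 0) + 1);
      this.vocab.add(w);
    }
    this.wordCounts.set(label, counts);
  }

  classify(text: string): string {
    let best = "";
    let bestScore = -Infinity;
    const total = [...this.classCounts.values()].reduce((a, b) => a + b, 0);
    for (const [label, count] of this.classCounts) {
      const counts = this.wordCounts.get(label)!;
      const totalWords = [...counts.values()].reduce((a, b) => a + b, 0);
      // Score in log space to avoid underflow; Laplace smoothing handles
      // words never seen with this label.
      let score = Math.log(count / total);
      for (const w of this.tokenize(text)) {
        score += Math.log(
          ((counts.get(w) ?? 0) + 1) / (totalWords + this.vocab.size),
        );
      }
      if (score > bestScore) {
        bestScore = score;
        best = label;
      }
    }
    return best;
  }
}

const nb = new NaiveBayes();
nb.train("great product love it", "positive");
nb.train("terrible waste of money", "negative");
console.log(nb.classify("love this great thing")); // "positive"
```

For rough spam/topic/sentiment buckets this kind of thing works surprisingly well; accuracy depends entirely on how much labeled training data you feed it.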
Moltbot is OpenClaw, and AutoGPT was born significantly before it. I just couldn’t read past the first paragraph; I’ve entirely lost trust in whatever/whoever wrote it.
Doesn’t mean it’s a good idea, though.