Discussion (178 Comments)
I have experimented with a lot of hacks, like hierarchies of indexed md files, semantic DBs, embeddings, dynamic context retrieval, but none of this is really a comprehensive solution to get something that feels as intelligent as what these systems are able to do within their context windows.
I am also a touch skeptical that adjusting weights to learn context will do the trick without a transformer-like innovation in reinforcement learning.
Anyway, I'll keep tinkering…
I haven't seen any issues with memory so far. Using one long rolling context window, a diary and a markdown wiki folder seems sufficient to have it do stuff well. It's early days still and I might still encounter issues as I demand more, but I might just create a second or third bot and treat them as 'specialists' as I would with employees.
- I suspect that in this moment, cobbling together your own simple version of a “claw-alike” is far more likely to be productive than a “real” claw. These are still pretty complex systems! And if you don’t have good mental models of what they’re doing under the hood and why, they’re very likely to fail in surprising, infuriating, or downright dangerous ways.
For example, I have implemented my own "sleep" context compaction process, and while I'm certain there are objectively better implementations than mine, mine is legible to me, and therefore I can predict with some accuracy how my productivity tamagotchi will behave day-to-day in a way that I could not if I hadn't been involved in creating it.
(Nb I expect this is a temporary state of affairs while the quality gap between homemade and “professional” just isn’t that big)
- I do use mine as a personal assistant, and I think there is a lot of potential value in this category for people like me with ADD-style brains. For whatever reason, explaining in some detail how a task should be done is often much easier for me than just doing the task (even if, objectively, there’s equal or higher effort required for the former). It therefore doesn’t do anything I _couldn’t_ do myself. But it does do stuff I _wouldn’t_ do on my own.
An LLM context is a pretty good extended short-term memory, and the trained network is a very nice comprehensive long-term memory, but due to the way we currently train these networks, an LLM is just fundamentally unable to "move" these experiences to long-term memory the way a human brain does (through sleep, among other things).
Until we can teach a machine to experience something once and remember it (preferably on a local model, because you wouldn't want a global memory to remember your information), we just cannot solve this problem.
I think this is probably the most interesting field of research right now: actually understanding in depth how the brain learns, and figuring out a way to build a model that implements this. Because right now, with backpropagation and weight adjustments, I just can't see us getting there.
I like the idea of using a small local model (or several) for tackling this problem, like low-rank adaptation, but with current tech I still have to piece this together, or the small local models will forget old memories.
That's not how an LLM can work right now; it needs too many iterations and a much bigger dataset than what we can work with. A human can experience something a single time and remember it. That's orders of magnitude more efficient than what an LLM can currently achieve.
This field of research has been around for decades, so who's to say when there'll be a breakthrough.
In fact, LLMs are great despite our very limited understanding, and not because we had some breakthrough about the human brain.
The way an LLM learns is very interesting, but it sure isn't what the brain is doing.
But it's indisputable: we can get enormous results with this technique. It's just probably not the way forward for the faster learning needed to remediate the issue of context loss.
It's like a spontaneous implementation of thought experiments from yesteryear. I wonder if all this product-focused experimentation will accidentally impact philosophy of mind after all...
I think we will eventually end up with models which can be individually trained and customised on regular schedules. After that, real-time.
I've had a WhatsApp assistant since 2023, jailbroken into an easy assistant. The only thing I kept using is transcription.
https://github.com/askrella/whatsapp-chatgpt was released 3 years ago and many have extended it for more capabilities; it's arguably more performant than OpenClaw, since it can run in all your chat windows. But there's still no use case.
It’s really classification and drafting.
Yes, you can automate via scripting, but interacting with a process using natural language, because every instance could be different and not solid enough to write a spec for, is really handy.
tl;dr: there's a place for "be liberal in what you receive and conservative in what you send", but only now have LLMs provided us with a viable way to make room for "be loosey goosey with your transput"
I mean, using AI is a great way to interpret a query, determine if a helper script already exists to satisfy it, and, if not, invoke a subagent to write a new one.
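The dispatch step described here can be sketched very simply. This is a hedged illustration, not the commenter's actual setup: the `helpers/` directory of `.sh` scripts and the name-in-query matching rule are assumptions, and a real version would let the model pick a script (or have a subagent write a new one) when this naive match fails.

```python
from pathlib import Path

def find_helper(query: str, script_dir: str) -> "Path | None":
    # Naive dispatch: reuse a helper script whose (dash-separated)
    # name appears in the query. A real version would ask the model
    # to choose, and fall back to a subagent that writes a new script.
    for script in sorted(Path(script_dir).glob("*.sh")):
        if script.stem.replace("-", " ") in query.lower():
            return script
    return None  # no existing helper satisfies the query
```

The point of the sketch is the shape of the decision, not the matching heuristic: "existing script, or generate one" is a cheap deterministic check wrapped around an expensive probabilistic fallback.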
The problem with your "script" approach is: how does that satisfy unknown/general queries? What if, for one run, you want to modify the script's behavior?
But there's still value in people exploring new spaces they find interesting, even if they do not meet your personal definition of pareto-optimal.
I love the concept but I've never hosted such a terrible piece of software. Every update breaks something new or introduces another "anti-feature" that's enabled by default.
The documentation is often lagging behind and the changelog has such a low signal to noise ratio that you need a LLM to figure out what upgrading will break this time. For now I've just given up on updates and I've been patching bugs directly in the JS when they bother me enough.
If OpenClaw is the future of software I'm honestly a bit scared for the industry.
I'm open to suggestions, I tried Zeroclaw and Nullclaw but they're bad in their own way. I would like something that's easy to run on Kubernetes with WhatsApp integration and most important, stable releases.
I think it's mainly the industry wannabes gathering around a "sexy" brand name again, when they're really more interested in "AI as personal assistants".
OpenClaw just has the most traction despite being a hot mess, because the people hyping it up don't know how bad the codebase is, or because they want to launch something first and switch it over to a more credible alternative after.
that sounds like an oxymoron.
But it’s so terribly unstable, it’s as if nobody actually tests things before pushing a release. I don’t need 2 updates per day, I just need one every few weeks that’s stable.
It's not as fun with SOUL.md etc but so far much less janky.
But if you think you need an agent framework to use a prompt you're going to love this one simple trick...
First of all, it's not an LLM; you're beholden to an API or the limitations of a local LLM. Second of all, it's always calendars, email replies, summarizing.
You do not need an LLM for that, and an LLM doesn't make it easier either. It sounds like executive cosplay, not productivity. Everything I see people talking about that's actually productive is being done probabilistically when deterministic tools already exist, and have for, in some cases, over 20 years.
You don't need an LLM to put a meeting on a calendar; that's literally two taps on your phone or a single click in Gmail. Most email services have suggestions built in already. Emails have been summarized for 10 years at this point. If you're so busy you need this stuff automated, you probably have an assistant, or you're important enough that actually using general intelligence is critical to being successful at all.
The idea of getting an LLM email response sounds great for someone who has never worked a job in their life.
This comment section is full of LLM-written responses too, to the point where it's absurd. Notice how most of them just talk in circles, like: "But I think many people criticizing the various Claws are missing out on the cronjob aspect. There's value in having your AI do work automatically while you're asleep. You don't even need OpenClaw for that, just a cronjob that runs claude -p in the early morning. If you give your AI enough context about yourself, you get to a point where it just independently works on things for you, and comes to you with suggestions. It doesn't need to be specifically prompted. The environment of data it can access is its own context, its own prompt. With that, it can sometimes be surprising and spooky what you wake up to, without being directly prompted."
This literally isn't even saying anything. The paragraph does not mean anything. It's not saying what it's doing, what's happening, or what the result is, just that "something is happening".
No, you didn't save time using openclaw, you just changed to managing openclaw instead of doing your actual job.
You don't need custom scripts for most things; if it's actually something that matters, the tools already exist, and if you do need custom scripts, OpenClaw isn't going to help you write them.
I can't even tell if these replies are in fact just astroturfed bot armies flooding us with marketing or there really is an entire generation of people out there right now who can't do anything unless their phone is telling them what to do.
And where are the outcomes? Okay, you've got OpenClaw telling you every few hours how many calories you've had so far today. Have you gotten leaner? Faster? Stronger? Healthier or fitter by any quantifiable objective metric at all? Or are you just doing exactly what you did before but now your phone is scripting it for you?
Some people have developed personal coping strategies for neurodivergence, and can probably do better with some AI assistance.
I myself tend to live by my calendar. Even then, I may eventually forget to follow up on something until it's too late, because I'm busy or overwhelmed with other things.
When Claude Code was released, there was a community leaderboard where people competed over who could waste the most tokens. Let that sink in.
I know people, especially people who write code, like to blame "the other clueless people" for ruining their cheap token plan. But we aren't stuck in traffic. We are the traffic.
My team is currently using it for:
- SDR research and drafting
- Proposal generation
- Staging ops work
- Landing page generation
- Building the company processes into an internal CRM
- Daily reporting
- Time checks
- Yesterday I put together a proposal from a previous proposal and meeting notes (40k worth)
All your use cases are fairly well handled by conventional LLMs. OpenClaw is a security nightmare, so it's probably worth switching away.
OpenClaw was never meant to be a tool that could do things you couldn't do without it.
Also, whenever someone points out you could accomplish something without it, they underestimate the effort needed. In the examples I'm thinking of, someone simply asked OpenClaw to do something, had a few back-and-forths with it, and it was done. I have yet to see someone say "Oh, I can do that without OpenClaw" and then go ahead and do it within 10 minutes.
Not once.
OpenClaw is flawed, but the convenience is an order of magnitude higher than anything else.
You offered nothing to support this. My OpenClaw is realistically just an agent in Discord versus the CLI. That's not an "order of magnitude" more convenient. Anthropic already has a tool for it: https://code.claude.com/docs/en/remote-control
Method 2: send natural language instructions to OpenClaw to use Claude Code to do the same thing
Sorry, my tiny brain says method 2 is doing the same thing with extra steps.
10% done by an assistant that’s been trained on the task (or a dev or me)
80% heavy lifting done by claw
10% review and corrections
Nothing my agents do is something we didn't previously do. But now I can get moderate-to-good results with a lot less effort, allowing the business to expand whilst keeping costs controlled.
- Telegram Health Group: created an agent to help me track sleep, recommend supplements based on my location, and remind me morning and evening to monitor my food. I send it images of what I eat and it keeps track of it.
- Telegram Career Group: I randomly ask it to find certain kinds of job posts based on my criteria. Not scheduled, only when I like to.
- Telegram Coder Group: gave it access to my GitHub account. It pulls, runs tests, and merges dependabot PRs in the mornings, and tells me if there are any issues. I also ask it to look into certain bugs and open PRs while I'm on the road.
- Telegram News Group: I gave it a list of YouTube videos and asked it to send me news every day at 10am, similar to the videos.
So far, it's a super-easy assistant taking on multiple personas. But it's getting a bit painful without a CC subscription.
Or the CIA has set up inside your closet with a listening device!
https://github.com/GhostPawJS/codex
I'd say it's like 85% reliable on any given task, and since I supervise it, that's good enough for me. But for something to be useful autonomously, that number needs to be several 9s, and we're nowhere near that yet.
I'm currently watching someone try, and fail, to roll OpenClaw out at scale in an org. They believe in it so much that it's very difficult to convince them it will not work, even with glaring evidence staring them in the face.
For example, for the invitations in the OP: have OpenClaw write incoming RSVPs to a database (probably a flat file here) and use the DB as persistent memory. OpenClaw can compose outgoing update emails based on the database. Don't even suggest to OpenClaw that it try to remember the RSVPs; its job is just writing to and reading from a database, and composing emails based on the latter.
Does that violate the experiment, by using some tool in addition to OpenClaw?
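The flat-file "database" proposed above can be tiny. A minimal sketch, assuming a JSONL file named `rsvps.jsonl` and a name/attending field shape (both are illustrative choices, not anything from the OP):

```python
import json
from pathlib import Path

DB = Path("rsvps.jsonl")  # flat-file "database"; name is an assumption

def record_rsvp(name: str, attending: bool) -> None:
    # Write side: the agent's only job is to append one row per reply.
    with DB.open("a") as f:
        f.write(json.dumps({"name": name, "attending": attending}) + "\n")

def attendees() -> list:
    # Read side: update emails are composed from this list, so the
    # model never has to "remember" anything across sessions.
    if not DB.exists():
        return []
    rows = [json.loads(line) for line in DB.read_text().splitlines()]
    return [r["name"] for r in rows if r["attending"]]
```

The design choice is the one the comment argues for: persistence lives in a boring file, and the LLM is only ever asked to transform between text and rows.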
The other common use case seems to be kicking off an automated Claude session from an email / voicetext / text / Telegram, and getting replies back. I'm emailing Claude throughout the day now, and sometimes it's useful to just forward an email to Claude and ask it to handle the task within it for me.
But I think many people criticizing the various Claws are missing out on the cronjob aspect. There's value in having your AI do work automatically while you're asleep. You don't even need OpenClaw for that, just a cronjob that runs claude -p in the early morning. If you give your AI enough context about yourself, you get to a point where it just independently works on things for you, and comes to you with suggestions. It doesn't need to be specifically prompted. The environment of data it can access is its own context, its own prompt. With that, it can sometimes be surprising and spooky what you wake up to, without being directly prompted.
Give it enough context, long term memory, and ability to explore all of that, and useful stuff emerges.
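The cronjob version of this is genuinely one line. A hypothetical crontab entry, where the working directory, 5am schedule, prompt file, and log path are all placeholders:

```shell
# Unattended morning run: pipe a standing prompt into a non-interactive
# Claude session ("claude -p") and log whatever it does overnight.
0 5 * * * cd /home/me/assistant && claude -p "$(cat morning-prompt.md)" >> logs/overnight.log 2>&1
```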
Enforcement seems to be a combination of string matching for 3rd party sysprompts, heavy usage, and some random factor.
Not clear if there are any hard rules you can stay on the good side of, the only way to be safe seems to be to pay per token. (There goes the ~90% discount!)
--
Also yeah you get ~80% of Claw by shoving Claude Code in a Telegram Bot ;) It's already a general purpose computer use thing, people forget! (And it's a lot better at extending itself than the actual claws, lol)
I think the least illegal and also least bad option is to just use ngrok and tmux tho
I have some issues with the article, but I agree with some of the conclusions: It's great tinkering with it if you have time to spare, but not worth using weeks of your time trying to get a perfect setup. It's just not that reliable to use up so much of your time.
I will say, it's still amongst the best tools to do a variety of tasks. Yes, each one of those could be done with just a coding agent, but I found it's less effort to get OpenClaw to do it than you writing something for each use case.
Very honest question: One of the use cases I had with OpenClaw that I'm missing now that I don't use it: I could tell it (via Telegram) to add something to my TODO list at home while I'm in the office. It would call a custom API I had set up that adds items to my TODO list.
How can I replicate this without the hassle of setting up OpenClaw? How would you do it?
(My TODO list is strictly on a home PC - no syncing with phone - by design).
(BTW, the reason I stopped using OpenClaw is boring: My QEMU SW stopped working and I haven't had time to debug).
All the existing, commodity todo list apps on the market can't address your use cases?
At least I can't tell there is anything you can't do on your personal phone.
Nope. I've custom-honed my TODO system since 2009. I'm not switching to someone else's app.
And I don't use phones.
Writing a script to make a POST request is something assistants have been able to do for quite a while now.
And if you have a Claude subscription, you can use Dispatch to directly write to your PC's drive, no API needed.
The general idea is make a simple deterministic program that runs on your PC at home in a never ending loop. Every minute or so, check Telegram for a new message. If a message is received, then the program runs "claude -p" with a prompt, whatever MCP tools or CLI permissions it needs, and the contents of your Telegram message. Just leave the program running on your home computer while you're out, and you're done.
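The loop described above fits in a few lines. A sketch under stated assumptions: the standing prompt text is invented, `fetch_new_messages` stands in for whatever transport you use (Telegram polling, a local inbox file), and permissions/MCP flags would be added to the command as needed:

```python
import subprocess
import time

def build_command(message: str) -> list:
    # Wrap the incoming message in a standing prompt and run Claude
    # non-interactively via "claude -p". Tool permissions and MCP
    # config flags would be appended here too.
    prompt = "You are my home assistant. Handle this request:\n" + message
    return ["claude", "-p", prompt]

def run_forever(fetch_new_messages, interval_s: int = 60):
    # fetch_new_messages() is your transport of choice; this polling
    # loop is essentially the whole program.
    while True:
        for msg in fetch_new_messages():
            subprocess.run(build_command(msg), check=False)
        time.sleep(interval_s)
```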
I don't use Telegram, so coding the part to check Telegram would be the hard part. I use email instead, and have the program check every minute for new mail (I leave my email program running and check the local inbox file). I'd already coded up a local MCP server to manage my ToDo list (Toodledo) so Claude just calls the MCP tools to add the task.
However, it was really nice being able to use Telegram and get quick validation. I also had a flow set up where I could send a voice memo. It would take the audio file (ogg), run Whisper, and then pass through an LLM for cleanup, and follow the instructions in my message. Really handy to use while I'm walking around.
I guess I want to create my own OpenClaw like agent, but not with its crazy broad access: Just limited to the functionality I allow, and with the convenience of using Telegram. I don't care about memory, soul, etc.
My reverse audio reply loop is convoluted - I have Claude generate its TTS file from Whisper/Mistral, and upload them to a server with an RSS file it updates, so I can play them in my podcast app (AntennaPod), then send me a notification via Pushover that the reply is waiting. I ended up building out an MCP tool for that workflow, so Claude really just calls the MCP tool with the text of what it wants to say, everything else is a deterministic program doing the work.
Memory is really useful to have - it can just be a bucket of searchable Markdown files. It's also useful to have a "reminders to self" Markdown file that Claude reads each time, and that Claude can update. I don't continue the same context window, and that "reminders to self" plus the ability to read previous emails in the conversation seems to be enough to keep the context going for me.
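That "bucket of searchable Markdown files" can be as unsophisticated as a case-insensitive grep. A sketch, where the `memory/` folder name is an assumption:

```python
from pathlib import Path

def search_memory(term: str, memory_dir: str) -> list:
    # "Memory" is just a folder of Markdown notes; return matching
    # lines prefixed with the note they came from, so the model can
    # quote its own past context.
    hits = []
    for note in sorted(Path(memory_dir).glob("*.md")):
        for line in note.read_text().splitlines():
            if term.lower() in line.lower():
                hits.append(note.name + ": " + line.strip())
    return hits
```

Exposed as a tool, this plus a "reminders to self" file covers a surprising amount of what heavier memory systems promise.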
You'll feel better if you know exactly how your Claw is locked down. Mine doesn't have the open email access others are granting, not at all. Claude gets a bit grumpy about that and keeps begging for more access :)
https://github.com/a-n-d-a-i/ULTRON
It also supports Codex :)
I felt pretty clever until (1) I found a repo where they used this trick to create a full OpenAI compatible API endpoint[0] (lmao, the VC money distortion field spawning truly comical Rube Goldberg machines), and (2) they started banning "unauthorized" usage of the Claude sub, which trend unfortunately seems to be accelerating recently as their lower value consumers have grown in both number and usage.
I think shoving claude -p in your bash script / cronjob / messaging-app bot of choice counts as an "unauthorized 3rd party harness", but your guess is as good as mine...
(claude -p with per-token billing (i.e. paying 7x more) is allowed though, of course)
-- There's also an Agents SDK (formerly Claude Code SDK?) which is basically just claude -p but with more typing, as far as I could tell.
[0] https://github.com/router-for-me/CLIProxyAPI
[0b] Honorable mention https://github.com/kronael/claude-serve
What you are looking for is an orchestration platform such as n8n or windmill.dev. You can still have a telegram bot and still use LLM for natural language interaction, but it's much more controlled than OpenClaw. I do exactly what you describe, add todos to my todoist account from telegram.
You can do anything if you believe!
Re: QEMU: For the sandboxing I realized what I actually wanted was "it can't read/nuke my files", so I made a non-privileged linux user and added myself to its group. So I can read/write its files, but not the reverse.
You can use anything to call this API, right? I have multiple iPhone Shortcuts that do this. Heck, I think you can even use Siri to trigger the shortcut and make it a voice command (a bit unsure, it's been a while since I played with voice).
The API is on my home PC and not exposed to the outside world. Only OpenClaw via Telegram was. So my question is about the infrastructure:
How do I communicate with something at home (it could be the API directly) using a messaging app like Telegram? I definitely want an LLM in the mix. I want to casually tell it what my TODO is, and have it:
- Craft it into a concise TODO headline
- Craft a detailed summary
- Call the API with the above two.
I'm not asking in the abstract. What specific tools/technologies should I use?
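A minimal sketch of the last two steps: ask the LLM (via its prompt) to reply with a JSON object holding the headline and summary, then POST that to the home API. The JSON shape and the `localhost` endpoint are assumptions standing in for the custom TODO API described above:

```python
import json
import urllib.request

def parse_todo_reply(llm_reply: str) -> dict:
    # The prompt instructs the LLM to answer with a JSON object
    # containing a concise headline and a detailed summary.
    item = json.loads(llm_reply)
    return {"headline": item["headline"].strip(),
            "summary": item["summary"].strip()}

def post_todo(item: dict, endpoint: str = "http://localhost:8080/todo") -> None:
    # Endpoint is a placeholder for the home PC's custom TODO API.
    req = urllib.request.Request(
        endpoint,
        data=json.dumps(item).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req).close()
```

The messaging side is then just the polling loop others in this thread describe (check Telegram or a mailbox, feed the message to the LLM, pass its reply through these two functions).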
If you aren't a programmer it's also the kind of small project that LLMs are great at, there are many examples ingested in their training data.
What seems to be somewhat working for me
1. Karpathy wiki approach
2. some prompting around telling the llm what to store and not.
But it still feels brittle. I don’t think it’s just a retrieval problem. In fact I feel like the retrieval is relatively easy.
It’s the write part, getting the agent to know what it should be memorizing, and how to store it.
And forcing it to always orient itself with that repo map first seemed to really keep it from tunnel-visioning.
er, nevermind. prob just crazy castles in the sky wistful dreams :-)
I've removed it.
"The Claw."
Some of this stuff is starting to look like technologies that worked, looked promising, but were at best marginally useful, such as magnetohydrodynamic generators, tokamaks, E-beam lithography, and Ovonics.
OpenClaw runs Pi in a terminal and exposes the chat through Telegram or any chat app. This gave non-coders the a-ha moment that coders had had for 6+ months prior.
Last I checked, it doesn't!
The killer usecase is letting you make whatever you want, instead of being at the mercy of what your OS/platform dictates.
Your idea of a killer idea is a whatsapp summarizer lol.
The problem is if not carefully designed it will burn through tokens like crazy.
It's a rather simple framework around an LLM, which actually was a brilliant idea for the world that didn't have it. It also came with its own wow effect, ("My agent messaged me!") so I consider some of the hype as justified.
But that's pretty much it. If you can imagine use cases that might involve emailing an LLM agent and get responses that share context with other channels and resources of yours, or having the ability to configure scheduled/event-based agent runs, you could get some use out of having an Openclaw setup somewhere.
I find the people who push insanity like "It came alive and started making money for me" and the people who label it utterly, completely useless (because it has the same shortcomings as every other LLM-based product) like Mr. "I've Seen Things. Here's the Clickbait" here, rather similar. It's actually hard to believe they know what they're talking about or that they believe what they're writing.
I know that headlines are all about eyeballs, but this is seriously just exhausting. Headlines are advertisements and advertisements are about getting engagement. Surely having your audience just getting angry at them isn’t a good thing, right?
The author makes some good conclusions; I’m as AI-pilled as the next hopefully-not-soon-to-be-ex-software-engineer, and I struggled to find use cases for my Claw that couldn’t be served with a cronjob and $harness.
If your findings contradict that, we are all ears - genuinely.
The killer thing was remote control, but that's in Claude now. In my opinion, Claw has no reason to exist anymore.
I tried it, didn’t like it. It gave me the ick with the communication channel.
Sure, anything it does can be done better with specialized tooling. If you know that tooling.
The memory thing sounds like an implementation limit rather than something fundamentally unsolvable. Just experiment with different ways of organizing state until something works?
They can automate but they are not reliable. I think of them as work and process augmentation tools but this is not how most customers think in my experience.
However, here are several legit use cases we run internally that I can freely discuss.
There is an experimental single-server dev infrastructure we are working on that is slightly flaky. We deployed a lightweight agent in Go (a single 6MB binary) that connects to our customer-facing API (we have our own agentic platform), where the real agent sits and can be reconfigured. The agent monitors the server for various health issues: anything from stalled VMs to unexpected errors. We use Firecracker VMs in a very particular way and don't yet know the full scope of the system. When such situations are detected, the agent automatically corrects the problems. It keeps a log of what it did in a reusable space (a resource type that we have) under a folder called "learnings". We use these files to correct the core issues when we have the time to work on the code.
We have an AI agent called Studio Bot. It lives in Slack and wakes up multiple times during the day. It analyses our current marketing efforts and, if it finds something useful, creates the graphics and posts to be sent out to several of our social media channels. A member of staff reviews these suggestions; most of the time they need a follow-up request or two to change things before finally pushing the result to Buffer. I also use the agent to generate branded cover images for LinkedIn, X, and Reddit articles in various aspect ratios. It is a very useful tool that produces graphics with our brand colours and aesthetics, but it is not perfect.
We have a customer support agent that monitors how well we handle support requests in Zendesk. It does not automatically engage with customers. What it does is supervise the backlog of support tickets and chase the team when we fall behind, which happens.
We have quite a few more scattered in various places. Some of them are even public.
In my mind, the trick is to think of AI agents as augmentation tools. In other words, instead of asking how to take yourself out of the equation, the better question is how to improve the situation. Sometimes just providing more contextually relevant information is more than enough. Sometimes you need a simple helper that owns a certain part of the business.
I hope this helps.
Like many here, I am struggling to see a meaningful delta between OC and CC but fully willing to accept that my skepticism is misplaced. Basically, I am in "trying to care about OC" mode right now.
Until it gets there, it’ll remain a fringe product.