Discussion (76 Comments, originally posted on Hacker News)
I’ve found the latter approach to work much, much better than simple “store”/“remember” systems.
So, it just feels misleading to say this can do what Claude.ai’s can do…
(I’ve been looking for a memory system that works the same for a while, so that I can switch away from Claude.ai to something else like LibreChat, but I just haven’t found any. Might be the only thing keeping me on Claude at this point.)
-
*I say Claude.ai because that’s specifically what has the system; Claude Code doesn’t have this system
i.e.: "$recall words"
It works, but it's clunky.
Digging deeper, I can see it is effectively pgvector plus MCP with two functions: "recall" and "remember".
It is effectively a RAG.
You can make the argument that perhaps the data structure matters, but all of these "memory" systems effectively do the same thing, and none of them has so far proven that retrieval is improved compared to a baseline vector DB search.
In a way, even if it does accomplish that, it is a glorified vector DB.
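To make the "two functions over a vector store" shape concrete, here is a minimal sketch. It is not the project's code: `MemoryStore`, `embed`, and `cosine` are hypothetical names, the store is an in-memory list rather than a Postgres table, and the "embedding" is a toy bag-of-words count standing in for a real embedding model.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class MemoryStore:
    """Hypothetical in-memory replacement for the vector table."""

    def __init__(self):
        self._rows = []  # list of (text, vector) pairs

    def remember(self, text):
        """Store a piece of text together with its vector."""
        self._rows.append((text, embed(text)))

    def recall(self, query, k=3):
        """Return up to k stored texts, most similar to the query first."""
        query_vec = embed(query)
        scored = [(cosine(query_vec, vec), text) for text, vec in self._rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for score, text in scored[:k] if score > 0]


store = MemoryStore()
store.remember("The billing service uses Postgres 15")
store.remember("We rejected Redis for the job queue")
store.remember("Deploys happen every Tuesday")
top = store.recall("which database does billing use", k=1)
```

Everything a system like this adds beyond plain similarity search has to live in how `remember` decides what to store and how `recall` ranks it, which is the part the comment says nobody has benchmarked against the baseline.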
Together with the other hundred LLM memory systems: https://zby.github.io/commonplace/agent-memory-systems/
I have also written a wishlist for these systems: https://zby.github.io/commonplace/notes/designing-agent-memo...
When we create anything, what we ended up not doing is often more important than what we did end up doing. My utility runs at the end of the session and captures all the alternatives we rejected, and the associated rationales, and stores that as system knowledge.
Basically, I want to capture all these things that my coworkers know, but that I can't just grep the code for. So far it's worked well, but it's still early.
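A rough sketch of what one such record might look like, purely as an illustration. The commenter does not describe their format, so `DecisionRecord`, `RejectedAlternative`, and the JSON Lines output are all assumptions, and the example decision is made up.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class RejectedAlternative:
    option: str      # what was considered
    rationale: str   # why it was not chosen


@dataclass
class DecisionRecord:
    topic: str
    chosen: str
    rejected: list = field(default_factory=list)

    def reject(self, option, rationale):
        """Record an alternative that was considered and dropped."""
        self.rejected.append(RejectedAlternative(option, rationale))


def to_jsonl(records):
    """Serialize records one per line so they can be appended to a log file."""
    return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in records)


# Made-up example of an end-of-session entry.
record = DecisionRecord(topic="job queue", chosen="Postgres table")
record.reject("Redis", "one more service to operate")
line = to_jsonl([record])
```

Keeping the rejected options next to the chosen one is what makes this different from a changelog: the record answers "why not X?", which the code itself cannot.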
Not casting aspersions on you personally, I’d really like this from every project, and would do the same myself.
Or are you all suggesting we should be comfortable with, and never question, flowery unchallenged advertising copy?
I doubt many people will honestly admit that they did no design or testing and that they believe the code is subpar.
It does give me an idea that maybe we need a third party system which can try and answer some of the questions you are asking… of course it too would be LLM driven and quite subjective.
I'd doubt any engineer who doesn't call most of their own code subpar when looking back at it after a week or two. "Hacking" also famously involves little design or (automated) testing, so sharing something like that doesn't mean much, unless you're trying to launch a business, but I see no evidence of that for this project.
Well no, but if people want to see a statement like this, and given that most people will want to be at least halfway honest and not admit to slop, maybe it will help nudge things in the right direction.
If you care that much and don't have a foundation of trust, you need to either verify the construction is good, or build it yourself. Anything else is just wishful thinking.
We even ask whether cakes are made in-house or frozen, even though they look and taste great (at first).
The only approach I've found that works is no memory, and manually choosing the context that matters for a given agent session/prompt.
A friend told me he would like Claude to remember his personality, which is exactly what Gemini is trying to do.
A machine pretending to be human is disturbing enough. A machine pretending to understand you will spiral very far into spitting out exactly what we want to read.
1) An up-to-date detailed functional specification.
2) A codebase structured and organized in multiple projects.
3) Well documented code including good naming conventions; each class, variable or function name should clearly state what its purpose is, no matter how long and silly the name is. These naming conventions are part of a coding guidelines section in Agent.md.
My functional specification acts as the Project.md for the agent.
Then, before each agentic code review, I create a tree of my project directory, merge it with the codebase into one single file, and add the timestamp to the file name. This last bit seems to matter to keep the LLM from referring to older versions, and it's also useful for doing quick diffs without sending the agent to git.
So far this simple workflow has been working very well in a fairly large and complex codebase.
Not very efficient token-wise, but it just works.
By the way, I don't need to merge the entire codebase every time; I may decide to leave projects out because I consider them done and tested, or irrelevant to the area I want to be working on.
However I do include them in the printed directory tree so the agent at least knows about them and could request seeing a particular file if it needs to.
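The workflow described above can be sketched roughly as follows. This is a guess at the mechanics, not the commenter's script: `directory_tree`, `build_snapshot`, the `snapshot-` file name, and the suffix filter are all hypothetical, and "project" is assumed to mean a top-level directory under the repository root.

```python
from datetime import datetime
from pathlib import Path


def directory_tree(root):
    """Return an indented listing of every file and directory under root."""
    root = Path(root).resolve()
    lines = [root.name + "/"]
    for path in sorted(root.rglob("*")):
        depth = len(path.relative_to(root).parts)
        suffix = "/" if path.is_dir() else ""
        lines.append("  " * depth + path.name + suffix)
    return "\n".join(lines)


def build_snapshot(root, include_projects, out_dir, suffixes=(".py", ".md")):
    """Write the full tree plus the contents of selected projects to one file.

    The tree always lists every project; file contents are merged only for
    the top-level directories named in include_projects.
    """
    root = Path(root).resolve()
    parts = ["# Directory tree", directory_tree(root), ""]
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if not path.is_file() or path.suffix not in suffixes:
            continue
        if rel.parts[0] not in include_projects:
            continue  # left out of the merge, but still visible in the tree
        parts.append(f"# File: {rel.as_posix()}")
        parts.append(path.read_text(encoding="utf-8"))
        parts.append("")
    # Timestamp in the file name so older snapshots are easy to tell apart.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    out_path = Path(out_dir) / f"snapshot-{stamp}.txt"
    out_path.write_text("\n".join(parts), encoding="utf-8")
    return out_path
```

A call such as `build_snapshot("repo", {"core", "api"}, "snapshots")` would merge only the `core` and `api` projects while still listing the others in the tree. The output directory should sit outside `root`, otherwise earlier snapshots show up in later trees.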
In practice, as it grows it gets just as messy as not having it.
In the example you have on the front page you say "continue working on my project", but you're rarely working on just one project; you might want to have 5 or 10 in memory, each of which made sense to have at the time.
So now you still have to say "continue working on the sass project". Sure, there's some context around details, but you pay for it by filling up your LLM context and doing extra MCP calls.
If I am working on a real project with real people, it won’t have the complete memory of the project. I won’t have the complete memory. My memory will be outdated when other PRs are merged. I only care about my tickets.
I am starting to think this is not meant for that kind of work.
There is lots of competition in this space, how is your tool different?
[0] Ziva.sh is a desktop app that brings agentic features to game engines. We can't just bundle a running DB and we won't be sending this sensitive info to a cloud
How does it fight context pollution?
I keep two files in each project - AGENTS (generic) and PROJECT (duh). All the “memory” is manually curated in PROJECT, no messy consolidation, no Russian roulette.
I do understand that this is different because of the vector search and selective unstash, but the messy consolidation risk remains.
Also not sure about tools that further detach us from the driver seat. To me, this seems to encourage vibe coding instead of engineering-plus-execution.
Not a criticism of the product itself, just rambling.
AI is the future, so we need cursors of the future that simulate the frustrating lag and imprecision of LLMs. Dots chase other little dots around and do inscrutable little animations.
Actual answer: You need javascript to see their dumb custom cursor.
How many are we up to now? Has to be hundreds of them.