This implementation experiments with a biological approach by using the Ebbinghaus forgetting curve to manage context as a living substrate. Memories are assigned a "strength" score where each recall reinforces the data and flattens its decay curve (spaced repetition), while unused data eventually hits a threshold and is pruned.
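As a rough illustration of the idea, here is a minimal sketch of strength-based decay, assuming an exponential Ebbinghaus-style curve R = exp(-t/S) where S is a per-memory stability value. The names (`retention`, `should_prune`, the prune threshold of 0.05, the doubling boost on recall) are all assumptions for illustration, not the project's actual API.

```python
import math

PRUNE_THRESHOLD = 0.05  # assumed cutoff below which a memory is pruned

def retention(elapsed_seconds: float, stability: float) -> float:
    """Retrievability R = e^(-t/S); decays slower as stability grows."""
    return math.exp(-elapsed_seconds / stability)

def recall(memory: dict, now: float, boost: float = 2.0) -> dict:
    """Reinforce a memory on recall: reset the clock, raise stability.

    Raising stability flattens the decay curve, which is the
    spaced-repetition effect described above.
    """
    memory["stability"] *= boost
    memory["last_recall"] = now
    return memory

def should_prune(memory: dict, now: float) -> bool:
    r = retention(now - memory["last_recall"], memory["stability"])
    return r < PRUNE_THRESHOLD

# usage: a memory with one day of initial stability
m = {"stability": 86400.0, "last_recall": 0.0}
assert not should_prune(m, 3600)    # an hour later: still retained
assert should_prune(m, 864000)      # ten days idle: below threshold
recall(m, 172800)                   # a recall doubles its stability
```

The interesting design question is what the `boost` schedule should be; a fixed multiplier is the simplest choice, but spaced-repetition systems typically grow the interval based on how close to forgetting the item was at recall time.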
To solve the "logical neighbor" problem, where semantic search misses relevant but non-similar nodes, a graph layer sits on top of the vector store. Benchmarked against the LoCoMo dataset, this reached 52% Recall@5, nearly double the accuracy of stateless vector stores, while cutting token waste by roughly 84%.
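The graph layer can be pictured as a one-hop expansion over the vector hits: take the top-k semantic matches, then pull in nodes connected to them by explicit edges, even if those neighbors would never surface by similarity alone. This is a hypothetical sketch; the function and edge representation are assumptions, not the project's implementation.

```python
def expand_with_neighbors(vector_hits, edges, hop_limit=1):
    """Union vector-search hits with their graph neighbors.

    edges is a list of (src, dst) pairs; each hop adds the
    destinations reachable from the current frontier.
    """
    results = set(vector_hits)
    frontier = set(vector_hits)
    for _ in range(hop_limit):
        frontier = {dst for (src, dst) in edges if src in frontier} - results
        results |= frontier
    return results

# usage: "auth_bug" is semantically unlike "deploy_config",
# but an explicit edge links them, so both are retrieved.
hits = ["auth_bug"]
edges = [("auth_bug", "deploy_config"), ("deploy_config", "rollback_plan")]
assert expand_with_neighbors(hits, edges) == {"auth_bug", "deploy_config"}
```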
Built as a local-first MCP server using DuckDB, the hypothesis is that for agents handling long-running projects, "what to forget" is just as critical as "what to remember." I'd be interested to hear if others are exploring non-linear decay or similar biological constraints for context management.

Discussion (27 Comments) · Read Original on HackerNews
Seems like it could be useful, but I'm not sure yet.
I've stopped trying to achieve general "memory". I just ask the agent to thoroughly, but concisely, document each project. If it writes developer documentation and a development plan/roadmap, as though a person was going to have to get up to speed and start working on the project, it provides all the information the agent needs tomorrow or next week to pick up where we left off.
The agent is not my friend. I don't need it to remember my birthday or the nasty thing I said about React last week. I need it to document what anyone, agent or human, would need to know to get productive in a particular repo, with no previous knowledge of the project.
Good, concise, developer and user documentation and a plan with checklists solves every problem people seem to think "memory" will solve: It tells the agent what tech stack to use (we hashed it out in planning), it tells it what commands it needs to run and test the app, it covers the static analysis tools in use (which formalizes code style, etc. in a way a vague comment I made a month ago cannot), and it is cheap. Markdown files are the native tongue of agents. No MCP, no skills, no API needed. Just read the file. It works for any agent, any model, and any human just getting started with the project.
Basically, I think memory makes agents dumber and less useful. I want it to focus on the task at hand.
One problem I have is that CLAUDE.md files and skills now tend to get version-controlled within projects; I suspect they could get in the way sometimes.
There is already so much fatigue induced by these systems, adding another one willingly does sound crazy.
If we humans just did exactly what we did yesterday, what progress?
It's baked into the immutable constants of the universe for us; entropy, signal attenuation over distances... information breaks down over time.
Because of this, all human social statistics trend toward zero under intentional conservatism. Progress or collapse is all the universe affords; it doesn't seem interested in conservatism at all.
I tend to think developing with agents should look a lot like managing a human (for instance, I use feature-branch development with PRs and review them, even on my own projects that have no other devs and don't need a paper trail for security audits). So in theory I can get behind an issue-based process, but so far I haven't seen it done in a way that isn't just making busy work for agents.
https://github.com/Giancarlos/guardrails
Key things: I added a concept called "gates," which are tied to all tasks. They force the agent to satisfy arbitrary requirements, such as: ensure the project still runs/compiles, run all tests and ensure they pass, review existing tests critically and point out if they're not comprehensive enough, and finally, get human confirmation on the task. Until the human confirms, the agent just works on another task, and so on.
I didn't like that Beads was built on top of Git; I don't always work on git-friendly projects, and Beads kept getting messed up when I switched branches. So I made mine SQLite-based. I also made it so you can sync to GitHub issues, and sync pre-existing (and new) GitHub issues as guardrails tasks to be worked on; the agent will even leave a comment on GitHub when it grabs an issue, to let others know the work may be picked up.
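The gates idea above can be sketched as a small SQLite schema: each task carries a set of named checks, and the task can only complete once every check is marked passed. The schema and function names here are assumptions for illustration, not guardrails' actual implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT, done INTEGER DEFAULT 0);
CREATE TABLE gates (task_id INTEGER, name TEXT, passed INTEGER DEFAULT 0);
""")

def add_task(title, gate_names):
    """Create a task with its required gates (e.g. compiles, tests pass)."""
    cur = conn.execute("INSERT INTO tasks (title) VALUES (?)", (title,))
    task_id = cur.lastrowid
    conn.executemany("INSERT INTO gates (task_id, name) VALUES (?, ?)",
                     [(task_id, g) for g in gate_names])
    return task_id

def pass_gate(task_id, name):
    conn.execute("UPDATE gates SET passed = 1 WHERE task_id = ? AND name = ?",
                 (task_id, name))

def try_complete(task_id):
    """Mark the task done only if no gate remains unpassed."""
    open_gates = conn.execute(
        "SELECT COUNT(*) FROM gates WHERE task_id = ? AND passed = 0",
        (task_id,)).fetchone()[0]
    if open_gates == 0:
        conn.execute("UPDATE tasks SET done = 1 WHERE id = ?", (task_id,))
        return True
    return False  # agent should pick up another task in the meantime

# usage: task blocked until every gate, including human sign-off, passes
tid = add_task("fix login", ["compiles", "tests_pass", "human_confirmed"])
assert not try_complete(tid)
for g in ["compiles", "tests_pass", "human_confirmed"]:
    pass_gate(tid, g)
assert try_complete(tid)
```

Keeping the human-confirmation gate as just another row makes "blocked on review" indistinguishable from "blocked on tests" at the query level, which is what lets the agent simply scan for any other task with zero open gates.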
me: "Hi AI, can you debug this SQL Statement?"
ai: "Well, based on your passion for garden hoses and extensive research of refrigerators, I'm going to guess you really want to discuss that"
If we didn't have evidence that these things cause something like psychosis in some people, it'd seem innocent. But, since the sycophancy combines with the long-term relationships some people think they're having with matrix math to trigger serious mental health problems, it feels more sinister.
Anyway, having a long-term memory makes them dumber and more easily confused. I don't have any use for a dumb agent.
It's obviously the latter; a system that "remembers everything perfectly" is probably not optimal in most senses. Mortality is a property of both living and artificial systems, and forcing the same retention policy on new information and old information probably comes at the expense of lifespan or stability.
What I do now is preserve all my claude code conversations and set the context from there.
This allows me to curate memory and it’s been the best way so far.
The other comment is that spatial memory is probably a better trigger for memory, so if you're not tracking where the coding session starts, the folders it visits, etc., then you're not really providing a good associative footpath for the assistant to retrieve what's important for any given project.
You said it cuts token usage by 84%, but isn't that typical for any chunked RAG system?
And why did you specifically choose to test against the LoCoMo dataset, when it has a lot of known issues and is very easy to cheat on?
https://en.wikipedia.org/wiki/Forgetting_curve