Replacing a half-dead dual memory system with a single file-native one. Markdown is the source of truth; every index or database is a disposable cache. Boring first, benchmarked, staged, private, reversible.
Hermes Agent (MaxHermes / MiniMax, the Telegram self-improving-skills agent) is declined, evaluated 20+ times and not adopted. Its one good idea, an agent that writes its own reusable skills from completed work, survives as Phase 8: deferred to the very end, built file-native (Claudeception / UniM0cha patterns), and only after the boring core proves out on the benchmark. The product is rejected; the capability is parked at the back of the line where it cannot do damage.
Every database, index, or cache is a disposable, rebuildable copy. If a cache disappears we lose speed, never knowledge. This is the rule whose violation killed the Postgres brain (recall returned nothing, 0 graph links from 9,315 jobs, only 35 of 378 memories ever read). It stays at the top.
Called the architecture "a necessary correction" and validated the single-source-of-truth pivot. Key pushes, all folded in: do not build the vector index for 23 docs (grep + index wins per Karpathy's 50-to-500 rule); staging must be permanent, not a phase; mirror-on-write, not symlinks (Obsidian Sync breaks symlinks); and private financial figures must never be summarized into the synced wiki. Verified the 10,000-char injection cap, async hook support (2.1.0), and that vstash is real.
"Approve the architecture, reject the complexity of v1." Independently reached the same cuts as Gemini, and added the sharpest single fix: SessionStart runs before the prompt exists, so passive recall must split into SessionStart (stable context) plus UserPromptSubmit (prompt-aware grep). Also: a recall benchmark is the missing piece; use a scheduled fold worker, not async-from-Stop; make /memory-review a real command; and an explicit no-vector-until-benchmark-fails gate. All folded in.
"One of the few AI-memory plans I've read that doesn't immediately reach for a vector database and a graph and three frameworks before it has a single user. You killed a Postgres brain and replaced it with markdown files and grep. That's the right direction." The single-source-of-truth rule at the top is "the whole ballgame," and the no-vector-until-the-benchmark-fails gate is "exactly the discipline almost everyone skips. 23 docs. You can grep 23 docs faster than you can import the embedding library."
Where it still over-builds: Phase 4 is heavier than it needs to be. NDJSON capture, a launchd worker, a lock file, staging and rejected dirs, a review panel, all before a single fact has been auto-folded. "Collapse it: the Stop hook appends a candidate note to one _staging/inbox.md. That's it. The cron worker is solving a throughput problem you do not have." Also: drop session_context.py as a separate hook, let UserPromptSubmit do all the recall with a no-prompt-yet branch.
On safety: staging plus human review is correct, "and I'd go further. Keep the human in that loop indefinitely for anything load-bearing, not just the first 20. A wrong fact re-injected as trusted context is the worst failure mode in the whole system and it's silent."
Bottom line: "Ship Phases 1-3, gut half of Phase 4, and you'll have a memory system better than 95% of the ones with a Series A."
Simulated, channeling Karpathy's known positions on LLMs, context, and markdown knowledge bases. Not an actual review by him.
"90% of AI-memory plans die because they're built for the architecture diagram, not for the human at 7am who just wants their tool to remember stuff. This one gets it." Two things stood out: passive recall ("memory surfaces without you asking, no can-you-check-my-notes dance, that's the single biggest unlock for a busy operator") and /memory-review as a control panel ("you're not trusting a robot to silently rewrite your source of truth, you get a queue, you approve or kill. That's how you build trust with a non-technical user").
The worry: "Eight phases, launchd workers, NDJSON queues, lock files, Docker shims. For a solo operator this is a LOT of moving parts. My worry isn't the build, it's month three, when a launchd job silently dies and nobody notices the folds stopped." The fix he wants: a dead-simple health signal, one line in the daily note, "memory captured 4 things yesterday, 2 waiting for your review." If the system goes quiet, you should feel it.
On Obsidian: vault-as-truth is "100% correct," and the symlink-to-mirror fix is "a scar, not a theory." Borrow from kepano's obsidian-skills: "lean harder on aliases and frontmatter for recall routing before you ever think about vectors." Two adoption nudges: a 2-minute onboarding that shows one real recall happening live, and default /memory-review to a gentle morning-note nudge so reviewing becomes a habit.
Bottom line: "Boring-first, benchmarked, and reversible, this is the rare memory system a real operator will actually stick with. Build it."
Simulated, channeling Chase's practical Obsidian + Claude Code creator voice. Not an actual review by him.
Gemini and GPT validated the architecture. Karpathy and ChaseAI, from opposite ends (minimalist builder vs everyday operator), independently land on the same two changes: (1) collapse Phase 4, drop the NDJSON queue, launchd worker, and lock file in favor of appending candidates to one staging file you review when you review; and (2) add a health signal, one line in the daily note showing what memory captured and what is waiting, so a silently-dead worker can never strand folds the way the old system stranded everything. Both are easy folds into the plan.