Video summary

Every Claude Code Memory System Compared (So You Don't Have To)

Summary of technological concepts and memory systems (Claude Code)

The video compares multiple “Claude Code memory” approaches and frames them as different ways to answer the same core question:

When Claude Code gets a task, how does it pull the right context at the right time?

Each level mainly differs by:

  1. Where memory lives (file storage / databases / local vs cloud)
  2. How Claude retrieves it (automatic injection vs search vs verbatim RAG vs cross-tool retrieval)

The creator’s motivation is building an agentic operating system (“business brain”) that works reliably and scales without bloating context.


Level 1 — Native Claude Code memory (file-based, manual correctness)

Goal: Persist instructions and lightweight memory using what ships with Claude Code.

Key features

  • CLAUDE.md: Project-level markdown “rules” file loaded into every terminal session (acts like a system prompt).

    • Recommended: don’t stuff everything into it.
    • Warns about context rot (LLMs recall earlier-loaded context less reliably as total context grows).
    • Rule of thumb: keep CLAUDE.md under 200 lines.
    • For large docs (e.g., brand voice), store them in separate files that CLAUDE.md references rather than in one giant CLAUDE.md.
  • MEMORY.md + auto memory

    • Running /memory and opening the auto memory folder reveals a per-project index structure.
    • Instead of dumping everything into one file, Claude auto-generates:
      • an index (MEMORY.md)
      • separate memory docs (e.g., project feedback split into topic-specific files)
    • Note: if a project has no memories yet, Claude may not create MEMORY.md until there is some activity.
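
The 200-line rule of thumb is easy to check mechanically. The following is a minimal sketch, assuming the limit means "strictly under 200 lines"; the function names are illustrative and are not part of Claude Code.

```python
MAX_LINES = 200  # rule of thumb from the video, not a limit enforced by Claude Code


def count_lines(text: str) -> int:
    """Number of lines in a block of text."""
    return len(text.splitlines())


def is_within_budget(text: str, max_lines: int = MAX_LINES) -> bool:
    """True if the rules file is strictly under the line budget."""
    return count_lines(text) < max_lines


def report(name: str, text: str, max_lines: int = MAX_LINES) -> str:
    """One-line report suggesting when to move detail into referenced files."""
    n = count_lines(text)
    status = "ok" if n < max_lines else "over budget: move detail into referenced files"
    return f"{name}: {n} lines ({status})"
```

Running report("CLAUDE.md", text) on the file's contents before committing gives an early signal that large documents, such as a brand voice guide, belong in separate referenced files.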

Context about Anthropic roadmap

  • References a leaked/unreleased internal “Kairos” concept: an always-on background daemon that watches projects and consolidates memory over time (especially while you sleep) to reduce context rot.
  • Implies native memory systems should improve over time.

Level 2 — More reliable structure + auto-injection via session hooks

Goal: Improve reliability of feeding memory at the right moment and keep file structure from becoming too monolithic.

Main contribution: structured persistent memory management

  • Builds on a community concept (John Connolly) that organizes memory into a structured layout and injects it into sessions automatically.
  • Describes a folder/file hierarchy including:
    • global general.md (cross-project facts/preferences/setup)
    • topic/domain files (one per domain/topic)
    • tool configs (one file per tool, e.g. slack.md)
    • indexing via memory.md pointing to those files

Automation mechanism

  • Uses a Claude Code session start hook:
    • Instead of relying on Claude to read the memory files itself, the hook injects the memory index automatically, before any tool calls (e.g., via scripts like pre-tool memory.sh).
  • Emphasizes teammate sharing:
    • Sync domain/tool memory files in a shared folder so multiple teammates can contribute to and use the same structured knowledge.
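
The injection step can be sketched as a small hook script. This is a minimal sketch, not the script from the video: it assumes the hook returns context through the hookSpecificOutput.additionalContext JSON field described in the Claude Code hooks documentation (verify against your version), and the function name and file path are hypothetical.

```python
import json


def build_session_context(index_text: str) -> str:
    """Wrap a memory index in the JSON a SessionStart hook can print to stdout."""
    payload = {
        "hookSpecificOutput": {
            "hookEventName": "SessionStart",
            # Text placed here is added to the session's context.
            "additionalContext": "Memory index:\n" + index_text.strip(),
        }
    }
    return json.dumps(payload)


# In a real hook script you would read the index from disk and print the result:
#   print(build_session_context(Path("memory/memory.md").read_text()))
# The path above is hypothetical; point it at wherever your index lives.
```

Only the index is injected, not every memory file, which keeps the starting context small while still telling Claude where the detailed files are.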

“Reorganize memory” maintenance workflow

  • Demonstrates running prompts in plan mode to:
    • detect duplicates/outdated entries
    • delete empty/dummy memory files
    • trim stale sessions
    • resolve stale/unfinished threads
    • add cross-references and reorganize indexes
  • Claims measurable cleanliness improvements (e.g., deleted empty files, trimmed sessions, linked open threads, updated indexes).
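
In the video this cleanup is done by prompting Claude in plan mode. Two of the checks (empty files and duplicate entries) are deterministic enough to sketch in code. The following is an illustrative sketch under that assumption; the function names and the normalization rule (strip bullet markers, lowercase) are my own choices.

```python
def find_empty_files(files: dict[str, str]) -> list[str]:
    """Names of memory files that contain only whitespace."""
    return sorted(name for name, text in files.items() if not text.strip())


def find_duplicate_entries(files: dict[str, str]) -> dict[str, list[str]]:
    """Map each entry that appears more than once to the files containing it."""
    seen: dict[str, list[str]] = {}
    for name in sorted(files):
        for line in files[name].splitlines():
            # Normalize: drop bullet markers and case so near-identical lines match.
            entry = line.strip().lstrip("-*• ").strip().lower()
            if entry:
                seen.setdefault(entry, []).append(name)
    return {entry: names for entry, names in seen.items() if len(names) > 1}
```

Judgment calls such as deciding which entries are outdated or which threads are stale still need the model or a human; this only covers the mechanical part.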

Scaling warning (why you’d move to Level 3)

  • As memory accumulates over months across multiple projects/clients, the keyword/topic summaries inside large index files (like general.md) can:
    • become inefficient to scan with keyword search
    • retrieve less reliably (search “falls apart”).

Level 3 — Semantic retrieval with MemSearch (OpenClaw-style memory + vector search)

Goal: Fix retrieval scaling using semantic search and automatic injection of relevant matches.

Core framework: MemSearch

  • Ports the memory architecture of OpenClaw (a standalone agent) to Claude Code as a plugin.
  • Credited to Zilliz (the team behind the Milvus vector database).

File structure and time horizons

  • Keeps a markdown-first philosophy and an OpenClaw-like layout:
    1. memory.md: durable long-term facts/preferences (loaded every session)
    2. daily notes: one file per date (recent context auto-loaded; older notes not loaded into context)
    3. optional “dreaming” concept:
      • a background process reads daily notes
      • scores repeated items
      • promotes recurring info to memory.md
      • forgets stale stuff so context doesn’t blow up
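
The “dreaming” step above can be sketched as a scoring pass over daily notes. This is a minimal sketch under stated assumptions: the score is simply the number of distinct days an item appears on, the thresholds are arbitrary, and the function names are hypothetical rather than taken from any plugin.

```python
from collections import Counter


def promote_recurring(daily_notes: dict[str, list[str]],
                      long_term: list[str],
                      min_days: int = 3) -> list[str]:
    """Return long_term plus any item seen on at least min_days distinct days."""
    days_seen: Counter = Counter()
    for items in daily_notes.values():
        # A set so an item repeated within one day only counts once.
        days_seen.update({item.strip().lower() for item in items})
    known = {item.strip().lower() for item in long_term}
    promoted = sorted(item for item, n in days_seen.items()
                      if n >= min_days and item not in known)
    return long_term + promoted


def drop_stale(daily_notes: dict[str, list[str]], keep_days: int = 7) -> dict[str, list[str]]:
    """Keep only the most recent keep_days notes (keys are ISO dates, so they sort by time)."""
    recent = sorted(daily_notes)[-keep_days:]
    return {day: daily_notes[day] for day in recent}
```

A real implementation would likely match items semantically rather than by exact text, but the promote-then-forget shape is the same.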

Retrieval + injection mechanism

  • Documents are split into chunks, and each chunk is embedded as a vector for semantic search.
  • Uses a hook on user prompt submit:
    • automatically injects the top matches (e.g., top 3) into the prompt context
    • avoids requiring the user to manually ask Claude to “search the notes”
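
The retrieve-then-inject flow can be sketched end to end. This sketch is not MemSearch's implementation: to stay dependency-free it uses a toy bag-of-words vector in place of a real embedding model, so it matches on shared words rather than meaning. The ranking and top-k injection logic is the part being illustrated, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_matches(prompt: str, chunks: list[str], k: int = 3) -> list[str]:
    """Return the k chunks most similar to the prompt, best first."""
    query = embed(prompt)
    return sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)[:k]


def inject(prompt: str, chunks: list[str], k: int = 3) -> str:
    """Prepend the top matches to the prompt, as a prompt-submit hook would."""
    context = "\n".join(f"- {c}" for c in top_matches(prompt, chunks, k))
    return f"Relevant memory:\n{context}\n\nUser prompt: {prompt}"
```

Swapping embed for a real embedding model turns this from word overlap into semantic search without changing the rest of the flow.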

Installation / verification steps (tutorial elements)

  • Installation via the Claude Code plugin marketplace (two-line install).
  • After enabling, run /reload-plugins.
  • Verification includes:
    • checking the created memory files (e.g., a directory like .memsearch/memoryfiles)
    • running the memory recall/search skill (or an equivalent slash command such as memory recall)
    • starting conversations to seed daily note files.

Contrast with another plugin: “Claude Mem”

  • Mentions Claude Mem as an alternative plugin that captures/compresses what Claude does and re-injects it later.
  • Differences:
    • MemSearch uses markdown + injection from a local readable structure.
    • Claude Mem uses MCP tools, requiring Claude to actively call tools for retrieval.
    • Claude Mem is positioned as more feature-heavy (dashboards, collaboration, cost tracking, privacy labels), but potentially overkill.

Level 4 — Verbatim, high-accuracy conversation recall with Mem Palace (local RAG)

Goal: Retrieve the exact wording from earlier decisions/conversations.

Key properties

  • Presented as “proper” RAG; the project’s website claims the highest benchmark score.
  • Stores conversations locally and targets word-for-word recall:
    • avoids summarization loss because the verbatim content is stored

How it works (storage model)

  • Uses a “memory palace” metaphor:
    • a nested structure of wings, rooms, and drawers
    • a symbolic indexing language, AAAK, that lets an LLM scan many drawers quickly
  • Uses two database types:
    • SQL database for entities/relationships
    • Chroma vector DB for searchable chunks (conversation data)
  • Background hooks (on session end / compaction) silently file and index memories.
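
The relational half of this storage model can be sketched with SQLite from the standard library. The schema below is hypothetical and is not Mem Palace's actual schema; it only illustrates storing verbatim text in “drawers” and linking entities back to them. The vector-search half (Chroma) is omitted.

```python
import sqlite3


def open_palace() -> sqlite3.Connection:
    """Create an in-memory store with verbatim drawers and entity relationships."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE drawers (
            id INTEGER PRIMARY KEY,
            wing TEXT NOT NULL,      -- e.g. a client or project
            room TEXT NOT NULL,      -- e.g. a topic within it
            verbatim TEXT NOT NULL   -- exact wording, never summarized
        );
        CREATE TABLE relationships (
            subject TEXT NOT NULL,
            predicate TEXT NOT NULL,
            object TEXT NOT NULL,
            drawer_id INTEGER REFERENCES drawers(id)
        );
    """)
    return conn


def file_memory(conn, wing, room, verbatim, triples=()):
    """Store one verbatim snippet plus any (subject, predicate, object) facts about it."""
    cur = conn.execute(
        "INSERT INTO drawers (wing, room, verbatim) VALUES (?, ?, ?)",
        (wing, room, verbatim),
    )
    drawer_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO relationships VALUES (?, ?, ?, ?)",
        [(s, p, o, drawer_id) for s, p, o in triples],
    )
    return drawer_id


def recall(conn, subject):
    """Return the exact stored wording of every drawer linked to a subject."""
    rows = conn.execute(
        "SELECT d.verbatim FROM relationships r "
        "JOIN drawers d ON d.id = r.drawer_id "
        "WHERE r.subject = ? ORDER BY d.id",
        (subject,),
    )
    return [row[0] for row in rows]
```

Because recall returns the stored text unchanged, nothing is lost to summarization, which is the property this level is built around.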

Installation/tutorial elements

  • One-command install + initialize.
  • Registers hooks in settings.json.
  • Can “mine” older sessions to backfill the palace.
  • Downside: stored content may not be directly readable as markdown, though it can be retrieved quickly.

Level 5 — Build interconnected “second brain” / living wiki (LLM Wiki / Obsidian workflows)

Goal: Support deep research and long-lived, interconnected knowledge that goes beyond operational conversation recall.

Primary recommendation: Karpathy-style LLM Wiki

  • Cites Andrej Karpathy’s LLM Wiki concept.
  • Workflow:
    • raw/: drop source documents (articles, transcripts, PDFs, etc.)
    • wiki/: Claude writes and maintains the living wiki with cross-links
    • output is plain markdown, no external vector DB required (as described)
  • Mentions an Obsidian knowledge graph experience as the visualization layer.
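
Because the wiki is plain markdown, its cross-links can be checked with a few lines of code. The sketch below assumes Obsidian-style [[Page]] and [[Page|alias]] links, which the source does not specify; the function names are illustrative.

```python
import re
from collections import defaultdict

# Captures the target of [[Target]], [[Target|alias]] or [[Target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def outgoing_links(markdown: str) -> set[str]:
    """Page titles that a markdown body links to."""
    return {target.strip() for target in WIKILINK.findall(markdown)}


def backlinks(pages: dict[str, str]) -> dict[str, set[str]]:
    """Map each linked title to the set of pages that link to it."""
    result: dict[str, set[str]] = defaultdict(set)
    for title, body in pages.items():
        for target in outgoing_links(body):
            result[target].add(title)
    return dict(result)


def missing_pages(pages: dict[str, str]) -> list[str]:
    """Titles that are linked to but do not exist yet in wiki/."""
    return sorted(set(backlinks(pages)) - set(pages))
```

The list from missing_pages is a natural to-do list to hand back to Claude when asking it to extend the wiki.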

When it’s worth it

  • Best for topics you revisit and want deep cross-referenced research on.
  • The video’s creator notes it may not match their “business OS / operational memory” retrieval needs as well as other levels.

Alternative: Recall (hosted)

  • “Recall” is a hosted version of LLM Wiki with a browser extension.
  • Downsides highlighted:
    • ownership concerns (data stays on their servers)
    • more about content consumption than “what did we decide operationally?”
    • pricing considerations

Mention: LightRAG

  • Described as enterprise/heavier knowledge-graph retrieval, likely overkill for most business owners.
  • Positioned as research/building a KB rather than operational memory.

Level 6 — Cross-tool shared memory via a portable external brain (Open Brain / Postgres)

Goal: Make memory shared across multiple AI tools (ChatGPT, Claude Code, Cursor, etc.) in real time.

Primary system: Open Brain (Nate Jones)

  • Intent:
    • memory stored in user-owned Postgres
    • portability: connect new AI tools later to the same DB
  • Setup described:
    • one table: thoughts with text chunks, embeddings, tags, timestamps
    • semantic search via Postgres extensions
    • an MCP server as the front door for tools to query memory, running on hosted edge functions (Supabase, as described)
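
The single-table design can be sketched locally. This is a stand-in, not the Open Brain setup: it uses an in-memory SQLite database instead of Postgres, stores embeddings as JSON text, and ranks by cosine similarity in Python rather than with a Postgres extension. Column and function names are assumptions.

```python
import json
import math
import sqlite3
from datetime import datetime, timezone


def open_brain() -> sqlite3.Connection:
    """Create the one-table store: text, embedding, tags, timestamp."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE thoughts (id INTEGER PRIMARY KEY, content TEXT NOT NULL, "
        "embedding TEXT NOT NULL, tags TEXT NOT NULL, created_at TEXT NOT NULL)"
    )
    return conn


def add_thought(conn, content, embedding, tags):
    """Insert one thought; embedding is a list of floats from any embedding model."""
    conn.execute(
        "INSERT INTO thoughts (content, embedding, tags, created_at) VALUES (?, ?, ?, ?)",
        (content, json.dumps(embedding), json.dumps(tags),
         datetime.now(timezone.utc).isoformat()),
    )


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.hypot(*a) * math.hypot(*b)
    return dot / norm if norm else 0.0


def search(conn, query_embedding, k=3):
    """Return the k thoughts whose embeddings are closest to the query."""
    rows = conn.execute("SELECT content, embedding FROM thoughts").fetchall()
    scored = [(cosine(query_embedding, json.loads(e)), c) for c, e in rows]
    return [c for _, c in sorted(scored, key=lambda s: s[0], reverse=True)[:k]]
```

In the described setup, the equivalent of search would sit behind the MCP server, so any connected tool queries the same table.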

Tradeoffs

  • Setup is more complex and slower to understand than earlier local solutions.
  • Cost/latency considerations:
    • external DB queries add latency
    • hosted service costs (figures mentioned: roughly 10–30 cents/month on the free tier; Open Brain is also described as costing “less than a dollar/month”)

Tutorial/next steps

  • Points to a setup guide and Nate’s walkthrough.
  • Mentions companion prompts for migrating from existing Claude memory systems.

Alternative: Mem0

  • Cross-tool memory layer used by many developers.
  • Positives: quick setup (<1 minute), production-friendly.
  • Downsides:
    • memory hosted on their servers (ownership/control concerns)
    • less aligned with “own and export everything” portability goals

Practical “which level to pick” guidance (decision summary)

  • Start: Level 1 (CLAUDE.md + MEMORY.md) for immediate improvement (~10 minutes).
  • After some usage: Level 2 (John Connolly’s hook-based structure) is often enough.
  • If context grows for months and recall/search fails: Level 3 (MemSearch) or Level 4 (Mem Palace for verbatim).
  • For deep research / linking knowledge: Level 5 (LLM Wiki / Obsidian).
  • For cross-tool, portable operational memory: Level 6 (Open Brain / Postgres).
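
The guidance above can be read as a small decision function. This is only a sketch of that reading: the flag names are hypothetical, and the priority order (cross-tool first, then research, then retrieval problems) is an assumption, since the video does not rank the criteria.

```python
def pick_level(*, cross_tool: bool = False, deep_research: bool = False,
               search_failing: bool = False, verbatim: bool = False,
               used_a_while: bool = False) -> int:
    """Return the suggested memory level (1-6) for a set of needs."""
    if cross_tool:
        return 6  # portable memory shared across AI tools
    if deep_research:
        return 5  # interlinked wiki for topics you revisit
    if search_failing:
        # Months of context and keyword search no longer finds things.
        return 4 if verbatim else 3
    if used_a_while:
        return 2  # structured files plus hook-based injection
    return 1      # native memory files
```

For example, pick_level(search_failing=True, verbatim=True) suggests Level 4, matching the advice to use Mem Palace when exact wording matters.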

Compatibility / stacking

  • Creator claims many levels can stack together:
    • Levels 1, 2, and 3 are compatible: they can share a similar folder structure, and all of them integrate with Claude Code.

Main speakers / sources

  • Main speaker: the video author/creator (host of “Every Claude Code Memory System Compared…”)
  • Referenced sources/frameworks:
    • John Connolly (memory structure + prompt/hook concept)
    • Pavel Huryn (concept adapted/credited via Connolly’s write-up)
    • OpenClaw (inspiration for Level 3 memory architecture)
    • Zilliz (MemSearch plugin / vector-search background)
    • Mem Palace (framework for local verbatim retrieval)
    • Andrej Karpathy (LLM Wiki concept referenced for Level 5)
    • Nate Jones (Open Brain described for Level 6)
    • Mem0 (alternative cross-tool memory layer)
    • Anthropic (mentioned regarding native memory improvements and the leaked “Kairos” idea)
