
claude-mem

A persistent-memory compression plugin for Claude Code: captures tool traces, creates searchable semantic summaries, and injects context across sessions.
29.7k stars · TypeScript · GNU Affero General Public License v3.0
#typescript #sqlite #chromadb #persistent-memory #semantic-search #mcp #progressive-disclosure #memory-compression #claude-code-plugin #cursor-workflows #alternative-to-memorymd #developer-observability

What is it?

claude-mem treats the development session itself as a durable, searchable asset. It automatically captures Claude’s tool usage and key observations, compresses noisy traces into structured memory, and injects the right context when the next session starts. The design is hook-driven: turn lifecycle events into a stable event stream, persist it, then rebuild high-signal context through indexing plus semantic retrieval, while progressive disclosure keeps token cost predictable. By default it persists to SQLite, adds full-text search, and can optionally sync embeddings to Chroma for hybrid retrieval; a local web viewer helps you inspect the memory stream in real time. For teams, this converts scattered “what happened last week” knowledge into auditable engineering memory that survives reconnects and context resets.
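As a rough sketch of that capture-and-compress loop, the TypeScript below models tool events and reduces a session to a compact summary. The interfaces and field names (`ToolEvent`, `SessionSummary`, `compressSession`) are illustrative assumptions, not claude-mem's actual schema:

```typescript
// Illustrative model of compressing a session's tool events into a
// high-signal summary record. Not claude-mem's real schema.
interface ToolEvent {
  sessionId: string;
  tool: string;    // e.g. "Read", "Edit", "Bash"
  target: string;  // file path or command
  timestamp: number;
}

interface SessionSummary {
  sessionId: string;
  toolCounts: Record<string, number>;
  touchedTargets: string[];
}

function compressSession(events: ToolEvent[]): SessionSummary {
  const toolCounts: Record<string, number> = {};
  const targets = new Set<string>();
  for (const e of events) {
    // Tally tool usage and deduplicate the touched files/commands.
    toolCounts[e.tool] = (toolCounts[e.tool] ?? 0) + 1;
    targets.add(e.target);
  }
  return {
    sessionId: events[0]?.sessionId ?? "unknown",
    toolCounts,
    touchedTargets: Array.from(targets),
  };
}
```

In the real plugin, the compression step is AI-driven and the records land in SQLite; the point here is only the shape of the transformation from noisy traces to structured memory.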

Pain Points vs Innovation

✕ Traditional Pain Points → ✓ Innovative Solutions

  • Pain: Across reconnects and multi-day work, critical context gets scattered across terminal output, issues, and personal notes, which is low-signal and hard to replay.
    Solution: claude-mem captures lifecycle tool events reliably, compresses them into structured memory, and persists them, so memory becomes queryable data instead of fragile chat residue.
  • Pain: A single memory file is either too short to keep details or too long to load safely every session, and without indexing it is painful to retrieve by symptom or task.
    Solution: Hybrid retrieval combines full-text search with optional vectors, and progressive disclosure pulls only what fits the token budget; a local viewer makes the memory stream observable for tuning and debugging.
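The progressive-disclosure idea can be sketched as a toy TypeScript function (an assumption-level model, not claude-mem's implementation): memory is offered in priority-ordered layers, and injection stops before the token budget overflows.

```typescript
// Toy model of progressive disclosure under a token budget.
// Layer names and shapes are illustrative, not claude-mem's API.
interface MemoryLayer {
  label: string;  // e.g. "headline", "session summary", "raw detail"
  text: string;
  tokens: number;
}

function discloseWithinBudget(layers: MemoryLayer[], budget: number): string[] {
  // Layers are assumed pre-sorted from most to least important.
  const chosen: string[] = [];
  let used = 0;
  for (const layer of layers) {
    if (used + layer.tokens > budget) break;  // stop before overflowing
    chosen.push(layer.text);
    used += layer.tokens;
  }
  return chosen;
}
```

The key property is that cost is bounded up front: the model sees the densest layers first and only pays for deeper detail when the budget allows.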

Architecture Deep Dive

Hook pipeline: turning sessions into an event stream
claude-mem is event-driven at its core: Claude Code lifecycle milestones and tool invocations are treated as discrete events captured by plugin hooks. This matters because it converts an ephemeral interaction into a persistent event stream that can be audited, replayed, and processed consistently. Events are written to storage first and processed asynchronously, so AI compression latency does not block the interactive coding loop. At injection time, the system prefers high-signal summaries over raw logs, keeping the context window dense and stable across sessions.
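A minimal sketch of that write-first, process-later ordering, using a hypothetical `EventPipeline` class (the real plugin persists to SQLite and compresses with AI; this only shows the control flow):

```typescript
// Sketch: persist events on the fast path, compress out-of-band,
// so summarization never blocks the interactive coding loop.
type RawEvent = { id: number; payload: string };

class EventPipeline {
  private log: RawEvent[] = [];      // stands in for the durable event table
  private summaries: string[] = [];
  private nextId = 1;

  capture(payload: string): number {
    // Fast path: append to the log and return immediately.
    const id = this.nextId++;
    this.log.push({ id, payload });
    return id;
  }

  flush(): string[] {
    // Slow path, run out-of-band (e.g. on a timer or at session end):
    // drain pending events and turn them into compact summaries.
    for (const e of this.log.splice(0)) {
      this.summaries.push(`event ${e.id}: ${e.payload}`);
    }
    return this.summaries;
  }
}
```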
Dual indexing: FTS + vectors for hybrid retrieval
claude-mem separates retrieval into two complementary paths: exact keyword lookup and semantic similarity search. SQLite full-text search excels at finding commands, filenames, stack traces, and explicit keywords, while optional vectors enable intent-based recall when phrasing changes. Hybrid retrieval works because summaries increase signal density before indexing, which reduces noise in both FTS and embeddings. Progressive disclosure then returns context in layers so the model can expand scope only when needed under a token budget, and the viewer/search skills keep this pipeline observable instead of opaque.
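To make the hybrid idea concrete, here is a toy TypeScript merge of a keyword score (standing in for SQLite FTS) and a cosine-similarity score (standing in for Chroma embeddings). The scoring functions and the `alpha` blend are illustrative assumptions, not claude-mem's actual ranking:

```typescript
// Toy hybrid retrieval: blend exact keyword hits with semantic similarity.
interface Doc { id: string; text: string; embedding: number[]; }

function keywordScore(query: string, doc: Doc): number {
  // Fraction of query terms that appear verbatim (FTS stand-in).
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  const text = doc.text.toLowerCase();
  return terms.filter((t) => text.includes(t)).length / (terms.length || 1);
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]; na += a[i] ** 2; nb += b[i] ** 2;
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

function hybridSearch(query: string, queryVec: number[], docs: Doc[], alpha = 0.5): Doc[] {
  // Sort descending by the blended score.
  const score = (d: Doc) =>
    alpha * keywordScore(query, d) + (1 - alpha) * cosine(queryVec, d.embedding);
  return [...docs].sort((x, y) => score(y) - score(x));
}
```

The practical takeaway matches the text above: keyword scoring wins when the query contains exact signals, vectors win when only the intent survives, and blending covers both.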

Deployment Guide

1. Install the plugin inside Claude Code

bash
> /plugin marketplace add thedotmack/claude-mem
> /plugin install claude-mem

2. Restart Claude Code to activate hooks

bash
exit  # then reopen Claude Code

3. Review settings and tune context injection

bash
cat ~/.claude-mem/settings.json

4. Query your project history via the mem-search skill

Ask a natural-language question in the session, e.g., "What bugs did we fix last session?"

5. Open the local Viewer UI to inspect the memory stream (optional)

bash
open http://localhost:37777

Use Cases

  • Multi-day continuity (solo developers): capture tool traces and decisions, then inject the right summaries next session. Outcome: less rework and faster context recovery.
  • Team handoff & review (engineering leads): compress sessions into searchable memory and inspect its quality via the viewer. Outcome: cheaper handoffs and more repeatable regression review.
  • Agent workflow audit (platform engineers): persist tool calls and outcomes, and use hybrid search to find abnormal paths. Outcome: better observability, faster debugging, fewer mistakes.

Limitations & Gotchas

  • This system is storage- and index-centric, so history will grow over time; plan retention and cleanup to avoid disk pressure from long sessions.
  • Semantic retrieval depends on embedding sync quality; enabling vectors can introduce write amplification, index bloat, and latency if not monitored.
  • It is primarily a Claude Code plugin, and compatibility with other clients depends on how much adapter work you are willing to maintain.
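A retention policy like the first bullet suggests can start as simply as pruning records past a cutoff age. This is a generic sketch of the idea, not a claude-mem feature:

```typescript
// Generic retention sketch: keep only observations newer than a cutoff.
interface Observation { sessionId: string; createdAt: number; }  // epoch ms

function pruneOlderThan(
  obs: Observation[],
  maxAgeDays: number,
  now: number = Date.now()
): Observation[] {
  const cutoff = now - maxAgeDays * 24 * 60 * 60 * 1000;
  return obs.filter((o) => o.createdAt >= cutoff);
}
```

In a database-backed setup the same policy would be a periodic `DELETE` plus an index rebuild; the point is to schedule it before disk pressure forces it.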

Frequently Asked Questions

What is the core difference between claude-mem and Claude Code’s MEMORY.md approach?
claude-mem externalizes memory as “queryable storage + layered retrieval”: it auto-captures tool events, compresses with AI, indexes, and injects context on demand. In contrast, MEMORY.md is a file-based convention where a fixed slice is loaded at session start and topic files are curated manually. The tradeoff is automation and searchability: claude-mem uses SQLite/FTS plus optional vectors and progressive disclosure to keep token costs controlled while staying highly retrievable, whereas MEMORY.md optimizes for human-maintained readability. If you care about replayable incident reproduction across many sessions, database-backed memory behaves more like infrastructure.
Why SQLite + FTS instead of appending summaries into one big document?
The issue with a big document is not storage, but retrieval, layering, and observability. SQLite lets you model sessions, observations, and summaries as indexable records, and FTS turns keyword recall into fast queries instead of scrolling and guesswork. It also creates hard boundaries for progressive disclosure: you can filter by time, session, or tool type before deciding what to inject. That keeps memory usable as volume grows, instead of becoming heavier and less reliable over time.
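The retrieval argument can be made concrete with a toy inverted index, which is roughly what FTS gives you: each term maps to the record ids containing it, so keyword recall is an index lookup rather than a document scan. This is a simulation for illustration, not SQLite's FTS implementation:

```typescript
// Toy inverted index: why indexed records beat one big append-only document.
class MiniFts {
  private index = new Map<string, Set<number>>();  // term -> record ids
  private docs = new Map<number, string>();        // record id -> text

  add(id: number, text: string): void {
    this.docs.set(id, text);
    for (const term of text.toLowerCase().split(/\W+/).filter(Boolean)) {
      if (!this.index.has(term)) this.index.set(term, new Set());
      this.index.get(term)!.add(id);
    }
  }

  search(query: string): number[] {
    // AND semantics: intersect the posting lists of every query term.
    const sets = query
      .toLowerCase()
      .split(/\W+/)
      .filter(Boolean)
      .map((t) => this.index.get(t) ?? new Set<number>());
    if (sets.length === 0) return [];
    return Array.from(sets[0]).filter((id) => sets.every((s) => s.has(id)));
  }
}
```

Because every hit is a discrete record, the same structure also supports the hard boundaries the answer describes: filtering by session, time, or tool type happens before anything is injected.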
Do I have to enable Chroma vectors? When does it help?
No. Chroma is valuable for semantic recall when you remember intent but not exact keywords. Enable it if you frequently search across paraphrases or want to build reusable memory snippets for recurring problem patterns. If your dominant recall path is explicit signals like error codes, filenames, or commands, FTS is often sufficient and more predictable to operate.

Project Metrics

Stars: 29.7k
Language: TypeScript
License: GNU Affero General Public License v3.0
Deploy Difficulty: Easy

Table of Contents

  1. What is it?
  2. Pain Points vs Innovation
  3. Architecture Deep Dive
  4. Deployment Guide
  5. Use Cases
  6. Limitations & Gotchas
  7. Frequently Asked Questions

Related Projects

  • nanobot (22.5k · Python)
  • Trellis (2.9k · TypeScript)
  • Awesome LLM Apps (96.4k · Python)
  • DeerFlow — ByteDance Open-Source SuperAgent Harness (26.1k · Python)