Mem0 vs Zep vs LangMem: Which Memory Tool Wins?
Mem0 locks graph queries behind $249/mo. Zep killed Community Edition. LangMem is free but LangGraph-only. Which one actually belongs in your stack?
Your agent just forgot the user’s name. Again. Not because your prompt is wrong — because you shipped without a memory layer, and now every session starts from zero.
But choosing a memory tool in 2026 is harder than it should be. Mem0 has 55K stars and a graph feature locked behind a $249/month paywall. Zep claims 200ms retrieval and best-in-class temporal reasoning but deprecated its Community Edition. LangMem is free and LangGraph-native — but only if you’re already bought into that ecosystem.
We tested the three against real production criteria: pricing architecture, retrieval quality, deployment flexibility, and framework lock-in. Here’s where each one wins — and where they fall short.
TL;DR: The 15-Second Verdict
| Criterion | Mem0 | Zep | LangMem |
|---|---|---|---|
| Best for | Teams wanting turnkey managed memory | Time-sensitive domains, compliance | LangGraph-native agents |
| Self-hostable | Yes (OSS) + managed | Graphiti OSS; full platform cloud-only | Yes (it’s a library) |
| Graph retrieval | Pro tier only ($249/mo) | Core feature, all tiers | None |
| Temporal reasoning | No | Yes (best-in-class) | No |
| SDKs | Python, JavaScript | Python, TypeScript, Go | Python only |
| Pricing floor | Free (10K memories) | Free (1K credits) | Free (OSS) |
| GitHub stars | ~55K | ~24K (Graphiti) | N/A (part of LangChain) |
| LongMemEval | 49.0% | N/A (Graphiti scores 71.2% on related benchmarks) | N/A |
If you’re prototyping and already on LangGraph: LangMem. If time matters and you need a managed platform: Zep. If you want the largest ecosystem and self-hosting: Mem0.
Why Agent Memory Is a Stack Decision, Not a Feature
Most teams treat memory as something they’ll “add later.” Then later arrives, and they discover memory isn’t a toggle — it’s infrastructure. You’re deciding:
- Where facts live. Vector DB? Knowledge graph? Both? What happens when facts contradict?
- How retrieval works. Semantic similarity alone misses structured relationships. Graph traversal without embeddings misses fuzzy matches.
- Who owns the pipeline. Do you maintain extraction, deduplication, and invalidation yourself, or does the platform handle it?
- How much it costs. Memory isn’t free. Every stored fact, every retrieval call, every embedding generation — it adds up at scale.
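To make the cost question concrete, here is a minimal back-of-envelope sketch that sums the three cost drivers named above. The `Rates` fields and every number in the usage example are hypothetical placeholders, not vendor prices; substitute figures from whichever pricing page you are evaluating.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Per-unit prices in dollars. All values are caller-supplied placeholders."""
    per_stored_fact_month: float
    per_retrieval: float
    per_embedding: float


def monthly_cost(facts: int, retrievals: int, embeddings: int, rates: Rates) -> float:
    """Sum storage, retrieval, and embedding costs for one month."""
    return (
        facts * rates.per_stored_fact_month
        + retrievals * rates.per_retrieval
        + embeddings * rates.per_embedding
    )


# Hypothetical rates, chosen only to make the arithmetic easy to follow.
example_rates = Rates(per_stored_fact_month=0.001, per_retrieval=0.0005, per_embedding=0.0001)
estimate = monthly_cost(facts=50_000, retrievals=200_000, embeddings=60_000, rates=example_rates)
print(f"${estimate:.2f}")
```

Even with made-up rates, a model like this shows which driver dominates as volume grows, which is usually the more useful thing to know before comparing tiers.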
Our deep dive on agent memory systems covers the conceptual architecture. For the full stack picture, see our context engineering guide. Here, we’re comparing the tools.
Mem0: The Ecosystem Play
Mem0 is the most popular memory layer for AI agents — YC-backed (S24), ~55K GitHub stars, and integrations with OpenAI, LangGraph, CrewAI, and the Vercel AI SDK. If you’re using a mainstream framework, Mem0 probably has a drop-in integration.
What works
Memory compression that actually saves tokens. Mem0’s compression engine condenses chat history into optimized memory representations, claiming up to 80% prompt token reduction. In production, this matters: every token you don’t send is money saved.
Mem0g graph layer (2026). This year’s big addition: a knowledge graph that links entities and relationships across conversations. Entity resolution maps “Alice,” “alice@company.com,” and “the account owner” to the same node. Multi-hop queries traverse relationships. This is powerful for customer support, healthcare, or any domain with complex entity webs.
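To illustrate the two ideas in that paragraph, here is a toy sketch of entity resolution followed by a multi-hop traversal. It is not Mem0's implementation: the alias table, the edge list, and the node ids are all hypothetical, and real entity resolution relies on fuzzy matching rather than an exact lookup.

```python
from collections import deque

# Hypothetical alias table: every surface form maps to one canonical node id.
ALIASES = {
    "alice": "person:alice",
    "alice@company.com": "person:alice",
    "the account owner": "person:alice",
}

# Hypothetical edges: node -> list of (relation, neighbor).
EDGES = {
    "person:alice": [("owns", "account:acme")],
    "account:acme": [("has_ticket", "ticket:42")],
    "ticket:42": [],
}


def resolve(mention: str) -> str:
    """Map a surface mention to its canonical node (exact match after normalizing)."""
    return ALIASES[mention.strip().lower()]


def multi_hop(start: str, max_hops: int) -> dict[str, int]:
    """Breadth-first traversal; returns each reachable node with its hop distance."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand past the hop limit
        for _relation, neighbor in EDGES.get(node, []):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


start = resolve("The account owner")
reachable = multi_hop(start, max_hops=2)
print(reachable)
```

A pure vector search would have to find the ticket by text similarity alone; the traversal reaches it through the ownership edge, regardless of how the ticket is worded.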
The self-hosting safety net. The OSS version (pip install mem0ai) gives you vector memory without touching the cloud. If your compliance requirements demand air-gapped deployment, Mem0 supports it.
from mem0 import Memory

m = Memory()

# Add a fact; Mem0 handles embedding, storage, and deduplication
m.add(
    "User prefers dark mode and compact layout",
    user_id="alice",
    metadata={"source": "settings_change"},
)

# Retrieve relevant memories
results = m.search("What UI preferences does this user have?", user_id="alice")
What doesn’t
The $249 graph paywall. The knowledge graph — arguably Mem0’s most important production feature — requires the Pro tier at $249/month. The Standard tier ($19/month) gives you vector search only. For teams evaluating whether graph retrieval will meaningfully improve their agent, that’s a steep discovery cost.
Benchmark performance. On LongMemEval, a benchmark evaluating persistent memory across extended interactions, Mem0 scored 49.0% — below Hindsight (91.4%) and trailing Zep’s Graphiti on temporal reasoning tasks (source). For simple personalization (preferences, past topics), this is fine. For institutional knowledge, it may not be enough.
Pricing at scale. The free tier caps at 10K memories. The jump from $19 to $249/month has no intermediate step. Medium-scale production teams get squeezed.
Zep / Graphiti: When “When” Matters
Zep positions itself as a “context engineering platform,” and the distinction matters. Where Mem0 stores facts, Zep stores facts in time — tracking not just what’s true, but when it became true and when it stopped being true.
What works
Temporal knowledge graph. Graphiti, Zep’s open-source graph engine (~24K GitHub stars), models facts as temporal entities. You can query “Who owned the Q3 budget before the reorg?” or “What changed in the deployment pipeline after the March incident?” This isn’t a feature bolted on — it’s the core data model.
Bitemporal awareness. Zep tracks both valid time (when a fact was true in the real world) and transaction time (when the system learned it). This matters for compliance and audit trails — you can reconstruct what your agent knew at any point in time.
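The difference between the two time axes is easier to see in code. The sketch below is a simplified, append-only model, not Graphiti's actual schema: the `Fact` fields, the `as_of` helper, and the sample data are assumptions made for illustration. When several records match, it simply lets the most recently recorded one win.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    value: str
    valid_from: date           # valid time: when it became true in the world
    valid_to: Optional[date]   # valid time: when it stopped being true (None = still true)
    recorded_at: date          # transaction time: when the system learned it


def as_of(facts, subject, predicate, valid_on, known_by):
    """Return the value true on `valid_on`, using only facts recorded by `known_by`."""
    candidates = [
        f for f in facts
        if f.subject == subject
        and f.predicate == predicate
        and f.recorded_at <= known_by
        and f.valid_from <= valid_on
        and (f.valid_to is None or valid_on < f.valid_to)
    ]
    if not candidates:
        return None
    # Later knowledge overrides earlier knowledge about the same period.
    return max(candidates, key=lambda f: f.recorded_at).value


# Hypothetical log: the reorg took effect on April 1 but was only recorded on April 10.
FACTS = [
    Fact("q3_budget", "owner", "dana", date(2026, 1, 1), None, date(2026, 1, 5)),
    Fact("q3_budget", "owner", "eli", date(2026, 4, 1), None, date(2026, 4, 10)),
]

# What was true on April 5, given everything known today?
print(as_of(FACTS, "q3_budget", "owner", date(2026, 4, 5), known_by=date(2026, 5, 1)))
# What did the agent believe on April 5, at the time?
print(as_of(FACTS, "q3_budget", "owner", date(2026, 4, 5), known_by=date(2026, 4, 5)))
```

The two queries ask about the same day but return different owners, which is exactly the audit-trail property: you can separate what was true from what the agent knew.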
200ms retrieval at P95. For voice agents, video agents, or live support, latency kills the experience. Zep claims its retrieval lands under 200ms at P95.
Three lines of integration.
# Assumes the zep-cloud Python SDK and its thread API; verify names against your installed version
from zep_cloud.client import Zep
from zep_cloud.types import Message

client = Zep(api_key="z_...")

# Add messages; Zep extracts entities, facts, and relationships automatically
response = client.thread.add_messages(
    thread_id="thread_123",
    messages=[Message(role="user", content="I need to update the Q3 roadmap")],
    return_context=True,
)

# Context arrives pre-assembled, token-optimized
print(response.context)
# Returns: user traits, relevant business data, recent interaction summary
What doesn’t
Community Edition is dead. Zep deprecated its CE, pushing users toward the managed cloud or self-hosting Graphiti directly. Graphiti is open source, but the full platform (with context assembly, entity resolution, and managed infrastructure) is cloud-only. If you need self-hosted everything, this is a friction point.
Graph-first trade-off. If your queries are primarily semantic (“What does this user like?”), a temporal knowledge graph adds complexity without proportional benefit. Zep shines when “when” is as important as “what.” For everyone else, it’s over-engineered.
Pricing opacity. Zep offers a free tier (1K credits) and paid plans starting at $25/month, but enterprise pricing requires contacting sales. For budget forecasting, that’s annoying.
LangMem: Free, Native, and Nowhere Else
LangMem is LangChain’s answer to agent memory — a Python library that adds persistent memory to LangGraph agents. It’s not a platform; it’s code you import.
What works
Zero cost, zero lock-in beyond LangGraph. If you’re already building with LangGraph (and our tutorial shows why you might), LangMem is the path of least resistance. It’s free. It’s open source. It runs in your infrastructure.
Tight LangGraph integration. LangMem hooks directly into LangGraph’s state graph and checkpointing system. Memory operations become nodes in your graph — you can add memory extraction after every tool call, or consolidate memories at session boundaries.
from langmem import create_memory_store_manager
from langgraph.store.memory import InMemoryStore

store = InMemoryStore()

# Create a memory manager that extracts and stores facts.
# The first argument is the extraction model; this model string is a placeholder.
manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    namespace=("memories", "{user_id}"),
    store=store,
)

# After each interaction, extract memories (run this inside an async function).
# The {user_id} slot in the namespace is filled from the run config.
memories = await manager.ainvoke(
    {"messages": [{"role": "user", "content": "I'm a backend engineer working in Go"}]},
    config={"configurable": {"user_id": "bob"}},
)
No separate infrastructure. LangMem uses your existing LangGraph store — Postgres, SQLite, or in-memory. No new API keys, no new services, no new bills.
What doesn’t
No knowledge graph. LangMem stores facts as structured JSON in a key-value-like store. Entity relationships? Multi-hop queries? You build those yourself. For complex domains, you’ll outgrow this quickly.
No temporal reasoning. Facts are stored and retrieved by semantic similarity. There’s no concept of “this was true until March.” If your facts change over time, you’re managing versioning manually.
Python only. LangGraph only. If you’re building with the OpenAI Agents SDK, Claude SDK, or CrewAI, LangMem isn’t an option. It’s a LangGraph feature, not a standalone memory layer.
Retrieval quality depends on you. LangMem handles extraction and storage. Retrieval — how you query memories and assemble context — is your code. For teams that want a fully managed pipeline, this is more work, not less.
Other Contenders Worth Mentioning
Hindsight (Vectorize.io). MIT-licensed, MCP-first, and scoring 91.4% on LongMemEval — nearly double Mem0’s score. Runs four parallel retrieval strategies (semantic, BM25, graph, temporal) on every query with a cross-encoder reranker. All features at every tier, including self-hosted. The catch: ~4K GitHub stars and a smaller community than Mem0 or Zep. If you’re evaluating fresh, don’t overlook it. (Source)
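For a sense of how results from several retrieval strategies can be merged into one list, here is a minimal sketch using reciprocal rank fusion (RRF). This is a stand-in technique chosen for illustration: we have not confirmed how Hindsight fuses its results, and the cross-encoder reranking step is left out. The strategy names mirror the paragraph above; the memory ids are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: dict[str, list[str]], k: int = 60) -> list[str]:
    """Fuse ranked id lists: each list contributes 1 / (k + rank) per item."""
    scores: dict[str, float] = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending, then by id so ties are deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical per-strategy results, best match first.
rankings = {
    "semantic": ["m1", "m2", "m3"],
    "bm25": ["m2", "m1", "m4"],
    "graph": ["m2", "m5"],
    "temporal": ["m3", "m2"],
}
fused = reciprocal_rank_fusion(rankings)
print(fused)
```

RRF only needs ranks, not comparable scores, which is why it is a common default when the underlying strategies score on different scales.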
MemoClaw. Memory-as-a-Service with a wallet-based auth model (no API keys) and a free tier of 1K calls. Clean TypeScript and Python SDKs. Best for simple store/recall with zero setup — but no graph, no temporal reasoning, and cloud-only. Good for prototypes; less so for production complexity.
Letta (formerly MemGPT). OS-inspired tiered memory where agents manage their own memory — deciding what to remember, forget, and how to organize knowledge. ~21K GitHub stars, Apache 2.0. Powerful but it’s a full agent runtime, not a drop-in memory layer. Adopting Letta means adopting Letta’s architecture.
Decision Matrix: Which One Fits Your Stack?
| Your situation | Pick | Why |
|---|---|---|
| Building on LangGraph, want zero new infra | LangMem | Free, native, uses your existing store |
| Need temporal reasoning (compliance, audits) | Zep | Best-in-class bitemporal graph |
| Want largest ecosystem + self-hosting fallback | Mem0 | 55K stars, OSS option, broad integrations |
| Benchmark-obsessed, want best retrieval quality | Hindsight | 91.4% LongMemEval, multi-strategy retrieval |
| Quick prototype, don’t overthink it | LangMem or MemoClaw | Fastest path to “it remembers” |
| Multi-agent systems with shared memory | Mem0 (Pro) or Zep | Graph relationships across agent scopes |
Our Take
Here’s the uncomfortable truth: most agents don’t need a dedicated memory platform yet. If you’re in the first month of building, LangMem or a hand-rolled vector store will carry you further than you think. The pain points that justify Mem0 Pro or Zep — entity resolution at scale, temporal fact tracking, multi-hop graph queries — only appear once you have real users generating real interaction volume.
When those pain points do arrive, the choice crystallizes fast. Zep if time is a first-class dimension in your domain (compliance, project tracking, anything with audit trails). Mem0 if you need the broadest ecosystem and a self-hosting safety net. LangMem if you’re already deep in LangGraph and want memory that feels like part of the framework, not a separate service.
The mistake we see teams make is paying $249/month for Mem0 Pro before they’ve validated that graph retrieval actually improves their agent’s behavior. Start with the free tier. Run an A/B test: agent with graph memory vs. agent without. If the metrics move, upgrade. If they don’t, you just saved yourself a bill.
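One way to decide whether "the metrics move" is a two-proportion z-test on a binary outcome such as task success. The sketch below assumes that kind of metric and uses hypothetical counts; it is one reasonable test among several, and it assumes the pooled success rate is strictly between 0 and 1.

```python
import math


def two_proportion_z_test(success_a: int, n_a: int, success_b: int, n_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference between two success rates."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return z, p_value


# Hypothetical counts: arm A has no graph memory, arm B has graph memory.
z, p_value = two_proportion_z_test(success_a=410, n_a=500, success_b=440, n_b=500)
print(f"z={z:.2f}, p={p_value:.4f}")
```

Pick the success metric and the sample size before the test starts; deciding after you have seen the numbers makes almost any difference look meaningful.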
For more on how memory fits into the broader agent stack, read our multi-agent memory architecture patterns and the 2026 agent framework guide.
Last updated: May 21, 2026. Pricing and benchmarks sourced from vendor websites and independent comparisons as of this date. We’ll refresh when the landscape shifts.