Explainer · 7 min read

Cross-session agent memory: making your agent remember without exploding the context

Cross-session memory is the architectural middle ground between amnesiac agents and ones that drown in their own history. Three flavours of memory, four hard problems, build vs buy options.

An agent that forgets you between sessions feels like talking to an amnesiac. An agent that remembers everything verbatim runs out of context in three turns. Cross-session memory is the architectural middle ground — and 2026 is the year it stops being optional.

Why memory is suddenly the differentiator

Until recently, "memory" in LLM apps meant "load the last N messages". That works for a single session. It collapses the moment a user expects continuity across days. Three forces converged in 2026:

  • Long-running agents (CS bots, research assistants, SRE copilots) became real products.
  • Users discovered they do not want to re-explain themselves every session.
  • Memory MCP servers (Mem0, Zep, the official memory MCP) made cross-session memory pluggable.

Three flavours of memory you actually need

Working memory

The current conversation, which fits in the context window. It has no persistence and lives in the message array.

Episodic memory

"What we talked about last Tuesday." Stored as summarised events with timestamps. Retrieved by recency or topic match. Useful for "remind me what we decided about pricing".

Semantic memory

"What does the user prefer." Distilled facts: "user is a CTO at a B2B SaaS, prefers concise answers, hates emoji". Retrieved as a small bundle injected into every system prompt.

The shape of a working memory layer

A pragmatic stack looks like this:

System prompt
+ semantic facts (200 tokens)
-------------------------------
Recent messages (last 20 turns)
-------------------------------
Retrieved episodes (top 3 by relevance)

Total: ~3-5k tokens, regardless of how long the relationship has lasted.
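
The stack above can be assembled with a small function. What follows is a minimal sketch rather than a definitive implementation: `buildContext`, `estimateTokens`, the input shapes, and the four-characters-per-token heuristic are all assumptions made for illustration. A real system would count tokens with its model's own tokenizer.

```javascript
// Rough token estimate: ~4 characters per token (a heuristic, not a real tokenizer).
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Assemble the three layers: system prompt + semantic facts, recent turns, retrieved episodes.
// All names and default budgets are illustrative assumptions.
function buildContext({ systemPrompt, facts, messages, episodes }, opts = {}) {
  const { factBudget = 200, maxTurns = 20, maxEpisodes = 3 } = opts;

  // Keep adding facts until the estimated fact budget is used up.
  const keptFacts = [];
  let used = 0;
  for (const fact of facts) {
    const cost = estimateTokens(fact);
    if (used + cost > factBudget) break;
    keptFacts.push(fact);
    used += cost;
  }

  // Highest-scoring episodes first, capped at maxEpisodes.
  const topEpisodes = [...episodes]
    .sort((a, b) => b.score - a.score)
    .slice(0, maxEpisodes);

  return {
    system: [systemPrompt, ...keptFacts.map(f => `- ${f}`)].join('\n'),
    messages: messages.slice(-maxTurns), // treats each message as one turn
    episodes: topEpisodes.map(e => e.summary),
  };
}
```

Because every layer is capped independently, the prompt size stays bounded no matter how many memories accumulate for a user.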

The four hard problems

1. What to store

Storing every message bloats the index and dilutes retrieval. The fix: a memory writer agent that runs after each session and extracts only durable facts (preferences, decisions, constraints) and significant episodes (notable conversations).
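
One way to keep the writer honest is to filter its output before anything reaches the index. The sketch below assumes the extraction prompt returns a JSON array of `{ kind, text }` items. That shape, the `kind` labels, and the `selectMemories` helper are hypothetical, chosen to mirror the categories listed above.

```javascript
// Kinds worth persisting, taken from the categories named in the text.
const DURABLE_KINDS = new Set(['preference', 'decision', 'constraint', 'episode']);

// Parse the extractor's raw JSON output and keep only well-formed, durable items.
// The item shape { kind, text } is an assumption for this sketch.
function selectMemories(rawJson) {
  let items;
  try {
    items = JSON.parse(rawJson);
  } catch {
    return []; // malformed model output: store nothing rather than garbage
  }
  if (!Array.isArray(items)) return [];

  const seen = new Set();
  return items.filter(item => {
    if (!item || typeof item.text !== 'string' || !DURABLE_KINDS.has(item.kind)) {
      return false;
    }
    const key = item.text.trim().toLowerCase();
    if (key === '' || seen.has(key)) return false; // drop empties and duplicates
    seen.add(key);
    return true;
  });
}
```

Rejecting unknown kinds by default means small talk and one-off questions never enter the store, even when the extraction model is overly generous.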

2. When to forget

Memories should decay. A user's address from 2024 may no longer be true. Patterns that work: timestamped facts with confidence scores, contradiction detection on write, and periodic re-validation prompts.
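
These patterns can be combined in a few small functions. The sketch below is one possible reading, not a standard algorithm: the exponential decay curve, the 180-day half-life, the 0.5 threshold, and the `{ key, value, confidence, recordedAt }` fact shape are all assumptions, and "contradiction" is reduced to a changed value for the same key.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Exponential decay: confidence halves every `halfLifeDays`.
// The 180-day half-life is an arbitrary illustrative default.
function currentConfidence(fact, now, halfLifeDays = 180) {
  const ageDays = (now - fact.recordedAt) / DAY_MS;
  return fact.confidence * Math.pow(0.5, ageDays / halfLifeDays);
}

// On write, a new fact replaces the old one for the same key.
// A differing value is flagged as a contradiction (a simple stand-in
// for real contradiction detection, which would compare meaning).
function writeFact(store, fact) {
  const existing = store.get(fact.key);
  const contradicted = existing !== undefined && existing.value !== fact.value;
  store.set(fact.key, fact);
  return { contradicted, previous: contradicted ? existing : null };
}

// Facts whose decayed confidence is below the threshold are due for re-validation.
function needsRevalidation(store, now, threshold = 0.5) {
  return [...store.values()].filter(f => currentConfidence(f, now) < threshold);
}
```

Here `store` is a plain `Map` and timestamps are milliseconds since the epoch. Anything returned by `needsRevalidation` is a candidate for a "is this still true?" prompt rather than silent deletion.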

3. How to retrieve without hallucination

Vector similarity gives plausibly-relevant results, not actually-relevant ones. Combine vector + keyword + metadata filters. Always show the model the source memory, not a paraphrase, so it can self-correct.
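
A hybrid ranker along these lines fits in a few dozen lines. This is a sketch under stated assumptions: memories are held in an array with precomputed `embedding` vectors, the keyword score is a simple word-overlap ratio rather than BM25, and the `alpha` weight of 0.7 is an untuned placeholder. The function names are hypothetical.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Fraction of query words that appear in the memory text.
function keywordOverlap(query, text) {
  const words = query.toLowerCase().match(/\w+/g) || [];
  if (words.length === 0) return 0;
  const haystack = new Set(text.toLowerCase().match(/\w+/g) || []);
  return words.filter(w => haystack.has(w)).length / words.length;
}

// Metadata filter first, then a weighted blend of vector and keyword scores.
// Returns the stored text verbatim with its id, so the model sees the source memory.
function hybridSearch(memories, { userId, queryText, queryEmbedding, top = 3, alpha = 0.7 }) {
  return memories
    .filter(m => m.userId === userId)
    .map(m => ({
      id: m.id,
      text: m.text,
      score: alpha * cosine(queryEmbedding, m.embedding) +
             (1 - alpha) * keywordOverlap(queryText, m.text),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, top);
}
```

Filtering on metadata before scoring guarantees that a high similarity score can never pull in another user's memory.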

4. How to keep memory private

Cross-session memory IS persistent PII. GDPR/CCPA right-to-erasure means you need delete-by-user-id from day one. Encrypt at rest. Do not ship memories across tenant boundaries.
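
The tenant-isolation and erasure requirements are easiest to enforce at the storage interface. The class below is an in-memory stand-in, assuming a hypothetical `MemoryStore` API. It does not cover encryption at rest, and a real erasure path also has to reach backups, embeddings, and logs.

```javascript
// In-memory stand-in for a real database. Names and shapes are illustrative.
class MemoryStore {
  constructor() {
    this.byUser = new Map(); // composite key -> array of memories
  }

  // Every operation requires both ids, so no call can cross
  // a tenant boundary by leaving one out.
  static key(tenantId, userId) {
    if (!tenantId || !userId) throw new Error('tenantId and userId are required');
    return JSON.stringify([tenantId, userId]); // unambiguous composite key
  }

  add(tenantId, userId, memory) {
    const k = MemoryStore.key(tenantId, userId);
    if (!this.byUser.has(k)) this.byUser.set(k, []);
    this.byUser.get(k).push(memory);
  }

  list(tenantId, userId) {
    return [...(this.byUser.get(MemoryStore.key(tenantId, userId)) || [])];
  }

  // Right to erasure: remove everything held for one user.
  // Returns the number of records deleted.
  eraseUser(tenantId, userId) {
    const k = MemoryStore.key(tenantId, userId);
    const count = (this.byUser.get(k) || []).length;
    this.byUser.delete(k);
    return count;
  }
}
```

Returning the deleted count gives the caller something concrete to record in an audit trail for the erasure request.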

Build vs buy in 2026

| Option | Best for | Trade-off |
|---|---|---|
| Memory MCP (official) | Single-user dev tools | Knowledge-graph model, manual schema |
| Mem0 | Multi-user products | Hosted SaaS, vendor lock-in |
| Zep | Enterprise, self-host | More infra to run |
| Roll your own (Postgres + pgvector) | Full control | Six months of edge-case work |

A minimum viable memory in 50 lines

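// llm, embed, and db are placeholders for your own LLM client,
// embedding call, and data layer; their method names are illustrative.

// after each session: distil the conversation into one stored memory
const summary = await llm.complete({
  prompt: 'Extract durable facts and notable events from this conversation as JSON.',
  messages: session.messages,
});
await db.insert('memories', {
  user_id, summary,
  embedding: await embed(summary),
  created_at: new Date(),
});

// before each session: the five most recent memories for this user
const recent = await db.query(
  'SELECT summary FROM memories WHERE user_id=$1 ORDER BY created_at DESC LIMIT 5',
  [user_id],
);
// plus the three most relevant, scoped to the same user so results never cross users
// (the filter option is assumed; use whatever scoping your vector store provides)
const relevant = await db.vectorSearch('memories', userQuery, { top: 3, filter: { user_id } });
// dedupe, since a memory can be both recent and relevant
const memoryBundle = [...new Set([...recent, ...relevant].map(m => m.summary))]
  .join('\n');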

Where this is heading

Expect three shifts over the next year: standardised memory schemas in MCP, native memory primitives in the Claude Agent SDK, and per-user memory dashboards letting users see and edit what the agent "knows" about them. The last one is non-negotiable for consumer products.

© 2026 Loadout. Built on Angular 21 SSR.