AI Agent Memory: Complete Guide to Persistence Solutions
AI agents are stateless by default: each request starts fresh, with no memory of previous interactions. This is a fundamental limitation, but it is solvable.
This guide covers five approaches to giving agents memory, from simple conversation buffers to enterprise-grade control planes.
The Memory Hierarchy
| Memory Type | Persistence | Best For | Example |
|---|---|---|---|
| Conversation Buffer | Session only | Simple chatbots | LangChain ConversationBufferMemory |
| Summary Memory | Session only | Long conversations | LangChain ConversationSummaryMemory |
| Vector Store | Persistent | Semantic retrieval | Pinecone, Weaviate |
| Database Memory | Persistent | Structured data | Redis, PostgreSQL |
| Control Plane | Persistent | Multi-agent systems | AgentMemo |
Memory Types Explained
1. Conversation Buffer Memory
The simplest approach: store all messages and pass them as context.
# LangChain example
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory()
memory.save_context({"input": "Hi"}, {"output": "Hello!"})
memory.save_context({"input": "What's my name?"}, {"output": "I don't know yet."})
# Retrieves all messages as context
history = memory.load_memory_variables({})
# {'history': 'Human: Hi\nAI: Hello!\nHuman: What\'s my name?\nAI: I don\'t know yet.'}
✓ Pros
- Simple to implement
- Perfect recall
- No data loss
✗ Cons
- Context window limit
- Expensive (more tokens)
- Session only
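One common way to soften the context-window limit is to keep only the most recent messages that fit a token budget. The sketch below is a minimal illustration and not part of LangChain: `trim_to_budget` and the whitespace-based `count_tokens` are hypothetical helpers, and a real system would count tokens with the model's own tokenizer.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, text)


def count_tokens(text: str) -> int:
    """Crude token estimate: whitespace-separated words (an assumption)."""
    return len(text.split())


def trim_to_budget(
    messages: List[Message],
    max_tokens: int,
    counter: Callable[[str], int] = count_tokens,
) -> List[Message]:
    """Keep the newest messages whose combined size fits max_tokens."""
    kept: List[Message] = []
    used = 0
    # Walk backwards so the most recent messages are considered first.
    for role, text in reversed(messages):
        cost = counter(text)
        if used + cost > max_tokens:
            break
        kept.append((role, text))
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    ("human", "Hi"),
    ("ai", "Hello!"),
    ("human", "What's my name?"),
    ("ai", "I don't know yet."),
]
recent = trim_to_budget(history, max_tokens=7)
```

With a budget of 7 "tokens", only the last two messages survive. The trade-off is that trimming gives up the buffer's perfect recall.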
2. Summary Memory
Summarize older messages to fit more context in the window.
from langchain.memory import ConversationSummaryMemory
# llm is assumed to be a LangChain LLM or chat model instance created elsewhere
memory = ConversationSummaryMemory(llm=llm)
# Each new exchange is folded into a running summary, for example:
# "The human introduced themselves as John and asked about..."
✓ Pros
- Handles long conversations
- Smaller context needed
✗ Cons
- Lossy compression
- Extra LLM calls
- Still session only
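To make the mechanism concrete, here is a framework-free sketch of a rolling summary that keeps the last few messages verbatim and folds older ones into a summary string. `RollingSummary` and `naive_summarize` are hypothetical names. The summarizer only concatenates and truncates text, so in practice it would be replaced by an LLM call.

```python
from typing import Callable, List


def naive_summarize(previous_summary: str, dropped: List[str]) -> str:
    """Stand-in for an LLM call: joins the text and truncates it (assumption)."""
    combined = " ".join([previous_summary, *dropped]).strip()
    return combined[:200]


class RollingSummary:
    """Keeps the last `keep_last` messages verbatim and summarizes the rest."""

    def __init__(
        self,
        keep_last: int,
        summarize: Callable[[str, List[str]], str] = naive_summarize,
    ) -> None:
        self.keep_last = keep_last
        self.summarize = summarize
        self.summary = ""
        self.recent: List[str] = []

    def add(self, message: str) -> None:
        self.recent.append(message)
        overflow = len(self.recent) - self.keep_last
        if overflow > 0:
            dropped = self.recent[:overflow]
            self.recent = self.recent[overflow:]
            # Lossy step: older messages survive only inside the summary.
            self.summary = self.summarize(self.summary, dropped)

    def context(self) -> str:
        """Build the text that would be passed to the model."""
        parts = []
        if self.summary:
            parts.append(f"Summary: {self.summary}")
        parts.extend(self.recent)
        return "\n".join(parts)


memory = RollingSummary(keep_last=2)
for line in [
    "Human: I'm John",
    "AI: Hi John",
    "Human: My invoice is wrong",
    "AI: Let me check",
]:
    memory.add(line)
```

Each overflow triggers one summarization call, which is where the extra LLM cost listed above comes from.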
3. Vector Store Memory
Store messages as embeddings, retrieve relevant ones via semantic search.
from langchain.memory import VectorStoreRetrieverMemory
from langchain.vectorstores import Pinecone
# embeddings is assumed to be an embedding model instance created elsewhere,
# and "conversations" an existing Pinecone index
# Store conversations as vectors
vectorstore = Pinecone.from_existing_index("conversations", embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
memory = VectorStoreRetrieverMemory(retriever=retriever)
# Retrieves semantically relevant messages
# Query: "What did we discuss about billing?"
# Returns: up to 5 stored messages most similar to the query
✓ Pros
- Persistent across sessions
- Scales to millions of messages
- Semantic retrieval
✗ Cons
- Can miss relevant context
- Embedding costs
- Setup complexity
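The retrieval step itself can be sketched without any external service: embed the query, score every stored message by cosine similarity, and return the top k. The `embed` function below is a toy word-count vector, not a real embedding model, so it matches on shared words rather than meaning. All names here are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, float]:
    """Toy embedding: lowercase word counts (an assumption, not a real model)."""
    words = [w.strip(".,?!").lower() for w in text.split()]
    return dict(Counter(w for w in words if w))


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, messages: List[str], k: int) -> List[Tuple[float, str]]:
    """Return the k messages most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(m)), m) for m in messages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


stored = [
    "My billing statement shows a duplicate charge",
    "I prefer email over phone calls",
    "Can you update my shipping address?",
]
results = top_k("What did we discuss about billing?", stored, k=1)
```

The toy embedding also illustrates the "can miss relevant context" risk: a stored message that says "invoice" instead of "billing" would score zero here. Real embedding models reduce that gap but do not eliminate it.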
4. Database Memory
Store structured data in a traditional database.
import json
import redis

r = redis.Redis()
STATE_TTL_SECONDS = 86400 * 30  # 30-day TTL

def save_state(user_id, state):
    # Write the value and its expiry in a single SET call
    r.set(f"agent_state:{user_id}", json.dumps(state), ex=STATE_TTL_SECONDS)

def get_state(user_id):
    data = r.get(f"agent_state:{user_id}")
    return json.loads(data) if data else {}

# Store structured state, not just messages
save_state("user_123", {
    "name": "John",
    "preferences": {"language": "en"},
    "open_tickets": ["ticket_456"],
    "conversation_summary": "Discussed billing issue..."
})
✓ Pros
- Fully persistent
- Fast reads/writes
- TTL support
- Structured data
✗ Cons
- Manual schema design
- No semantic search out of the box
- DIY everything
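For the relational option, the same save/load pattern might look like the sketch below. It uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL so that it runs without a server. The `agent_state` table and its columns are assumptions made for illustration.

```python
import json
import sqlite3
from typing import Any, Dict

conn = sqlite3.connect(":memory:")  # in-memory database for the sketch
conn.execute(
    "CREATE TABLE agent_state ("
    "user_id TEXT PRIMARY KEY, "
    "state TEXT NOT NULL, "
    "updated_at TEXT DEFAULT CURRENT_TIMESTAMP)"
)


def save_state(user_id: str, state: Dict[str, Any]) -> None:
    """Insert the JSON state for one user, replacing any existing row."""
    conn.execute(
        "INSERT OR REPLACE INTO agent_state (user_id, state) VALUES (?, ?)",
        (user_id, json.dumps(state)),
    )
    conn.commit()


def get_state(user_id: str) -> Dict[str, Any]:
    """Return the stored state, or an empty dict for unknown users."""
    row = conn.execute(
        "SELECT state FROM agent_state WHERE user_id = ?", (user_id,)
    ).fetchone()
    return json.loads(row[0]) if row else {}


save_state("user_123", {"name": "John", "open_tickets": ["ticket_456"]})
save_state("user_123", {"name": "John", "open_tickets": []})
```

Unlike the Redis version, nothing expires automatically here. Expiry would need an extra column and a cleanup job, which is part of the "DIY everything" cost.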
5. Control Plane Memory
Purpose-built infrastructure for agent state, workflows, and coordination.
import agentmemo

# Register agent
agent = agentmemo.agents.register({
    "name": "support-agent",
    "framework": "langchain"
})

# Store structured state (persists forever)
agentmemo.state.write({
    "agent_id": agent.id,
    "component": "customer_context",
    "key": "user_123",
    "value": {
        "name": "John",
        "history_summary": "Long-standing customer, previous billing issues resolved",
        "preferences": {"channel": "email", "language": "en"},
        "current_issue": {"ticket_id": "456", "status": "investigating"}
    }
})

# Store workflow (any agent can execute)
# workflow_markdown is the workflow definition text, defined elsewhere
agentmemo.workflows.create({
    "name": "billing-support",
    "definition": workflow_markdown,
    "designed_by": "opus"
})

# Handoff to another agent
# state is the context to pass along, defined elsewhere
agentmemo.handoffs.create({
    "to_agent": "specialist-agent",
    "context": state,
    "workflow": "billing-support"
})
✓ Pros
- Built for agents
- Workflows + state + handoffs
- Framework agnostic
- Audit trail
✗ Cons
- Additional service
- Learning curve
Choosing the Right Memory Type
| Use Case | Recommended Memory |
|---|---|
| Simple chatbot, single session | Conversation Buffer |
| Long conversations, single session | Summary Memory |
| Knowledge base Q&A | Vector Store |
| User preferences & state | Database (Redis/Postgres) |
| Multi-agent workflows | Control Plane |
| Cross-session continuity | Database or Control Plane |
| Agent handoffs | Control Plane |
| Switching to a smaller model (e.g., Opus → Haiku) | Control Plane |
Hybrid Approaches
In practice, most production agents use multiple memory types:
# Sketch only: llm is any async client exposing complete(), and the
# agentmemo read/write calls are simplified
class HybridMemoryAgent:
    def __init__(self, llm):
        self.llm = llm
        # Short-term: conversation buffer
        self.conversation = ConversationBufferMemory()
        # Medium-term: vector store for semantic retrieval
        self.knowledge = VectorStoreRetrieverMemory(...)
        # Long-term: control plane for state and workflows
        self.state = agentmemo.state
        self.workflows = agentmemo.workflows

    async def process(self, user_id, message):
        # Load persistent state
        user_state = await self.state.read(user_id)
        # Get relevant knowledge
        relevant = self.knowledge.load_memory_variables({"prompt": message})
        # Add to conversation
        self.conversation.chat_memory.add_user_message(message)
        # Process with full context
        response = await self.llm.complete(
            context={
                "user_state": user_state,
                "relevant_knowledge": relevant,
                "conversation": self.conversation.chat_memory.messages
            }
        )
        # Update persistent state
        updated_state = user_state  # apply changes derived from the response here
        await self.state.write(user_id, updated_state)
        return response
Memory Best Practices
1. Separate What the Agent Knows from What Was Said
Store structured state (what the agent knows) separately from conversation logs (what was said).
2. Use Appropriate TTLs
Not everything needs to live forever. Set expiration for temporary state.
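As a minimal illustration of per-key expiry, the sketch below implements a small in-memory store with optional TTLs. `TTLStore` is a hypothetical class. The clock is injectable so that expiry can be demonstrated without waiting.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLStore:
    """In-memory key-value store whose entries can expire after ttl seconds."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[Any, Optional[float]]] = {}

    def set(self, key: str, value: Any, ttl: Optional[float] = None) -> None:
        """Store a value; ttl=None means the entry never expires."""
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def get(self, key: str, default: Any = None) -> Any:
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazily evict expired entries on read
            return default
        return value


# Fake clock so the example is deterministic.
now = [0.0]
store = TTLStore(clock=lambda: now[0])
store.set("draft_reply", "Working on it", ttl=60)  # temporary state
store.set("user_name", "John")                      # long-lived state
now[0] = 61.0  # advance time past the 60-second TTL
```

After the clock advances, the draft reply is gone while the user's name remains. Temporary and durable state get different lifetimes.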
3. Summarize Before Storing
Store a summary such as "Customer John has billing concerns about invoice #456" rather than 50 raw messages.
4. Plan for Handoffs
Memory should be complete enough that a different agent can pick up the work.
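One way to check completeness before handing off is to give the handoff an explicit shape and validate it. The `HandoffPayload` dataclass below is a hypothetical structure, not the AgentMemo schema, and the choice of required fields is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class HandoffPayload:
    """Hypothetical handoff record; field names are illustrative."""

    from_agent: str
    to_agent: str
    goal: str
    state: Dict[str, Any] = field(default_factory=dict)
    completed_steps: List[str] = field(default_factory=list)
    next_steps: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Names of fields the receiving agent needs but that are empty."""
        required = ("goal", "state", "next_steps")
        return [name for name in required if not getattr(self, name)]


payload = HandoffPayload(
    from_agent="support-agent",
    to_agent="specialist-agent",
    goal="Resolve billing issue on ticket 456",
    state={"ticket_id": "456", "status": "investigating"},
    completed_steps=["Verified customer identity"],
)
gaps = payload.missing_fields()
```

Here `gaps` flags that no next steps were recorded, so the receiving agent would not know where to resume.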
5. Audit State Changes
Log when state is read/written for debugging and compliance.
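A thin wrapper around the state store is often enough to get a basic audit trail. The sketch below uses a hypothetical `AuditedState` class that records who read or wrote each key and when. A production system would write these records to durable storage instead of an in-memory list.

```python
import time
from typing import Any, Dict, List, Optional


class AuditedState:
    """Dict-backed state store that records every read and write."""

    def __init__(self) -> None:
        self._state: Dict[str, Any] = {}
        self.audit_log: List[Dict[str, Any]] = []

    def _record(self, action: str, key: str, actor: str) -> None:
        self.audit_log.append(
            {"ts": time.time(), "action": action, "key": key, "actor": actor}
        )

    def read(self, key: str, actor: str) -> Optional[Any]:
        self._record("read", key, actor)
        return self._state.get(key)

    def write(self, key: str, value: Any, actor: str) -> None:
        self._record("write", key, actor)
        self._state[key] = value


state = AuditedState()
state.write("user_123", {"name": "John"}, actor="support-agent")
state.read("user_123", actor="specialist-agent")
actions = [(entry["action"], entry["actor"]) for entry in state.audit_log]
```

The resulting log shows one write by `support-agent` followed by one read by `specialist-agent`, which is the kind of trace that helps when debugging a handoff.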
Ready for Production-Grade Agent Memory?
AgentMemo provides persistent state, workflows, and handoffs out of the box.