Memory & Context: Making Agents Remember
Add persistent memory to your AI agents — from Simple Memory for testing to PostgreSQL and Redis for production conversations.
🔄 In Lesson 4, your research agent could search the web, query Wikipedia, and synthesize findings. But ask it “what did I just ask you?” and it draws a blank. Every message starts from scratch. That’s the gap memory fills — and it’s the difference between a one-shot tool and a real assistant.
Why Memory Matters
Without memory, every workflow execution is isolated. The AI has no idea:
- What the user asked 30 seconds ago
- What it already answered
- What context the user provided earlier
For single-task workflows like the email classifier (Lesson 3), this is fine. But for anything conversational — chatbots, support agents, personal assistants — memory is essential. In practice, context retention is one of the most common sources of frustration for people building conversational AI workflows in n8n.
Memory Types in n8n
n8n offers four memory approaches, each with tradeoffs:
| Memory Type | Storage | Persistence | Queue Mode | Best For |
|---|---|---|---|---|
| Simple Memory | In-process RAM | ❌ Lost on execution end | ❌ Broken | Testing only |
| Window Buffer | In-process RAM | ❌ Lost on execution end | ❌ Broken | Testing with limited context |
| PostgreSQL | Database | ✅ Persistent | ✅ Works | Production chatbots |
| Redis | In-memory cache | ✅ Persistent (with config) | ✅ Works | High-throughput, real-time |
The progression: Start with Simple Memory for testing → switch to PostgreSQL for production → add Redis if you need sub-millisecond reads.
Let’s look at each one.
Simple Memory: Testing Only
Simple Memory stores the conversation history in the workflow’s runtime memory. It’s the fastest to set up — just attach it to your AI Agent and it works.
Why you should never use it in production:
- Conversation history disappears the moment the execution ends
- If n8n restarts, all memory is gone
- It does not work in queue mode — which is n8n’s recommended setup for handling concurrent users
Think of Simple Memory as a notepad that gets shredded after every conversation. Fine for testing your prompts. Useless for real users.
PostgreSQL Memory: The Production Default
PostgreSQL Memory stores conversations in a real database. It persists across restarts, supports concurrent access, and is queryable with SQL.
Setup:
- You need a PostgreSQL database (n8n Cloud includes one, or run your own)
- Add a Postgres Chat Memory sub-node to your AI Agent
- Configure the connection details (host, database, user, password)
- Set the Session ID — this is the key that groups messages into conversations
The session ID is critical. It tells n8n which conversation to load. A chat trigger typically provides this through {{ $json.sessionId }} or you can derive it from a user ID.
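To make "derive it from a user ID" concrete, here is a minimal sketch of what that derivation could look like. This is illustrative pseudologic in Python, not n8n code: the Chat Trigger supplies `sessionId` for you, and `session_id_for` is a hypothetical helper you would only need for custom entry points (e.g. a webhook that carries a user ID).

```python
import hashlib

def session_id_for(user_id: str, channel: str = "webchat") -> str:
    """Derive a stable, unique session ID from a user identifier.

    Hashing keeps raw user IDs out of the memory store while staying
    deterministic: the same user always maps back to the same session.
    """
    digest = hashlib.sha256(f"{channel}:{user_id}".encode()).hexdigest()
    return f"sess-{digest[:16]}"
```

The properties that matter: the ID is consistent (the same user gets the same session on every message) and unique (two users never collide), which is exactly what keeps conversations separate.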
✅ Quick Check: Two users chat with your bot simultaneously. Both have their conversations stored in PostgreSQL Memory. How does the agent keep their conversations separate? (Answer: The session ID. Each user gets a unique session ID. When User A sends a message, the agent loads only User A’s conversation history from PostgreSQL. User B’s history stays separate. Without distinct session IDs, both users would share the same conversation — which would be confusing and a privacy issue.)
Redis Memory: For Speed
Redis Memory stores conversations in Redis — an in-memory data store that’s incredibly fast (sub-millisecond reads). It’s ideal for:
- High-throughput bots handling hundreds of concurrent conversations
- Real-time applications where latency matters
- Conversations that should auto-expire (Redis supports TTL — time-to-live)
Setup: Similar to PostgreSQL — add a Redis Chat Memory sub-node, provide your Redis connection, set the session ID. The key difference: you can set a TTL so conversations automatically expire after a set period (useful for support chats that don’t need permanent history).
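To see what TTL expiry buys you, here is a tiny in-process sketch of the behavior. This is not Redis code — with the real Redis Chat Memory node you just set the TTL field and Redis expires the session key for you — it only simulates the semantics: each write refreshes the clock, and an idle session eventually vanishes.

```python
import time

class TTLMemory:
    """In-process sketch of Redis-style TTL expiry (illustration only)."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.store = {}  # session_id -> (expiry_timestamp, messages)

    def append(self, session_id: str, message: str) -> None:
        now = time.monotonic()
        expiry, messages = self.store.get(session_id, (0.0, []))
        if now >= expiry:
            messages = []  # previous conversation expired; start fresh
        # Each write refreshes the TTL, like re-running EXPIRE on the key.
        self.store[session_id] = (now + self.ttl, messages + [message])

    def load(self, session_id: str) -> list[str]:
        expiry, messages = self.store.get(session_id, (0.0, []))
        return messages if time.monotonic() < expiry else []
```

An active conversation keeps extending its own lifetime; once the user goes quiet for longer than the TTL, the history is simply gone — no cleanup job required.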
For most workflows, PostgreSQL is the right choice. Redis is for when you’re optimizing performance at scale.
Window Buffer: Limiting Context
Window Buffer isn’t a storage type — it’s a strategy. It keeps only the last N messages in the context window instead of the full conversation history.
Why limit messages? LLMs have token limits. A conversation with 200 messages might exceed the model’s context window, causing errors or silent truncation. Window Buffer keeps the most recent messages (say, the last 20) so the agent always has recent context without hitting token limits.
You combine Window Buffer with a storage backend:
- Window Buffer + PostgreSQL = stores everything, loads last N messages
- Window Buffer + Redis = same idea, faster reads
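The split between "store everything" and "load last N" can be sketched in a few lines. The dict below stands in for PostgreSQL or Redis, and `WindowedMemory` is a hypothetical class, not n8n's implementation — the point is only that the full history survives while the LLM context stays bounded.

```python
class WindowedMemory:
    """Sketch of a Window Buffer over a persistent backend (illustrative)."""

    def __init__(self, window: int = 20):
        self.window = window
        self.store: dict[str, list[str]] = {}  # stand-in for the database

    def append(self, session_id: str, message: str) -> None:
        # Persist every message: the full history stays queryable.
        self.store.setdefault(session_id, []).append(message)

    def context(self, session_id: str) -> list[str]:
        # But only the last N messages are sent to the LLM as context.
        return self.store.get(session_id, [])[-self.window:]
```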
Build: Chatbot with Persistent Memory
Let’s upgrade your research agent from Lesson 4 with PostgreSQL Memory.
Step 1: Start from Lesson 4’s Agent
Open the Multi-Tool Research Agent you built in Lesson 4 (Chat Trigger → AI Agent with tools).
Step 2: Add PostgreSQL Memory
- Click the AI Agent node
- Under Memory, add a Postgres Chat Memory sub-node
- Configure the connection to your PostgreSQL instance
- Set the Session ID Key:
{{ $json.sessionId }}
If you’re on n8n Cloud, you can use n8n’s built-in PostgreSQL. If self-hosted, connect to any PostgreSQL database.
Step 3: Update the System Prompt
Add memory-aware instructions to your system prompt:
You are a research assistant with conversation memory.
When a user asks a follow-up question:
- Reference information from earlier in the conversation
- Don't repeat information you've already provided
- If the user says "like I mentioned" or "as we discussed," check your memory
When greeting a returning user:
- Acknowledge the previous conversation if relevant
- Don't start from scratch each time
Step 4: Test Continuity
Click “Test workflow” and have a multi-turn conversation:
- “What’s the population of Japan?”
- “How does that compare to South Korea?”
- “Which one has a higher GDP per capita?”
Without memory, question 2 would fail — the agent wouldn’t know what “that” refers to. With PostgreSQL Memory, the agent loads the previous exchanges and understands the context.
Now close the chat and reopen it (using the same session ID). Ask: “What were we talking about?” The agent should recall your Japan/South Korea discussion — because the memory persists in the database.
✅ Quick Check: Your chatbot works in testing but forgets conversations in production. What’s the first thing to check? (Answer: The session ID. In testing, the Chat Trigger might generate a static session ID. In production, each user needs a consistent, unique session ID — typically derived from their user ID or auth token. If the session ID changes between messages or defaults to something random, the memory loads a fresh conversation every time.)
Memory and Token Costs
More memory means more tokens per request. Every message in the conversation history gets sent to the LLM as context. A 50-message conversation might add 5,000+ tokens to every subsequent request.
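You can sanity-check that "5,000+ tokens" figure with the common rough heuristic of about 4 characters per token. This is a ballpark estimate, not a real tokenizer — actual counts depend on the model's tokenizer and message formatting overhead.

```python
def estimate_context_tokens(messages: list[str]) -> int:
    """Rough token estimate: ~4 characters per token (heuristic only)."""
    return sum(len(m) for m in messages) // 4

# A 50-message history averaging ~400 characters per message
history = ["x" * 400] * 50
extra = estimate_context_tokens(history)  # roughly 5,000 tokens,
# resent to the LLM on every subsequent request in that conversation
```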
Cost management strategies:
- Window Buffer: Only load the last 10-20 messages instead of the full history
- Summary Memory: Periodically summarize old messages into a compact summary (requires a separate LLM call)
- Selective Memory: Only store messages that contain important context (user preferences, decisions) — skip pleasantries
For most use cases, a Window Buffer of 15-20 messages is the sweet spot between context awareness and cost.
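The Summary Memory strategy from the list above can be sketched as follows. `summarize` here is a placeholder standing in for the separate LLM call you would make in a real workflow; the shape that matters is "one compact summary line for everything old, verbatim text for everything recent".

```python
def summarize(messages: list[str]) -> str:
    # Placeholder: a real implementation would call an LLM here.
    return f"covered {len(messages)} earlier turns"

def compact_history(messages: list[str], keep_recent: int = 15) -> list[str]:
    """Sketch of Summary Memory: fold older messages into one summary,
    keep the most recent `keep_recent` messages verbatim."""
    if len(messages) <= keep_recent:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [f"[Summary of {len(old)} earlier messages] {summarize(old)}"] + recent
```

The tradeoff: you pay for one extra LLM call per compaction, but every later request carries one summary line instead of dozens of old messages.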
Key Takeaways
- Simple Memory is for testing only — data disappears after execution and breaks in queue mode
- PostgreSQL Memory is the production default — persistent, concurrent, queryable
- Redis Memory is for high-throughput scenarios where sub-millisecond reads matter
- Window Buffer limits how many messages the agent loads — essential for managing token costs
- The session ID determines which conversation the agent loads — get this wrong and every message starts fresh
- Always update your system prompt to be memory-aware — tell the agent how to use conversation history
Up Next
Your agent remembers conversations, but it only knows what users tell it. What if it could answer questions from your company documents, PDFs, or knowledge base? In Lesson 6, you’ll build a RAG pipeline — embedding your documents in a vector store so the agent can retrieve relevant information and give answers grounded in your actual data.