Memory & Context: Making Agents Remember
Add persistent memory to your AI agents — from Simple Memory for testing to PostgreSQL and Redis for production conversations.
🔄 In Lesson 4, your research agent could search the web, query Wikipedia, and synthesize findings. But ask it “what did I just ask you?” and it draws a blank. Every message starts from scratch. That’s the gap memory fills — and it’s the difference between a one-shot tool and a real assistant.
Why Memory Matters
Without memory, every workflow execution is isolated. The AI has no idea:
- What the user asked 30 seconds ago
- What it already answered
- What context the user provided earlier
For single-task workflows like the email classifier (Lesson 3), this is fine. But for anything conversational — chatbots, support agents, personal assistants — memory is essential. In practice, context retention is one of the most common sources of frustration for people building conversational AI workflows in n8n.
Memory Types in n8n
n8n offers four memory approaches, each with tradeoffs:
| Memory Type | Storage | Persistence | Queue Mode | Best For |
|---|---|---|---|---|
| Simple Memory | In-process RAM | ❌ Lost on execution end | ❌ Broken | Testing only |
| Window Buffer | In-process RAM | ❌ Lost on execution end | ❌ Broken | Testing with limited context |
| PostgreSQL | Database | ✅ Persistent | ✅ Works | Production chatbots |
| Redis | In-memory cache | ✅ Persistent (with config) | ✅ Works | High-throughput, real-time |
The progression: Start with Simple Memory for testing → switch to PostgreSQL for production → add Redis if you need sub-millisecond reads.
Let’s look at each one.
Simple Memory: Testing Only
Simple Memory stores the conversation history in the workflow’s runtime memory. It’s the fastest to set up — just attach it to your AI Agent and it works.
Why you should never use it in production:
- Conversation history disappears the moment the execution ends
- If n8n restarts, all memory is gone
- It does not work in queue mode — which is n8n’s recommended setup for handling concurrent users
Think of Simple Memory as a notepad that gets shredded after every conversation. Fine for testing your prompts. Useless for real users.
PostgreSQL Memory: The Production Default
PostgreSQL Memory stores conversations in a real database. It persists across restarts, supports concurrent access, and is queryable with SQL.
Setup:
- You need a PostgreSQL database (n8n Cloud includes one, or run your own)
- Add a Postgres Chat Memory sub-node to your AI Agent
- Configure the connection details (host, database, user, password)
- Set the Session ID — this is the key that groups messages into conversations
The session ID is critical. It tells n8n which conversation to load. A chat trigger typically provides this through {{ $json.sessionId }} or you can derive it from a user ID.
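To make "derive it from a user ID" concrete, here is a minimal sketch of what that derivation could look like. This is illustrative pseudologic in Python, not n8n code: the Chat Trigger supplies `sessionId` for you, and `session_id_for` is a hypothetical helper you would only need for custom entry points (e.g. a webhook that carries a user ID).

```python
import hashlib

def session_id_for(user_id: str, channel: str = "webchat") -> str:
    """Derive a stable, unique session ID from a user identifier.

    Hashing keeps raw user IDs out of the memory store while staying
    deterministic: the same user always maps back to the same session.
    """
    digest = hashlib.sha256(f"{channel}:{user_id}".encode()).hexdigest()
    return f"sess-{digest[:16]}"
```

The properties that matter: the ID is consistent (the same user gets the same session on every message) and unique (two users never collide), which is exactly what keeps conversations separate.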
✅ Quick Check: Two users chat with your bot simultaneously. Both have their conversations stored in PostgreSQL Memory. How does the agent keep their conversations separate? (Answer: The session ID. Each user gets a unique session ID. When User A sends a message, the agent loads only User A’s conversation history from PostgreSQL. User B’s history stays separate. Without distinct session IDs, both users would share the same conversation — which would be confusing and a privacy issue.)
Redis Memory: For Speed
Redis Memory stores conversations in Redis — an in-memory data store that’s incredibly fast (sub-millisecond reads). It’s ideal for:
- High-throughput bots handling hundreds of concurrent conversations
- Real-time applications where latency matters
- Conversations that should auto-expire (Redis supports TTL — time-to-live)
Setup: Similar to PostgreSQL — add a Redis Chat Memory sub-node, provide your Redis connection, set the session ID. The key difference: you can set a TTL so conversations automatically expire after a set period (useful for support chats that don’t need permanent history).
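To see what TTL expiry buys you, here is a tiny in-process sketch of the behavior. This is not Redis code — with the real Redis Chat Memory node you just set the TTL field and Redis expires the session key for you — it only simulates the semantics: each write refreshes the clock, and an idle session eventually vanishes.

```python
import time

class TTLMemory:
    """In-process sketch of Redis-style TTL expiry (illustration only)."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.store = {}  # session_id -> (expiry_timestamp, messages)

    def append(self, session_id: str, message: str) -> None:
        now = time.monotonic()
        expiry, messages = self.store.get(session_id, (0.0, []))
        if now >= expiry:
            messages = []  # previous conversation expired; start fresh
        # Each write refreshes the TTL, like re-running EXPIRE on the key.
        self.store[session_id] = (now + self.ttl, messages + [message])

    def load(self, session_id: str) -> list[str]:
        expiry, messages = self.store.get(session_id, (0.0, []))
        return messages if time.monotonic() < expiry else []
```

An active conversation keeps extending its own lifetime; once the user goes quiet for longer than the TTL, the history is simply gone — no cleanup job required.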
For most workflows, PostgreSQL is the right choice. Redis is for when you’re optimizing performance at scale.
Window Buffer: Limiting Context
Window Buffer isn’t a storage type — it’s a strategy. It keeps only the last N messages in the context window instead of the full conversation history.
Why limit messages? LLMs have token limits. A conversation with 200 messages might exceed the model’s context window, causing errors or silent truncation. Window Buffer keeps the most recent messages (say, the last 20) so the agent always has recent context without hitting token limits.
You combine Window Buffer with a storage backend:
- Window Buffer + PostgreSQL = stores everything, loads last N messages
- Window Buffer + Redis = same idea, faster reads
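The split between "store everything" and "load last N" can be sketched in a few lines. The dict below stands in for PostgreSQL or Redis, and `WindowedMemory` is a hypothetical class, not n8n's implementation — the point is only that the full history survives while the LLM context stays bounded.

```python
class WindowedMemory:
    """Sketch of a Window Buffer over a persistent backend (illustrative)."""

    def __init__(self, window: int = 20):
        self.window = window
        self.store: dict[str, list[str]] = {}  # stand-in for the database

    def append(self, session_id: str, message: str) -> None:
        # Persist every message: the full history stays queryable.
        self.store.setdefault(session_id, []).append(message)

    def context(self, session_id: str) -> list[str]:
        # But only the last N messages are sent to the LLM as context.
        return self.store.get(session_id, [])[-self.window:]
```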
Build: Chatbot with Persistent Memory
Let’s upgrade your research agent from Lesson 4 with PostgreSQL Memory.
Step 1: Start from Lesson 4’s Agent
Open the Multi-Tool Research Agent you built in Lesson 4 (Chat Trigger → AI Agent with tools).
Step 2: Add PostgreSQL Memory
- Click the AI Agent node
- Under Memory, add a Postgres Chat Memory sub-node
- Configure the connection to your PostgreSQL instance
- Set the Session ID Key:
{{ $json.sessionId }}
If you’re on n8n Cloud, you can use n8n’s built-in PostgreSQL. If self-hosted, connect to any PostgreSQL database.
Step 3: Update the System Prompt
Add memory-aware instructions to your system prompt:
You are a research assistant with conversation memory.
When a user asks a follow-up question:
- Reference information from earlier in the conversation
- Don't repeat information you've already provided
- If the user says "like I mentioned" or "as we discussed," check your memory
When greeting a returning user:
- Acknowledge the previous conversation if relevant
- Don't start from scratch each time
Step 4: Test Continuity
Click “Test workflow” and have a multi-turn conversation:
- “What’s the population of Japan?”
- “How does that compare to South Korea?”
- “Which one has a higher GDP per capita?”
Without memory, question 2 would fail — the agent wouldn’t know what “that” refers to. With PostgreSQL Memory, the agent loads the previous exchanges and understands the context.
Now close the chat and reopen it (using the same session ID). Ask: “What were we talking about?” The agent should recall your Japan/South Korea discussion — because the memory persists in the database.
✅ Quick Check: Your chatbot works in testing but forgets conversations in production. What’s the first thing to check? (Answer: The session ID. In testing, the Chat Trigger might generate a static session ID. In production, each user needs a consistent, unique session ID — typically derived from their user ID or auth token. If the session ID changes between messages or defaults to something random, the memory loads a fresh conversation every time.)
Memory and Token Costs
More memory means more tokens per request. Every message in the conversation history gets sent to the LLM as context. A 50-message conversation might add 5,000+ tokens to every subsequent request.
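You can sanity-check that "5,000+ tokens" figure with the common rough heuristic of about 4 characters per token. This is a ballpark estimate, not a real tokenizer — actual counts depend on the model's tokenizer and message formatting overhead.

```python
def estimate_context_tokens(messages: list[str]) -> int:
    """Rough token estimate: ~4 characters per token (heuristic only)."""
    return sum(len(m) for m in messages) // 4

# A 50-message history averaging ~400 characters per message
history = ["x" * 400] * 50
extra = estimate_context_tokens(history)  # roughly 5,000 tokens,
# resent to the LLM on every subsequent request in that conversation
```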
Cost management strategies:
- Window Buffer: Only load the last 10-20 messages instead of the full history
- Summary Memory: Periodically summarize old messages into a compact summary (requires a separate LLM call)
- Selective Memory: Only store messages that contain important context (user preferences, decisions) — skip pleasantries
For most use cases, a Window Buffer of 15-20 messages is the sweet spot between context awareness and cost.
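The Summary Memory strategy from the list above can be sketched as follows. `summarize` here is a placeholder standing in for the separate LLM call you would make in a real workflow; the shape that matters is "one compact summary line for everything old, verbatim text for everything recent".

```python
def summarize(messages: list[str]) -> str:
    # Placeholder: a real implementation would call an LLM here.
    return f"covered {len(messages)} earlier turns"

def compact_history(messages: list[str], keep_recent: int = 15) -> list[str]:
    """Sketch of Summary Memory: fold older messages into one summary,
    keep the most recent `keep_recent` messages verbatim."""
    if len(messages) <= keep_recent:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [f"[Summary of {len(old)} earlier messages] {summarize(old)}"] + recent
```

The tradeoff: you pay for one extra LLM call per compaction, but every later request carries one summary line instead of dozens of old messages.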
Key Takeaways
- Simple Memory is for testing only — data disappears after execution and breaks in queue mode
- PostgreSQL Memory is the production default — persistent, concurrent, queryable
- Redis Memory is for high-throughput scenarios where sub-millisecond reads matter
- Window Buffer limits how many messages the agent loads — essential for managing token costs
- The session ID determines which conversation the agent loads — get this wrong and every message starts fresh
- Always update your system prompt to be memory-aware — tell the agent how to use conversation history
Up Next
Your agent remembers conversations, but it only knows what users tell it. What if it could answer questions from your company documents, PDFs, or knowledge base? In Lesson 6, you’ll build a RAG pipeline — embedding your documents in a vector store so the agent can retrieve relevant information and give answers grounded in your actual data.