
Two Types of Memory

Agents need memory to stay consistent, both within a conversation and across them. There are two kinds:

1. Short-Term Memory

The current conversation. Messages are stored in order:

messages = [
    {"role": "user", "content": "What's my plan?"},
    {"role": "assistant", "content": "Your plan is Pro."},
    {"role": "user", "content": "When does it expire?"},
    # Agent remembers previous context
]

This is automatic - the message history is short-term memory.
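
Each turn extends the list: append the user message, call the model, append the reply. A minimal sketch, where call_llm is a hypothetical stand-in for whatever LLM client you use:

def chat_turn(messages: list, user_message: str) -> str:
    """Run one turn and keep the history growing."""
    messages.append({"role": "user", "content": user_message})
    reply = call_llm(messages)  # hypothetical stand-in for your LLM client
    messages.append({"role": "assistant", "content": reply})
    return reply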

2. Long-Term Memory

User preferences and facts that persist across conversations:

memory = {
    "user-123": {
        "preferred_language": "en",
        "last_known_plan": "Pro",
        "preferred_contact": "email",
        "notes": "Prefers detailed explanations"
    }
}

Unlike the message history, this does not come for free: it has to be stored explicitly and loaded back at the start of each conversation.

Implementing Memory

Here’s a simple memory store:

from typing import Any

class MemoryStore:
    def __init__(self):
        self.memory = {}
    
    def get(self, user_id: str) -> dict:
        """Get user's memory."""
        return self.memory.get(user_id, {})
    
    def set(self, user_id: str, key: str, value: Any):
        """Set a memory value."""
        if user_id not in self.memory:
            self.memory[user_id] = {}
        self.memory[user_id][key] = value
    
    def update(self, user_id: str, updates: dict):
        """Update multiple memory values."""
        if user_id not in self.memory:
            self.memory[user_id] = {}
        self.memory[user_id].update(updates)
    
    def clear(self, user_id: str):
        """Clear user's memory."""
        if user_id in self.memory:
            del self.memory[user_id]

# Global memory store
memory_store = MemoryStore()
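
Usage is straightforward:

memory_store.set("user-123", "preferred_language", "en")
memory_store.update("user-123", {"last_known_plan": "Pro", "preferred_contact": "email"})

print(memory_store.get("user-123"))
# {'preferred_language': 'en', 'last_known_plan': 'Pro', 'preferred_contact': 'email'}

An in-memory dict disappears on restart; in production you would back this with a database or key-value store, but the interface can stay the same.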

Using Memory in the Agent

Load memory at the start of a conversation:

import json
from typing import Optional

def run_agent_conversation(user_message: str, user_id: Optional[str] = None):
    # Load memory
    user_memory = memory_store.get(user_id) if user_id else {}
    
    # Add memory context to system prompt
    memory_context = ""
    if user_memory:
        memory_context = f"\n\nUser preferences:\n{json.dumps(user_memory, indent=2)}"
    
    system_prompt = SYSTEM_PROMPT + memory_context
    
    # ... rest of agent loop
    
    # Save memory during the conversation (tool_result comes from the agent loop above)
    if user_id and "plan" in tool_result:
        memory_store.set(user_id, "last_known_plan", tool_result["plan"])

Memory Inspector

Here’s what memory looks like for a user:

User ID: user-123
Key                  Value    Updated
preferred_language   en       2025-11-24 10:00:00
last_known_plan      Pro      2025-11-24 10:15:00
preferred_contact    email    2025-11-24 09:30:00
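
Note the Updated column: the simple MemoryStore above does not record timestamps, so an inspector like this assumes a store that saves one per key. A hypothetical variant (note that get() then returns {"value", "updated"} records rather than bare values):

from datetime import datetime, timezone
from typing import Any

class TimestampedMemoryStore(MemoryStore):
    """MemoryStore variant that records when each key was last written."""

    def set(self, user_id: str, key: str, value: Any):
        if user_id not in self.memory:
            self.memory[user_id] = {}
        # Each entry becomes {"value": ..., "updated": ...}
        self.memory[user_id][key] = {
            "value": value,
            "updated": datetime.now(timezone.utc).isoformat(),
        }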

Guardrails

Guardrails are rules that keep the agent safe. Some live in the system prompt; others are code checks that run before the agent ever processes a message.

Hard Rules in System Prompt

Some rules go in the system prompt:

SYSTEM_PROMPT = """...
Do NOT:
- Answer legal questions
- Process refunds
- Change billing details
- Delete accounts
- Make promises you can't keep

Always escalate if:
- User mentions refund, chargeback, or billing dispute
- User wants to delete account
- User asks legal questions
- You're uncertain how to help
..."""

Code-Level Checks

Other checks happen in code before calling the LLM:

from typing import Optional

def check_guardrails(message: str) -> Optional[dict]:
    """Check if message should be blocked or escalated."""
    
    # Sensitive keywords
    risky_keywords = [
        "refund", "chargeback", "cancel subscription",
        "delete account", "remove data"
    ]
    
    message_lower = message.lower()
    
    for keyword in risky_keywords:
        if keyword in message_lower:
            return {
                "action": "escalate",
                "reason": f"Detected sensitive keyword: {keyword}"
            }
    
    # Tone detection (simple version)
    angry_words = ["angry", "furious", "terrible", "awful", "hate"]
    if any(word in message_lower for word in angry_words):
        return {
            "action": "escalate",
            "reason": "Detected negative tone"
        }
    
    # No issues
    return None

Enhanced Agent with Guardrails

Here’s the agent with guardrails:

def run_agent_conversation(user_message: str, user_id: Optional[str] = None):
    # Check guardrails first
    guardrail_check = check_guardrails(user_message)
    if guardrail_check:
        log_escalation(user_id or "unknown", user_message, guardrail_check["reason"])
        return {
            "type": "escalated",
            "reply": "I'm connecting you with our support team. They'll be able to help you better.",
            "reason": guardrail_check["reason"]
        }
    
    # Load memory
    user_memory = memory_store.get(user_id) if user_id else {}
    
    # Continue with normal agent loop...
    # ...
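
Both snippets assume a log_escalation helper that records the event and opens a ticket. It is not defined in this guide; a minimal sketch that matches how it is called:

import uuid
from datetime import datetime, timezone

escalation_log = []

def log_escalation(user_id: str, message: str, reason: str) -> dict:
    """Record an escalation and return a ticket record."""
    ticket = {
        "ticket_id": f"TICKET-{uuid.uuid4().hex[:8].upper()}",
        "user_id": user_id,
        "message": message,
        "reason": reason,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    escalation_log.append(ticket)
    return ticket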

Escalation Response

When escalating, return a structured response:

{
    "type": "escalation",
    "reason": "billing_dispute",
    "suggested_reply": "I understand you'd like to discuss billing. Let me connect you with our billing team who can help with that.",
    "ticket_id": "TICKET-12345",
    "priority": "high"
}

Testing Guardrails

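Since check_guardrails is plain Python, you can exercise it directly with a few sample messages:

test_messages = [
    "What plan am I on?",               # should pass
    "I want a refund right now",        # keyword -> escalate
    "This service is awful, I hate it", # tone -> escalate
]

for msg in test_messages:
    result = check_guardrails(msg)
    print(f"{msg!r} -> {result or 'OK'}")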

Complete Example

Here’s the full agent with memory and guardrails:

def run_agent_conversation(
    user_message: str,
    user_id: Optional[str] = None,
    max_steps: int = 5
) -> dict:
    """Run agent with memory and guardrails."""
    
    # 1. Check guardrails
    guardrail_check = check_guardrails(user_message)
    if guardrail_check:
        ticket = log_escalation(
            user_id or "unknown",
            user_message,
            guardrail_check["reason"]
        )
        return {
            "type": "escalated",
            "reply": "I'm connecting you with our support team...",
            "reason": guardrail_check["reason"],
            "ticket_id": ticket["ticket_id"]
        }
    
    # 2. Load memory
    user_memory = memory_store.get(user_id) if user_id else {}
    memory_context = format_memory_context(user_memory)
    
    # 3. Initialize messages
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT + memory_context}
    ]
    
    if user_id:
        messages.append({
            "role": "system",
            "content": f"User ID: {user_id}"
        })
    
    messages.append({
        "role": "user",
        "content": user_message
    })
    
    # 4. Run agent loop (from previous page)
    trace = []
    step = 0
    
    while step < max_steps:
        step += 1
        
        # Call LLM, handle tool calls, etc.
        # ... (same as before)
        
        # 5. Update memory if needed (tool_result comes from the tool-call handling above)
        if user_id and tool_result:
            update_memory_from_result(user_id, tool_result)
    
    # 6. Return result
    return {
        "type": "answered",
        "reply": final_answer,
        "trace": trace,
        "memory_updated": True
    }
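
With the loop internals filled in from the previous page, a call might look like this:

result = run_agent_conversation("What plan am I on?", user_id="user-123")

if result["type"] == "escalated":
    print("Escalated:", result["reason"])
else:
    print("Reply:", result["reply"])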

Memory Update Helper

def update_memory_from_result(user_id: str, tool_result: dict):
    """Update memory based on tool results."""
    if "plan" in tool_result:
        memory_store.set(user_id, "last_known_plan", tool_result["plan"])
    
    if "status" in tool_result:
        memory_store.set(user_id, "last_account_status", tool_result["status"])

Key Takeaways

  1. Short-term memory = message history (automatic)
  2. Long-term memory = user preferences (needs storage)
  3. Guardrails = rules that prevent dangerous behavior
  4. Check early = validate before processing
  5. Escalate clearly = structured escalation responses

What’s Next?

The agent is working, but it’s just code. The next page shows you how to build a UI so users can actually interact with it.