Dec 19, 2025

Intermediate 25 min

What You’ve Built

You’ve created a complete AI agent with:

✅ Agent loop - Receives, decides, acts, repeats
✅ Tool calling - Three tools for FAQ, subscription, escalation
✅ Memory - Short-term and long-term storage
✅ Guardrails - Safety checks and escalation
✅ Debug UI - Chat, trace viewer, memory inspector

This is a solid foundation. Now let’s explore how to extend it.

Suggested Extensions

1. Add More Tools

Create Ticket Tool:

import uuid

def create_support_ticket(user_id: str, subject: str, description: str) -> dict:
    """Create a support ticket."""
    # uuid4 stands in for the undefined generate_id() helper
    ticket_id = f"TICKET-{uuid.uuid4().hex[:8].upper()}"
    # Save to database
    return {"ticket_id": ticket_id, "status": "created"}

Update Subscription Tool:

def update_subscription_plan(user_id: str, new_plan: str) -> dict:
    """Update user's subscription plan."""
    # Validate plan exists
    # Update in database
    # Send confirmation email
    return {"success": True, "new_plan": new_plan}

2. Add Rate Limiting

Prevent abuse:

from collections import defaultdict
from datetime import datetime, timedelta

rate_limits = defaultdict(list)

def check_rate_limit(user_id: str, max_requests: int = 10, window: int = 60) -> bool:
    """Check if user has exceeded rate limit."""
    now = datetime.now()
    user_requests = rate_limits[user_id]
    
    # Remove old requests
    user_requests[:] = [
        req_time for req_time in user_requests
        if now - req_time < timedelta(seconds=window)
    ]
    
    if len(user_requests) >= max_requests:
        return False
    
    user_requests.append(now)
    return True
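
The limiter above reads the wall clock directly, which makes it awkward to test. One possible variation, sketched here under the assumption that a class-based helper is acceptable, takes the clock as a parameter and uses `time.monotonic` by default so that system clock adjustments do not affect the window. `SlidingWindowLimiter` and its method names are hypothetical, not part of the agent built earlier.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Sliding-window rate limiter with an injectable clock (hypothetical helper)."""

    def __init__(self, max_requests: int = 10, window: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.max_requests = max_requests
        self.window = window
        self.clock = clock
        self._requests = defaultdict(deque)  # user_id -> request timestamps

    def allow(self, user_id: str) -> bool:
        now = self.clock()
        timestamps = self._requests[user_id]
        # Drop timestamps that have aged out of the window
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()
        if len(timestamps) >= self.max_requests:
            return False
        timestamps.append(now)
        return True


# A fake clock makes the behaviour deterministic
fake_now = [0.0]
limiter = SlidingWindowLimiter(max_requests=2, window=60, clock=lambda: fake_now[0])
first = limiter.allow("user-1")   # allowed
second = limiter.allow("user-1")  # allowed
third = limiter.allow("user-1")   # rejected: limit reached
fake_now[0] = 61.0
fourth = limiter.allow("user-1")  # allowed again: earlier requests expired
```

Like the original, this keeps state in process memory, so it resets on restart and is not shared between workers.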

3. Connect to Real Database

Replace mock data:

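import os

import psycopg2

# Assumes the connection string is provided via a DATABASE_URL environment variable
DATABASE_URL = os.environ["DATABASE_URL"]

def get_subscription_status(user_id: str) -> dict:
    """Get subscription from real database."""
    conn = psycopg2.connect(DATABASE_URL)
    try:
        with conn.cursor() as cursor:
            cursor.execute(
                "SELECT plan, status, expires FROM subscriptions WHERE user_id = %s",
                (user_id,)
            )
            row = cursor.fetchone()
    finally:
        # Always release the connection, even if the query fails
        conn.close()

    if not row:
        return {"error": "User not found"}

    plan, status, expires = row
    return {
        "plan": plan,
        "status": status,
        # expires may be NULL for plans without an end date
        "expires": expires.isoformat() if expires else None
    }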

4. Add Semantic Search for FAQs

Use vector search instead of keyword matching:

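from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def get_faq_answer(question: str) -> dict:
    """Search FAQ using semantic search."""
    # Embed the question (example model; use the one your FAQ index was built with)
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=question
    )
    question_embedding = response.data[0].embedding

    # Search vector database (vector_db is a placeholder for your vector store client)
    results = vector_db.similarity_search(
        question_embedding,
        top_k=3
    )

    if not results:
        return {"error": "No matching FAQ found"}

    # Return best match
    return {
        "answer": results[0].content,
        "source": results[0].metadata["source"]
    }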

5. Add Conversation History

Store and retrieve past conversations:

def get_conversation_history(user_id: str, limit: int = 10) -> list:
    """Get recent conversation history."""
    # Query database (db is a placeholder for your database client)
    conversations = db.query(
        "SELECT * FROM conversations WHERE user_id = %s ORDER BY created_at DESC LIMIT %s",
        (user_id, limit)
    )
    return conversations

Exercises

Try these to deepen your understanding:

Exercise 1: Add a Tool

Create a new tool get_account_settings that returns user account preferences. Integrate it into the agent.

Exercise 2: Improve Guardrails

Add tone detection using sentiment analysis. Escalate if sentiment is very negative.

Exercise 3: Multi-Turn Conversations

Handle follow-up questions like “What about my billing?” when the previous message was about subscription.

Exercise 4: Error Recovery

If a tool call fails, have the agent try an alternative or ask the user for clarification.
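
As a possible starting point for this exercise, the sketch below tries a primary tool, then an alternative, and finally returns a request for clarification. It assumes tools are zero-argument callables that return a dict and signal failure either by raising or by including an `"error"` key. `call_with_fallback` and the two lookup functions are hypothetical names.

```python
from typing import Callable


def call_with_fallback(primary: Callable[[], dict],
                       fallback: Callable[[], dict]) -> dict:
    """Try the primary tool, then the fallback, then ask the user to clarify."""
    last_error = ""
    for tool in (primary, fallback):
        try:
            result = tool()
        except Exception as exc:  # broad on purpose: any tool failure triggers recovery
            last_error = str(exc)
            continue
        if "error" not in result:
            return result
        last_error = result["error"]
    return {
        "needs_clarification": True,
        "message": "I couldn't look that up. Could you rephrase or add more detail?",
        "last_error": last_error,
    }


def failing_lookup() -> dict:
    raise TimeoutError("database timed out")


def cached_lookup() -> dict:
    return {"plan": "pro", "status": "active"}


recovered = call_with_fallback(failing_lookup, cached_lookup)
gave_up = call_with_fallback(failing_lookup, failing_lookup)
```

Here `recovered` holds the result of the alternative tool, while `gave_up` carries the clarification request along with the last error for the trace viewer.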

Exercise 5: Confidence Scoring

Add confidence scores to answers. Escalate if confidence is too low.
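
One way to begin is a simple threshold check, sketched below. It assumes a confidence value between 0 and 1 is already available (for example, from a retrieval similarity score); how that score is produced is left to the exercise. `route_answer` and the 0.7 threshold are illustrative assumptions.

```python
CONFIDENCE_THRESHOLD = 0.7  # assumed cut-off; tune it against real conversations


def route_answer(answer: str, confidence: float,
                 threshold: float = CONFIDENCE_THRESHOLD) -> dict:
    """Respond when confidence is high enough, otherwise escalate."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence < threshold:
        return {
            "action": "escalate",
            "reason": f"confidence {confidence:.2f} below {threshold:.2f}",
        }
    return {"action": "respond", "answer": answer, "confidence": confidence}


confident = route_answer("You can cancel from the billing page.", 0.92)
unsure = route_answer("Possibly related to your plan.", 0.40)
```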

Reflection Questions

Think about these as you build agents:

1. What Boundaries Should This Agent Have?

In a real company, what should the agent be allowed to do? What should require human approval?

Consider:

Financial transactions
Account modifications
Data access
Legal advice

2. What Tasks Are Better Left to Humans?

Not everything should be automated.

Examples:

Complex billing disputes
Emotional support
Creative problem-solving
Relationship building

3. How Do You Handle Edge Cases?

What happens when:

User asks something completely unrelated?
Tool returns an error?
Agent gets stuck in a loop?
User is abusive or threatening?
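
For the stuck-in-a-loop case, one option alongside `max_steps` is to detect when the agent keeps issuing the same tool call with the same arguments. The sketch below assumes each recorded call is a dict with `"name"` and `"args"` keys; that trace shape and the `is_stuck` helper are assumptions, not the format used earlier in the tutorial.

```python
import json
from collections import Counter


def is_stuck(tool_calls: list[dict], max_repeats: int = 3) -> bool:
    """Return True if any identical tool call (name + arguments) occurs max_repeats times."""
    counts = Counter(
        # sort_keys makes argument order irrelevant when comparing calls
        (call["name"], json.dumps(call["args"], sort_keys=True))
        for call in tool_calls
    )
    return any(count >= max_repeats for count in counts.values())


trace = [
    {"name": "get_subscription_status", "args": {"user_id": "u1"}},
    {"name": "get_faq_answer", "args": {"question": "refund policy"}},
    {"name": "get_subscription_status", "args": {"user_id": "u1"}},
    {"name": "get_subscription_status", "args": {"user_id": "u1"}},
]
stuck = is_stuck(trace)              # same call seen three times
not_stuck = is_stuck(trace[:3])      # same call seen only twice
```

When the check fires, the agent could stop calling tools and escalate or ask the user a clarifying question instead.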

4. How Do You Measure Success?

Metrics to track:

Resolution rate (answered vs escalated)
User satisfaction
Average response time
Cost per conversation
Escalation rate
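
Several of these metrics can be computed directly from conversation logs. The sketch below assumes each logged conversation records whether it was escalated, its response time, and its API cost; the `ConversationRecord` fields and `summarize` function are hypothetical. User satisfaction is omitted because it needs survey data.

```python
from dataclasses import dataclass


@dataclass
class ConversationRecord:
    escalated: bool
    response_seconds: float
    cost_usd: float


def summarize(records: list[ConversationRecord]) -> dict:
    """Aggregate per-conversation logs into the metrics listed above."""
    total = len(records)
    if total == 0:
        return {"conversations": 0}
    escalated = sum(r.escalated for r in records)
    return {
        "conversations": total,
        "escalation_rate": escalated / total,
        "resolution_rate": (total - escalated) / total,
        "avg_response_seconds": sum(r.response_seconds for r in records) / total,
        "cost_per_conversation": sum(r.cost_usd for r in records) / total,
    }


records = [
    ConversationRecord(escalated=False, response_seconds=1.0, cost_usd=0.01),
    ConversationRecord(escalated=False, response_seconds=2.0, cost_usd=0.02),
    ConversationRecord(escalated=False, response_seconds=3.0, cost_usd=0.03),
    ConversationRecord(escalated=True, response_seconds=2.0, cost_usd=0.02),
]
summary = summarize(records)
```

Note that with this definition the resolution rate and escalation rate always sum to 1, so in practice only one of them needs a dashboard.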

5. How Do You Improve Over Time?

Ways to improve:

Log all conversations
Review escalations
Update knowledge base
Refine guardrails
A/B test prompts

Real-World Considerations

Security

Input validation - Sanitize all user input
Authentication - Verify user identity
Authorization - Check permissions before tool calls
Rate limiting - Prevent abuse
Data privacy - Don’t log sensitive information
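
Two of these items, authorization and data privacy, can be sketched in a few lines. The example below assumes a simple role hierarchy and a per-tool minimum role, and redacts email addresses and card-like digit runs before logging. The role names, the `TOOL_PERMISSIONS` mapping, and the regular expressions are illustrative assumptions and would need hardening for production use.

```python
import re

# Minimum role required per tool (hypothetical policy)
TOOL_PERMISSIONS = {
    "get_faq_answer": "public",
    "get_subscription_status": "authenticated",
    "update_subscription_plan": "account_owner",
}
ROLE_RANK = {"public": 0, "authenticated": 1, "account_owner": 2}

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# 13 to 16 digits, optionally separated by single spaces or hyphens
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def authorize_tool_call(tool_name: str, user_role: str) -> bool:
    """Allow a tool call only if the user's role meets the tool's minimum role."""
    required = TOOL_PERMISSIONS.get(tool_name)
    if required is None or user_role not in ROLE_RANK:
        return False  # deny unknown tools and unknown roles by default
    return ROLE_RANK[user_role] >= ROLE_RANK[required]


def redact_for_logging(text: str) -> str:
    """Mask card-like numbers and email addresses before writing to logs."""
    text = CARD_RE.sub("[REDACTED_CARD]", text)
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)


can_read_faq = authorize_tool_call("get_faq_answer", "public")
can_change_plan = authorize_tool_call("update_subscription_plan", "authenticated")
safe_log = redact_for_logging("Card 4111 1111 1111 1111, email a@b.com")
```

Denying by default means a newly added tool stays unusable until someone assigns it a permission level, which is usually the safer failure mode.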

Reliability

Error handling - Graceful failures
Timeouts - Don’t wait forever
Retries - Handle transient failures
Monitoring - Track errors and performance
Fallbacks - What if LLM is down?
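
Retries and fallbacks often go together. The sketch below retries an operation with exponential backoff on transient errors and returns a fallback value once attempts run out. It assumes the operation enforces its own timeout and raises `TimeoutError` or `ConnectionError` on transient failure; `call_with_retries` and its parameters are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(operation: Callable[[], T], fallback: T,
                      attempts: int = 3, base_delay: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry transient failures with exponential backoff, then return the fallback."""
    for attempt in range(attempts):
        try:
            return operation()
        except (TimeoutError, ConnectionError):
            # No sleep after the final attempt
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return fallback


# Simulate an LLM call that fails twice, then succeeds
failures_left = [2]
delays = []


def flaky_llm_call() -> str:
    if failures_left[0] > 0:
        failures_left[0] -= 1
        raise TimeoutError("LLM request timed out")
    return "ok"


def always_down() -> str:
    raise ConnectionError("LLM unavailable")


result = call_with_retries(flaky_llm_call, fallback="unavailable", sleep=delays.append)
degraded = call_with_retries(always_down, fallback="Please try again later.",
                             sleep=lambda seconds: None)
```

Passing `sleep` as a parameter keeps the example fast to run; in real use the default `time.sleep` applies the delays.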

Cost Management

Token limits - Control max tokens per request
Caching - Cache common responses
Rate limiting - Prevent excessive API calls
Monitoring - Track API costs
Optimization - Reduce unnecessary tool calls
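
Caching is often the cheapest of these to try first. The sketch below caches answers keyed on a normalized question with a time-to-live, so repeated questions skip the LLM call. `ResponseCache`, `answer_with_cache`, and `expensive_answer` are hypothetical names, and `expensive_answer` merely stands in for a real LLM request.

```python
import time
from typing import Callable, Optional


class ResponseCache:
    """In-memory answer cache with a time-to-live (hypothetical helper)."""

    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # normalized question -> (stored_at, answer)

    @staticmethod
    def _key(question: str) -> str:
        # Lowercase and collapse whitespace so trivial variations share an entry
        return " ".join(question.lower().split())

    def get(self, question: str) -> Optional[str]:
        entry = self._entries.get(self._key(question))
        if entry is None:
            return None
        stored_at, answer = entry
        if self.clock() - stored_at >= self.ttl_seconds:
            del self._entries[self._key(question)]  # expired
            return None
        return answer

    def put(self, question: str, answer: str) -> None:
        self._entries[self._key(question)] = (self.clock(), answer)


calls = {"count": 0}


def expensive_answer(question: str) -> str:
    calls["count"] += 1  # stands in for a paid LLM request
    return f"Answer to: {question.strip()}"


def answer_with_cache(cache: ResponseCache, question: str) -> str:
    cached = cache.get(question)
    if cached is not None:
        return cached
    answer = expensive_answer(question)
    cache.put(question, answer)
    return answer


cache = ResponseCache(ttl_seconds=3600)
a1 = answer_with_cache(cache, "How do I cancel?")
a2 = answer_with_cache(cache, "  how do I CANCEL? ")  # served from cache
```

This only suits answers that do not depend on the user, such as FAQ responses; per-user data like subscription status should not share a cache entry across users.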

Knowledge Check

Test your understanding:

Question 1: What is the main difference between a one-shot LLM call and an agent?

A. Agents are faster
B. Agents use a loop with tools, memory, and guardrails (Correct)
C. Agents are cheaper
D. There's no difference

Explanation: Agents use a loop that can call tools, maintain memory, and enforce guardrails, making them more capable than single API calls.

Question 2: Why is it important to keep agents narrow and focused?

A. They're easier to code
B. Narrow agents are safer, more predictable, and easier to debug (Correct)
C. They use less memory
D. They're required by law

Explanation: Narrow agents have clear boundaries, predictable behavior, and are easier to debug and maintain.

Question 3: What should an agent do when it encounters a billing dispute?

A. Try to resolve it automatically
B. Escalate to human support immediately (Correct)
C. Ignore it
D. Ask the user for more details

Explanation: Billing disputes are sensitive and should be escalated to human support immediately, not handled automatically.

Question 4: What are the four pillars of an AI agent?

A. Speed, accuracy, cost, reliability
B. Loop, tools, memory, guardrails (Correct)
C. Input, output, processing, storage
D. API, database, cache, queue

Explanation: The four pillars are: loop (the cycle), tools (functions), memory (context), and guardrails (safety rules).

Question 5: What is the purpose of the max_steps parameter?

A. To limit API costs
B. To prevent infinite loops and ensure the agent eventually responds (Correct)
C. To speed up responses
D. To improve accuracy