Why Debugging Matters
Agents are complex. They make decisions, call tools, and loop. When something goes wrong, you need to understand:
- What did the agent decide to do?
- Which tools did it call?
- What results did it get?
- Why did it make that choice?
Good debugging tools make these questions easy to answer.
Debugging Technique 1: Print Tool Calls
The simplest approach: print everything.
🐍 Python Agent with Logging
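Here's a minimal sketch of what that can look like, assuming the OpenAI Chat Completions client used on earlier pages plus a `tools` schema list and a `tool_map` of tool name to callable (both stand-ins here):
```python
# Minimal sketch of a logged agent loop. Assumes the OpenAI Python SDK,
# a `tools` schema list, and a `tool_map` of tool name -> callable
# (hypothetical names carried over from earlier pages).
import json
from openai import OpenAI

client = OpenAI()
MAX_ITERS = 10

def run_agent(task: str, input_list: list, tools: list, tool_map: dict) -> None:
    print("=" * 60)
    print(f"AGENT START: {task}")
    print("=" * 60)
    input_list.append({"role": "user", "content": task})

    for iteration in range(1, MAX_ITERS + 1):
        print("=" * 60)
        print(f"ITERATION {iteration}")
        print("=" * 60)

        response = client.chat.completions.create(
            model="gpt-4o-mini",  # any tool-calling model works here
            messages=input_list,
            tools=tools,
        )
        message = response.choices[0].message
        input_list.append(message)

        if message.content:
            print("📝 Agent says:")
            print(message.content)

        if not message.tool_calls:
            break  # no tool calls left: the agent considers itself done

        print(f"🔧 Tool calls: {len(message.tool_calls)}")
        for i, call in enumerate(message.tool_calls, 1):
            args = json.loads(call.function.arguments)
            print(f"  {i}. {call.function.name}({args})")
            result = tool_map[call.function.name](**args)
            print(f"     ↳ Result: {result}")
            input_list.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": json.dumps(result),
            })
```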
Output example:
============================================================
AGENT START: Create a weekly digest from ./notes
============================================================
============================================================
ITERATION 1
============================================================
📝 Agent says:
I'll create a plan to generate your weekly digest...
🔧 Tool calls: 1
1. list_files({'directory': './notes'})
↳ Result: {'files': ['2026-01-20.md', '2026-01-21.md', '2026-01-22.md'], 'count': 3}
============================================================
ITERATION 2
============================================================
...
Debugging Technique 2: JSONL Logging
Save the full conversation to a file for later analysis:
🐍 Python JSONL Conversation Logger
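Here's a minimal logger sketch; the file name and event fields are illustrative choices, not a fixed format:
```python
# Minimal JSONL logger sketch: one JSON object per line, appended as the
# run progresses. File name and field names are illustrative assumptions.
import json
import time
from pathlib import Path

class JSONLLogger:
    def __init__(self, path: str = "agent_run.jsonl"):
        self.path = Path(path)

    def log(self, event_type: str, **data) -> None:
        record = {"ts": time.time(), "type": event_type, **data}
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Example call sites inside the agent loop:
logger = JSONLLogger()
logger.log("agent_start", task="Create a weekly digest from ./notes")
logger.log("assistant", content="I'll create a plan to generate your weekly digest...")
logger.log("tool_call", name="list_files", args={"directory": "./notes"})
logger.log("tool_result", name="list_files", result={"count": 3})
```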
Why JSONL?
- Each line is a complete JSON object
- Easy to parse and analyze
- Can stream large logs without loading everything into memory (see the sketch after this list)
- Standard format for ML training data
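Because each line stands alone, you can process a large log record by record; a quick sketch, assuming the `agent_run.jsonl` file written above:
```python
# Stream a JSONL log line by line; only the current record is in memory.
import json

with open("agent_run.jsonl", encoding="utf-8") as f:
    for line in f:
        event = json.loads(line)
        if event.get("type") == "tool_call":
            print(event["name"], event.get("args"))
```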
Debugging Technique 3: Replay Runs
Load a logged conversation and replay it:
🐍 Python Replay Agent Runs
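A minimal replay sketch, assuming the event shapes written by the logger above:
```python
# Replay a logged run step by step without calling the model again.
# Assumes the JSONL event shapes from the logger sketch above.
import json
from pathlib import Path

def replay(path: str = "agent_run.jsonl") -> None:
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        event = json.loads(line)
        kind = event.get("type")
        if kind == "agent_start":
            print(f"▶️ START: {event.get('task')}")
        elif kind == "assistant":
            print(f"📝 {event.get('content')}")
        elif kind == "tool_call":
            print(f"🔧 {event.get('name')}({event.get('args')})")
        elif kind == "tool_result":
            print(f"   ↳ {event.get('result')}")
        else:
            print(f"ℹ️ {kind}: {event}")

if __name__ == "__main__":
    replay()
```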
Debugging Technique 4: Conversation Snapshots
Save the full conversation state at each iteration:
🐍 Python Conversation Snapshots
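A minimal snapshot sketch; the directory layout and naming scheme are assumptions:
```python
# Dump the full conversation so far to snapshots/iter_NNN.json after each
# iteration. Directory and naming scheme are illustrative assumptions.
import json
from pathlib import Path

SNAPSHOT_DIR = Path("snapshots")
SNAPSHOT_DIR.mkdir(exist_ok=True)

def save_snapshot(iteration: int, input_list: list) -> Path:
    path = SNAPSHOT_DIR / f"iter_{iteration:03d}.json"
    serializable = [
        m if isinstance(m, dict) else m.model_dump()  # SDK messages are pydantic models
        for m in input_list
    ]
    path.write_text(json.dumps(serializable, indent=2), encoding="utf-8")
    return path

# Inside the loop, after each model response:
# save_snapshot(iteration, input_list)
```
Diffing two consecutive snapshots shows exactly what the agent added in that iteration.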
Common Issues and Solutions
**Symptom:** Agent keeps calling the same tool repeatedly
**Causes:**
- Tool returns an error, agent retries indefinitely
- Tool result doesn't contain expected information
- Agent doesn't understand when it's done
**Solutions:**
1. Check tool implementations - make sure they return useful results
2. Add better error messages in tool outputs
3. Lower MAX_ITERS to catch loops faster
4. Add explicit "done" conditions in instructions
**Debug:**
```python
# Add this to see if the agent is calling the same tool two turns in a row.
# Assumes input_list[-2] is the previous assistant turn; adjust the index
# to match your loop's bookkeeping.
if iteration > 0 and message.tool_calls:
    prev = input_list[-2]
    prev_calls = prev.get("tool_calls") if isinstance(prev, dict) else getattr(prev, "tool_calls", None)
    if prev_calls and prev_calls[0].function.name == message.tool_calls[0].function.name:
        print("⚠️ WARNING: Agent calling same tool again")
```
**Symptom:** Tools raise exceptions or return errors
**Causes:**
- Invalid arguments from model
- Missing files or permissions
- Type mismatches
**Solutions:**
1. Validate tool arguments before execution
2. Add try/except in tool implementations
3. Return structured errors: `{"error": "..."}`
4. Improve tool descriptions so model sends correct args
**Debug:**
```python
def dispatch_tool(name: str, args: dict) -> dict:
    print(f"🔍 Dispatching: {name}")
    print(f"   Args: {args}")
    print(f"   Types: {[(k, type(v).__name__) for k, v in args.items()]}")
    try:
        result = tool_map[name](**args)
        print(f"   ✅ Success: {result}")
        return result
    except Exception as e:
        print(f"   ❌ Error: {e}")
        return {"error": str(e)}
```
**Symptom:** Agent just talks about what it would do instead of actually calling tools
**Causes:**
- Tool descriptions are unclear
- Model doesn't understand when to use tools
- Instructions don't encourage tool use
**Solutions:**
1. Improve tool descriptions - be specific about when to use each tool
2. Add examples in tool descriptions
3. Update instructions: "Use tools for all file operations"
4. Try a different model (some are better at tool calling)
**Debug:**
```python
# Check if tools are being sent to the model
print(f"\n🔧 Available tools: {[t['name'] for t in tools]}")

# After the model response
if not message.tool_calls:
    print("⚠️ No tool calls - agent chose not to use tools")
    print(f"   Response: {message.content[:200]}")
```
**Symptom:** Agent doesn't handle denials well
**Causes:**
- Agent doesn't see the denial result
- Agent retries the same operation
- No guidance on what to do when denied
**Solutions:**
1. Return a clear denial message: `{"status": "denied_by_user", "reason": "..."}` (see the dispatch sketch below)
2. Update instructions to handle denials gracefully
3. Log denials for analysis
**Debug:**
```python
def approve(tool_name: str, args: dict) -> bool:
    print(f"\n⚠️ Approval needed: {tool_name}")
    ans = input("Approve? [y/N] ").strip().lower()
    if ans != "y":
        print("❌ Denied - agent will see this in tool result")
        return False
    print("✅ Approved")
    return True
```
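To make solution 1 concrete, you can put the approval check inside dispatch so a denial comes back to the agent as a normal tool result; a sketch, assuming the `approve` and `tool_map` helpers above:
```python
# Sketch: wire approval into dispatch so the agent sees the denial as an
# ordinary tool result (assumes the approve() and tool_map helpers above).
def dispatch_with_approval(name: str, args: dict) -> dict:
    if not approve(name, args):
        return {"status": "denied_by_user",
                "reason": "User rejected this tool call; try another approach."}
    return tool_map[name](**args)
```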
Testing Your Agent
Create a test suite to validate agent behavior:
🐍 Python Agent Test Suite
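Here's a minimal pytest sketch; the imported names are hypothetical stand-ins for the tools and logger built earlier:
```python
# Minimal pytest sketch. `list_files`, `dispatch_tool`, and `JSONLLogger`
# are hypothetical imports standing in for the components built earlier.
import json
from agent import list_files, dispatch_tool, JSONLLogger  # hypothetical module

def test_list_files_returns_expected_shape(tmp_path):
    (tmp_path / "2026-01-20.md").write_text("note")
    result = list_files(directory=str(tmp_path))
    assert result["count"] == len(result["files"])

def test_dispatch_tool_returns_structured_error():
    result = dispatch_tool("list_files", {"directory": "./does-not-exist"})
    assert "error" in result or "files" in result  # never an uncaught exception

def test_jsonl_log_lines_are_valid_json(tmp_path):
    logger = JSONLLogger(path=str(tmp_path / "run.jsonl"))
    logger.log("tool_call", name="list_files", args={"directory": "./notes"})
    for line in (tmp_path / "run.jsonl").read_text().splitlines():
        json.loads(line)  # raises if any line is not standalone JSON
```
Run it with `pytest -q`; each test exercises one component in isolation, so a failure points at the layer that broke.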
Debugging Checklist
When your agent misbehaves, check:
- Tool implementations - Do they return the right format?
- Tool schemas - Are descriptions clear?
- Instructions - Do they guide the agent properly?
- MAX_ITERS - Is it too high or too low?
- Budget - Is the agent hitting limits?
- Logs - What does the conversation history show?
- Dry-run - Does it work in simulation?
Key Takeaways
You now know how to:
- ✅ Print tool calls for quick debugging
- ✅ Log to JSONL for detailed analysis
- ✅ Replay runs to understand past behavior
- ✅ Save snapshots of conversation state
- ✅ Troubleshoot common agent issues
- ✅ Test agent components
What’s Next?
In the final page, we’ll wrap up with exercises and next steps: how to extend your agent, scale to production, and explore advanced patterns.