Why Debugging Matters
Agents are complex. They make decisions, call tools, and loop. When something goes wrong, you need to understand:
- What did the agent decide to do?
- Which tools did it call?
- What results did it get?
- Why did it make that choice?
Good debugging tools make these questions easy to answer.
Debugging Technique 1: Print Tool Calls
The simplest approach: print everything.
🐍 Python Agent with Logging
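Here's a minimal sketch of what that can look like, assuming the OpenAI Chat Completions client used on earlier pages plus a `tools` schema list and a `tool_map` of tool name to callable (both stand-ins here):
```python
# Minimal sketch of a logged agent loop. Assumes the OpenAI Python SDK,
# a `tools` schema list, and a `tool_map` of tool name -> callable
# (hypothetical names carried over from earlier pages).
import json
from openai import OpenAI

client = OpenAI()
MAX_ITERS = 10

def run_agent(task: str, input_list: list, tools: list, tool_map: dict) -> None:
    print("=" * 60)
    print(f"AGENT START: {task}")
    print("=" * 60)
    input_list.append({"role": "user", "content": task})

    for iteration in range(1, MAX_ITERS + 1):
        print("=" * 60)
        print(f"ITERATION {iteration}")
        print("=" * 60)

        response = client.chat.completions.create(
            model="gpt-4o-mini",  # any tool-calling model works here
            messages=input_list,
            tools=tools,
        )
        message = response.choices[0].message
        input_list.append(message)

        if message.content:
            print("📝 Agent says:")
            print(message.content)

        if not message.tool_calls:
            break  # no tool calls left: the agent considers itself done

        print(f"🔧 Tool calls: {len(message.tool_calls)}")
        for i, call in enumerate(message.tool_calls, 1):
            args = json.loads(call.function.arguments)
            print(f"  {i}. {call.function.name}({args})")
            result = tool_map[call.function.name](**args)
            print(f"     ↳ Result: {result}")
            input_list.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": json.dumps(result),
            })
```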
Output example:
============================================================
AGENT START: Create a weekly digest from ./notes
============================================================
============================================================
ITERATION 1
============================================================
📝 Agent says:
I'll create a plan to generate your weekly digest...
🔧 Tool calls: 1
1. list_files({'directory': './notes'})
↳ Result: {'files': ['2026-01-20.md', '2026-01-21.md', '2026-01-22.md'], 'count': 3}
============================================================
ITERATION 2
============================================================
...
Debugging Technique 2: JSONL Logging
Save the full conversation to a file for later analysis:
🐍 Python JSONL Conversation Logger
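Here's a minimal logger sketch; the file name and event fields are illustrative choices, not a fixed format:
```python
# Minimal JSONL logger sketch: one JSON object per line, appended as the
# run progresses. File name and field names are illustrative assumptions.
import json
import time
from pathlib import Path

class JSONLLogger:
    def __init__(self, path: str = "agent_run.jsonl"):
        self.path = Path(path)

    def log(self, event_type: str, **data) -> None:
        record = {"ts": time.time(), "type": event_type, **data}
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Example call sites inside the agent loop:
logger = JSONLLogger()
logger.log("agent_start", task="Create a weekly digest from ./notes")
logger.log("assistant", content="I'll create a plan to generate your weekly digest...")
logger.log("tool_call", name="list_files", args={"directory": "./notes"})
logger.log("tool_result", name="list_files", result={"count": 3})
```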
Why JSONL?
- Each line is a complete JSON object
- Easy to parse and analyze
- Can stream large logs without loading everything into memory (see the sketch after this list)
- Standard format for ML training data
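Because each line stands alone, you can process a large log record by record; a quick sketch, assuming the `agent_run.jsonl` file written above:
```python
# Stream a JSONL log line by line; only the current record is in memory.
import json

with open("agent_run.jsonl", encoding="utf-8") as f:
    for line in f:
        event = json.loads(line)
        if event.get("type") == "tool_call":
            print(event["name"], event.get("args"))
```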
Debugging Technique 3: Replay Runs
Load a logged conversation and replay it:
🐍 Python Replay Agent Runs
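A minimal replay sketch, assuming the event shapes written by the logger above:
```python
# Replay a logged run step by step without calling the model again.
# Assumes the JSONL event shapes from the logger sketch above.
import json
from pathlib import Path

def replay(path: str = "agent_run.jsonl") -> None:
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        event = json.loads(line)
        kind = event.get("type")
        if kind == "agent_start":
            print(f"▶️ START: {event.get('task')}")
        elif kind == "assistant":
            print(f"📝 {event.get('content')}")
        elif kind == "tool_call":
            print(f"🔧 {event.get('name')}({event.get('args')})")
        elif kind == "tool_result":
            print(f"   ↳ {event.get('result')}")
        else:
            print(f"ℹ️ {kind}: {event}")

if __name__ == "__main__":
    replay()
```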
Debugging Technique 4: Conversation Snapshots
Save the full conversation state at each iteration:
🐍 Python Conversation Snapshots
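A minimal snapshot sketch; the directory layout and naming scheme are assumptions:
```python
# Dump the full conversation so far to snapshots/iter_NNN.json after each
# iteration. Directory and naming scheme are illustrative assumptions.
import json
from pathlib import Path

SNAPSHOT_DIR = Path("snapshots")
SNAPSHOT_DIR.mkdir(exist_ok=True)

def save_snapshot(iteration: int, input_list: list) -> Path:
    path = SNAPSHOT_DIR / f"iter_{iteration:03d}.json"
    serializable = [
        m if isinstance(m, dict) else m.model_dump()  # SDK messages are pydantic models
        for m in input_list
    ]
    path.write_text(json.dumps(serializable, indent=2), encoding="utf-8")
    return path

# Inside the loop, after each model response:
# save_snapshot(iteration, input_list)
```
Diffing two consecutive snapshots shows exactly what the agent added in that iteration.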
Common Issues and Solutions
**Symptom:** Agent keeps calling the same tool repeatedly
**Causes:**
- Tool returns an error, agent retries indefinitely
- Tool result doesn't contain expected information
- Agent doesn't understand when it's done
**Solutions:**
1. Check tool implementations - make sure they return useful results
2. Add better error messages in tool outputs
3. Lower MAX_ITERS to catch loops faster
4. Add explicit "done" conditions in instructions
**Debug:**
```python
# Add this to see if the agent is calling the same tool two turns in a row.
# Assumes input_list[-2] is the previous assistant turn; adjust the index
# to match your loop's bookkeeping.
if iteration > 0 and message.tool_calls:
    prev = input_list[-2]
    prev_calls = prev.get("tool_calls") if isinstance(prev, dict) else getattr(prev, "tool_calls", None)
    if prev_calls and prev_calls[0].function.name == message.tool_calls[0].function.name:
        print("⚠️ WARNING: Agent calling same tool again")
```
**Symptom:** Tools raise exceptions or return errors
**Causes:**
- Invalid arguments from model
- Missing files or permissions
- Type mismatches
**Solutions:**
1. Validate tool arguments before execution
2. Add try/except in tool implementations
3. Return structured errors: `{"error": "..."}`
4. Improve tool descriptions so model sends correct args
**Debug:**
```python
def dispatch_tool(name: str, args: dict) -> dict:
    print(f"🔍 Dispatching: {name}")
    print(f"   Args: {args}")
    print(f"   Types: {[(k, type(v).__name__) for k, v in args.items()]}")
    try:
        result = tool_map[name](**args)
        print(f"   ✅ Success: {result}")
        return result
    except Exception as e:
        print(f"   ❌ Error: {e}")
        return {"error": str(e)}
```
**Symptom:** Agent just talks about what it would do instead of actually calling tools
**Causes:**
- Tool descriptions are unclear
- Model doesn't understand when to use tools
- Instructions don't encourage tool use
**Solutions:**
1. Improve tool descriptions - be specific about when to use each tool
2. Add examples in tool descriptions
3. Update instructions: "Use tools for all file operations"
4. Try a different model (some are better at tool calling)
**Debug:**
```python
# Check if tools are being sent to the model
print(f"\n🔧 Available tools: {[t['name'] for t in tools]}")

# After the model response
if not message.tool_calls:
    print("⚠️ No tool calls - agent chose not to use tools")
    print(f"   Response: {message.content[:200]}")
```
**Symptom:** Agent doesn't handle denials well
**Causes:**
- Agent doesn't see the denial result
- Agent retries the same operation
- No guidance on what to do when denied
**Solutions:**
1. Return a clear denial message: `{"status": "denied_by_user", "reason": "..."}` (see the dispatch sketch below)
2. Update instructions to handle denials gracefully
3. Log denials for analysis
**Debug:**
```python
def approve(tool_name: str, args: dict) -> bool:
    print(f"\n⚠️ Approval needed: {tool_name}")
    ans = input("Approve? [y/N] ").strip().lower()
    if ans != "y":
        print("❌ Denied - agent will see this in tool result")
        return False
    print("✅ Approved")
    return True
```
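To make solution 1 concrete, you can put the approval check inside dispatch so a denial comes back to the agent as a normal tool result; a sketch, assuming the `approve` and `tool_map` helpers above:
```python
# Sketch: wire approval into dispatch so the agent sees the denial as an
# ordinary tool result (assumes the approve() and tool_map helpers above).
def dispatch_with_approval(name: str, args: dict) -> dict:
    if not approve(name, args):
        return {"status": "denied_by_user",
                "reason": "User rejected this tool call; try another approach."}
    return tool_map[name](**args)
```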
Testing Your Agent
Create a test suite to validate agent behavior:
🐍 Python Agent Test Suite
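Here's a minimal pytest sketch; the imported names are hypothetical stand-ins for the tools and logger built earlier:
```python
# Minimal pytest sketch. `list_files`, `dispatch_tool`, and `JSONLLogger`
# are hypothetical imports standing in for the components built earlier.
import json
from agent import list_files, dispatch_tool, JSONLLogger  # hypothetical module

def test_list_files_returns_expected_shape(tmp_path):
    (tmp_path / "2026-01-20.md").write_text("note")
    result = list_files(directory=str(tmp_path))
    assert result["count"] == len(result["files"])

def test_dispatch_tool_returns_structured_error():
    result = dispatch_tool("list_files", {"directory": "./does-not-exist"})
    assert "error" in result or "files" in result  # never an uncaught exception

def test_jsonl_log_lines_are_valid_json(tmp_path):
    logger = JSONLLogger(path=str(tmp_path / "run.jsonl"))
    logger.log("tool_call", name="list_files", args={"directory": "./notes"})
    for line in (tmp_path / "run.jsonl").read_text().splitlines():
        json.loads(line)  # raises if any line is not standalone JSON
```
Run it with `pytest -q`; each test exercises one component in isolation, so a failure points at the layer that broke.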
Debugging Checklist
When your agent misbehaves, check:
- Tool implementations - Do they return the right format?
- Tool schemas - Are descriptions clear?
- Instructions - Do they guide the agent properly?
- MAX_ITERS - Is it too high or too low?
- Budget - Is the agent hitting limits?
- Logs - What does the conversation history show?
- Dry-run - Does it work in simulation?
Key Takeaways
You now know how to:
- ✅ Print tool calls for quick debugging
- ✅ Log to JSONL for detailed analysis
- ✅ Replay runs to understand past behavior
- ✅ Save snapshots of conversation state
- ✅ Troubleshoot common agent issues
- ✅ Test agent components
What’s Next?
In the final page, we’ll wrap up with exercises and next steps: how to extend your agent, scale to production, and explore advanced patterns.