Feb 1, 2026

Intermediate 25 min

You Built an Agent!

Congratulations! You’ve built a complete plan-and-execute agent with:

✅ Tool calling loop
✅ Approval gates for safety
✅ Budget and resource controls
✅ Debugging and logging
✅ Dry-run mode for testing

This is the foundation of production agentic systems.

Exercises

Try these exercises to deepen your understanding:

Exercise 1: Add search_notes Tool

Goal: Add a tool that searches for keywords in notes.

Requirements:

Tool name: search_notes
Parameters: keyword (string)
Returns: List of files containing the keyword with matching lines
Safety: Read-only (no approval needed)

Starter code:

🐍 Python Exercise 1: search_notes Tool

📟 Console Output

Run code to see output...

Success criteria:

Agent uses search_notes before reading all files
Returns accurate matches with line numbers
Handles case-insensitive search

Exercise 2: Add Dry-Run Flag

Goal: Add a command-line flag to enable dry-run mode.

Requirements:

Accept --dry-run flag when running agent
Print clear indication that dry-run is active
No real file writes or sends in dry-run mode

Starter code:

🐍 Python Exercise 2: CLI with Dry-Run

📟 Console Output

Run code to see output...

Success criteria:

--dry-run flag works correctly
Clear indication when dry-run is active
Help text explains all options

Exercise 3: Add Policy Gate

Goal: Block send_message if text contains secrets.

Requirements:

Check message text for patterns like API keys, tokens, passwords
Deny automatically if secrets detected
Explain why it was denied

Starter code:

🐍 Python Exercise 3: Policy Gate

📟 Console Output

Run code to see output...

Success criteria:

Detects API keys and tokens in messages
Denies automatically without user prompt
Returns clear explanation of violation

Final Knowledge Check

Test your understanding of the complete agent system:

Knowledge Check

This interactive quiz requires JavaScript to be enabled.

Question 1: What is the main advantage of the plan-and-execute pattern over reactive agents?

A. It's faster
B. It uses less memory
C. It creates a roadmap before acting, making it more goal-oriented (Correct)
D. It requires fewer tool calls

Explanation: Plan-and-execute agents create a plan first, giving them a clear roadmap. This makes them more efficient and less likely to get stuck in loops compared to reactive agents that respond to each observation without a plan.

Question 2: Why do we maintain conversation state (input_list) across iterations?

A. To save memory
B. So the model can see previous tool calls and results (Correct)
C. To make the code simpler
D. To reduce API costs

Explanation: The input_list grows with each iteration, containing the full conversation history including tool calls and results. This allows the model to make informed decisions based on what happened in previous steps.

Question 3: When should a tool require approval?

A. When it's slow to execute
B. When it has side effects (writes, sends, deletes) (Correct)
C. When it reads sensitive data
D. When it's called more than once

Explanation: Tools with side effects (operations that change state or communicate externally) should require approval because they can cause real-world consequences. Read-only operations are generally safe to run automatically.

Question 4: What is the purpose of MAX_ITERS?

A. To make the agent faster
B. To prevent infinite loops and runaway costs (Correct)
C. To improve response quality
D. To reduce memory usage

Explanation: MAX_ITERS is a safety limit that prevents the agent from looping forever. Without it, a buggy agent could make unlimited API calls, costing money and never completing.

Question 5: What does dry-run mode allow you to do?

A. Run the agent faster
B. Test agent logic without making real changes (Correct)
C. Skip approval gates
D. Use fewer API calls

Explanation: Dry-run mode simulates tool execution without side effects. This lets you test agent behavior safely—you can see what it would do without actually writing files or sending messages.

Question 6: Why log conversations to JSONL format?

A. It's faster than other formats
B. Each line is a complete JSON object, making it easy to parse and stream (Correct)
C. It uses less disk space
D. It's required by OpenAI

Explanation: JSONL (JSON Lines) format stores one JSON object per line. This makes it easy to parse large logs without loading everything into memory, and it's a standard format for ML training data.

Question 7: What should a tool function return when it encounters an error?

A. Raise an exception
B. Return None
C. Return a dict with an 'error' key (Correct)
D. Exit the program

Explanation: Tools should return a dict with an 'error' key (e.g., {'error': 'File not found'}). This allows the agent to see what went wrong and potentially recover, rather than crashing the entire system.

Question 8: What is the role of tool schemas in the agent system?

A. They make tools run faster
B. They tell the model what tools exist and how to use them (Correct)
C. They validate tool outputs
D. They reduce API costs

Explanation: Tool schemas (in JSON Schema format) describe each tool's purpose, parameters, and types. The model uses these schemas to understand which tools are available and how to call them correctly.

Question 9: Why is resource tracking important in production agents?

A. It makes agents smarter
B. It reduces API costs
C. It helps monitor behavior, debug issues, and identify optimization opportunities (Correct)
D. It's required by OpenAI

Explanation: Resource tracking monitors what the agent does: which tools it calls, which files it touches, how long it runs. This data helps you debug issues, optimize performance, and understand agent behavior in production.

Question 10: What is the correct order of the agent loop?

A. Execute tools → Call model → Add results → Repeat
B. Call model → Check for tool calls → Execute tools → Add results → Repeat (Correct)
C. Add results → Execute tools → Call model → Repeat
D. Check for tool calls → Add results → Call model → Execute tools

Explanation: The agent loop follows this order: 1) Call model with conversation history, 2) Check if model requested tool calls, 3) Execute any tool calls, 4) Add results to conversation, 5) Repeat until done or hit max iterations.

What You Can Do Now

Your agent can:

✅ Plan and execute multi-step tasks
✅ Call tools to interact with the world
✅ Ask for approval before risky operations
✅ Handle errors gracefully
✅ Stay within budgets to prevent runaway costs
✅ Run in dry-run mode for safe testing
✅ Log everything for debugging

What Your Agent Cannot Do

Be realistic about limitations:

❌ Guarantee correctness - LLMs can still make mistakes
❌ Handle all edge cases - You need to add validation
❌ Scale infinitely - You need rate limits and quotas
❌ Understand complex context - Keep tasks focused
❌ Replace human judgment - Approval gates are essential

Scaling to Production

From Demo to Production

This animated concept requires JavaScript to be enabled.

Frames:

Demo Agent: Local CLI, mocked tools, manual approval. Good for learning and testing.
Production Agent: API service, real integrations, policy engine, monitoring, multi-user. Ready for real workloads.

Moving to Production

When you’re ready to deploy, consider:

1. Architecture

Move from CLI to API service
Add authentication and authorization
Support multiple concurrent users
Use async/await for better performance

2. Tool Integration

Replace mocked tools with real APIs
Add retry logic and error handling
Implement rate limiting per tool
Cache expensive operations

3. Safety and Compliance

Replace manual approval with policy engine
Log all actions for audit
Add role-based access control
Implement data privacy controls

4. Monitoring

Track success/failure rates
Monitor costs per user/session
Alert on anomalies
Measure latency and throughput

5. Scaling

Use message queues for async processing
Add load balancing
Implement circuit breakers
Cache model responses where appropriate

Advanced Patterns

Once you master the basics, explore:

1. Multi-Agent Systems

Specialized agents for different tasks
Agents that delegate to other agents
Consensus mechanisms for decisions

2. Graph-Based Agents

Use LangGraph or similar for complex flows
Define explicit state machines
Handle branching and loops explicitly

3. Memory Systems

Long-term memory across sessions
Vector databases for semantic search
Conversation summarization

4. Advanced Tool Calling

Parallel tool execution
Tool chaining and composition
Dynamic tool generation

5. Evaluation

Automated testing of agent behavior
Benchmark suites for common tasks
A/B testing different prompts

Recommended Resources

OpenAI Documentation

Function Calling Guide
Agents SDK (for production systems)

Frameworks

LangChain - Agent framework with many integrations
LangGraph - Graph-based agent orchestration
AutoGPT - Autonomous agent examples

Papers

“ReAct: Synergizing Reasoning and Acting in Language Models”
“Toolformer: Language Models Can Teach Themselves to Use Tools”
“Chain-of-Thought Prompting Elicits Reasoning in Large Language Models”

Next Steps

Here’s what to do next:

Complete the exercises above
Extend your agent with new tools for your use case
Test in dry-run mode before running for real
Add monitoring to understand agent behavior
Share your agent and get feedback

Final Thoughts

You’ve learned the fundamentals of agentic AI:

Plan-and-execute pattern - The foundation of goal-oriented agents
Tool calling - How agents interact with the world
Safety controls - Budgets, approval gates, dry-run mode
Debugging - Logging, tracing, and troubleshooting

These patterns scale from demos to production systems. The agent you built is simple, but it contains all the core concepts you need.

The key insight: Agents are loops. They plan, act, observe, and repeat. Everything else is details.

Now go build something amazing! 🚀

Complete Code Repository

The full working code for this tutorial is available in the GitHub repository:

git clone https://github.com/appropri8/plan-execute-agent-tutorial.git
cd plan-execute-agent-tutorial

Or check it out in the githubRepo/ directory of this project.

Thank You!

Thanks for completing this tutorial. If you found it helpful, consider:

Sharing it with others learning about agents
Building on these concepts for your projects
Contributing improvements to the code

Happy building! 🎉

Progress 100%

Page 7 of 7

← Previous → Next

Sign In

From Demo to Production

Frames: