Hard Boundaries for AI Agents: Time, Budget, and Permission Controls in Production
You build an AI agent. It works in testing. You deploy it. Then it runs for hours. Or it calls expensive APIs thousands of times. Or it accesses data it shouldn’t.
Agents are good at chaining actions. They’re bad at knowing when to stop.
This article shows how to add hard boundaries that make runaway behavior impossible. Time limits. Token budgets. Tool permissions. Data scoping. These aren’t optional features. They’re production requirements.
Why “Hard Boundaries” Matter for Agents
Agents make decisions. They choose tools. They call APIs. They access data. Without limits, they can:
- Run forever in loops
- Spend unlimited money on API calls
- Access tools or data they don’t need
- Expose user data to models that shouldn’t see it
These aren’t edge cases. They happen in production.
Real Risks
Infinite loops:
An agent tries to solve a problem. It calls a tool. The tool returns an error. The agent tries again. Same error. It keeps trying. The request runs for hours. Your server locks up.
Cost blow-ups:
An agent needs to search a database. It doesn’t find results. It searches again with different parameters. And again. And again. Each search costs money. One request can cost hundreds of dollars.
Over-permissioned tools:
An agent has access to all your tools. It only needs to read data. But it can also delete data. It makes a mistake. It deletes something important. You can’t undo it.
Data leaks:
An agent processes user requests. It sends full user data to the model. The model shouldn’t see PII. But it does. You’ve exposed sensitive information.
The Goal
Make runaway behavior impossible by design. Not “unlikely.” Not “usually prevented.” Impossible.
Hard boundaries enforce limits at the system level. They don’t rely on the agent being smart. They don’t rely on prompts being perfect. They enforce limits in code.
If an agent tries to exceed a limit, the system stops it. Not the agent. The system.
Define Your Agent’s Contract
Before you add boundaries, define what your agent does. Turn vague “autonomy” into a clear contract.
The Contract
Input: What the agent receives. User query. Context. Parameters.
Allowed tools: What the agent can touch. Not “all tools.” Specific tools for this agent.
Outputs: What the agent must return. Final answer. Structured data. Error message.
Limits: Time. Steps. Tokens. Budget.
Example Contract
from typing import List, Dict, Any
from dataclasses import dataclass
@dataclass
class AgentContract:
"""Defines what an agent can do"""
name: str
allowed_tools: List[str]
max_runtime_seconds: int
max_steps: int
max_tokens: int
max_cost_dollars: float
required_output: str # "text", "json", "structured"
def validate_tool(self, tool_name: str) -> bool:
"""Check if tool is allowed"""
return tool_name in self.allowed_tools
Why Contracts Matter
Contracts make limits visible. They’re in code. They’re in logs. They’re in documentation.
When an agent fails, you know why. It exceeded a limit. The limit is defined. You can adjust it.
Without contracts, limits are hidden. They’re in prompts. They’re in comments. They’re forgotten.
Making Contracts Visible
Put contracts in code:
# Agent contract for support bot
SUPPORT_BOT_CONTRACT = AgentContract(
name="support_bot",
allowed_tools=["search_kb", "create_ticket", "get_user_info"],
max_runtime_seconds=30,
max_steps=10,
max_tokens=10000,
max_cost_dollars=0.50,
required_output="text"
)
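A quick sanity check against this contract (a usage sketch with the dataclass above):

SUPPORT_BOT_CONTRACT.validate_tool("search_kb")      # True
SUPPORT_BOT_CONTRACT.validate_tool("delete_ticket")  # False: not in allowed_tools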
Log contracts:
import json

def log_contract(contract: AgentContract, request_id: str):
    """Log the agent contract for this request"""
log_entry = {
"request_id": request_id,
"agent_name": contract.name,
"allowed_tools": contract.allowed_tools,
"limits": {
"max_runtime_seconds": contract.max_runtime_seconds,
"max_steps": contract.max_steps,
"max_tokens": contract.max_tokens,
"max_cost_dollars": contract.max_cost_dollars
}
}
print(f"CONTRACT: {json.dumps(log_entry)}")
When something goes wrong, you can see what limits were set. You can see if they were too strict or too loose.
Time and Step Limits
Agents can run forever. They can call tools in loops. They can retry failed operations indefinitely.
Time limits stop this. Step limits stop this.
Max Wall-Clock Time
Set a maximum runtime per request. If the agent runs longer, kill it.
import signal
import time
from contextlib import contextmanager
# Note: signal.alarm is Unix-only and must run in the main thread.
class TimeoutError(Exception):  # shadows the builtin TimeoutError; harmless here
    pass

@contextmanager
def timeout(seconds: int):
    """Kill operation after timeout"""
    def timeout_handler(signum, frame):
        raise TimeoutError(f"Operation exceeded {seconds} seconds")
    signal.signal(signal.SIGALRM, timeout_handler)
    signal.alarm(seconds)
try:
yield
finally:
signal.alarm(0)
# Usage
try:
with timeout(30):
result = run_agent(user_input)
except TimeoutError as e:
result = {"error": str(e), "partial": True}
Max Planning Depth
Limit how many tool calls an agent can make. After N calls, force it to return.
class StepLimitExceeded(Exception):
    """Raised when an agent exceeds its step budget"""
    pass

class StepBudget:
    """Track and enforce step limits"""
    def __init__(self, max_steps: int):
        self.max_steps = max_steps
        self.steps_taken = 0
def check(self) -> bool:
"""Check if more steps are allowed"""
return self.steps_taken < self.max_steps
    def use_step(self):
        """Record a step; raise if the budget is already spent"""
        if not self.check():
            raise StepLimitExceeded(f"Exceeded {self.max_steps} steps")
        self.steps_taken += 1
# In the agent loop (inside a request handler, so the return below is valid)
budget = StepBudget(max_steps=10)
while budget.check():
tool_call = agent.choose_tool()
budget.use_step()
result = call_tool(tool_call)
agent.update(result)
if agent.is_done():
break
if not budget.check():
return {"error": "Step limit exceeded", "partial": True}
Per-Tool Timeouts
Some tools are slow. Set timeouts per tool.
def call_tool_with_timeout(tool_name: str, params: dict, timeout_seconds: int):
"""Call tool with timeout"""
try:
with timeout(timeout_seconds):
return call_tool(tool_name, params)
except TimeoutError:
return {"error": f"Tool {tool_name} timed out after {timeout_seconds}s"}
Failing Gracefully
When limits are hit, return partial results. Don’t crash.
def run_agent_with_limits(user_input: str, contract: AgentContract):
"""Run agent with all limits enforced"""
start_time = time.time()
step_budget = StepBudget(contract.max_steps)
results = []
try:
with timeout(contract.max_runtime_seconds):
while step_budget.check():
tool_call = agent.choose_tool()
# Check tool permission
if not contract.validate_tool(tool_call.name):
return {
"error": f"Tool {tool_call.name} not allowed",
"partial": True,
"results": results
}
step_budget.use_step()
# Call tool with timeout
tool_result = call_tool_with_timeout(
tool_call.name,
tool_call.params,
timeout_seconds=5
)
results.append(tool_result)
agent.update(tool_result)
if agent.is_done():
break
return {"results": results, "complete": True}
except TimeoutError:
return {
"error": "Runtime limit exceeded",
"partial": True,
"results": results
}
except StepLimitExceeded:
return {
"error": "Step limit exceeded",
"partial": True,
"results": results
}
Partial results are better than nothing. Users can see what was done. They can retry if needed.
Token and Cost Budgets
LLM calls cost money. Agents can make many calls. One request can cost hundreds of dollars.
Track tokens. Track costs. Enforce budgets.
What to Track
Prompt tokens: Input to the model. User query. System prompt. Context.
Response tokens: Output from the model. Generated text.
Tool call tokens: When tools call LLMs. Nested calls add up.
Simple Token Counter
import tiktoken

class TokenBudgetExceeded(Exception):
    """Raised when token usage exceeds the budget"""
    pass

class TokenBudget:
    """Track and enforce token limits"""
def __init__(self, max_tokens: int):
self.max_tokens = max_tokens
self.tokens_used = 0
self.encoding = tiktoken.encoding_for_model("gpt-4")
def count_tokens(self, text: str) -> int:
"""Count tokens in text"""
return len(self.encoding.encode(text))
def add_tokens(self, tokens: int):
"""Add tokens to budget"""
self.tokens_used += tokens
if self.tokens_used > self.max_tokens:
raise TokenBudgetExceeded(
f"Exceeded {self.max_tokens} tokens (used {self.tokens_used})"
)
def check(self, text: str) -> bool:
"""Check if text would exceed budget"""
tokens_needed = self.count_tokens(text)
return (self.tokens_used + tokens_needed) <= self.max_tokens
def remaining(self) -> int:
"""Get remaining tokens"""
return max(0, self.max_tokens - self.tokens_used)
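A usage sketch: create one budget per request and share it with any tool that calls an LLM, so nested calls draw down the same pool (the prompt text is just an example):

budget = TokenBudget(max_tokens=10_000)

prompt = "Summarize the user's last five support tickets."
if budget.check(prompt):
    budget.add_tokens(budget.count_tokens(prompt))

print(f"Remaining: {budget.remaining()} tokens")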
Cost Tracking
# Pricing per 1K tokens (example)
PRICING = {
"gpt-4": {"input": 0.03, "output": 0.06},
"gpt-3.5-turbo": {"input": 0.0015, "output": 0.002}
}
class CostBudgetExceeded(Exception):
    """Raised when spend exceeds the dollar budget"""
    pass

class CostBudget:
    """Track and enforce cost limits"""
def __init__(self, max_cost_dollars: float):
self.max_cost_dollars = max_cost_dollars
self.cost_used = 0.0
def add_cost(self, model: str, input_tokens: int, output_tokens: int):
"""Add cost to budget"""
if model not in PRICING:
raise ValueError(f"Unknown model: {model}")
input_cost = (input_tokens / 1000) * PRICING[model]["input"]
output_cost = (output_tokens / 1000) * PRICING[model]["output"]
total_cost = input_cost + output_cost
self.cost_used += total_cost
if self.cost_used > self.max_cost_dollars:
raise CostBudgetExceeded(
f"Exceeded ${self.max_cost_dollars} budget (used ${self.cost_used:.2f})"
)
def remaining(self) -> float:
"""Get remaining budget"""
return max(0, self.max_cost_dollars - self.cost_used)
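A quick worked example against the pricing table above: 1,000 input tokens and 500 output tokens on gpt-4 cost (1000/1000) × $0.03 + (500/1000) × $0.06 = $0.06.

budget = CostBudget(max_cost_dollars=0.50)
budget.add_cost("gpt-4", input_tokens=1000, output_tokens=500)
print(f"Remaining: ${budget.remaining():.2f}")  # $0.44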
Hard Caps vs Soft Caps
Hard caps: Refuse to add more context or call more tools. Stop immediately.
def call_llm_with_budget(prompt: str, token_budget: TokenBudget, cost_budget: CostBudget):
    """Call LLM if budget allows"""
    if not token_budget.check(prompt):
        raise TokenBudgetExceeded("Not enough tokens remaining")
    # Make call (openai_client: a configured OpenAI client)
response = openai_client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": prompt}]
)
input_tokens = response.usage.prompt_tokens
output_tokens = response.usage.completion_tokens
# Update budgets
token_budget.add_tokens(input_tokens + output_tokens)
cost_budget.add_cost("gpt-4", input_tokens, output_tokens)
return response
Soft caps: Switch to cheaper models or shorter responses. Continue with constraints.
def call_llm_with_soft_cap(prompt: str, token_budget: TokenBudget, cost_budget: CostBudget):
"""Call LLM, downgrade if needed"""
# Try expensive model first
if cost_budget.remaining() > 1.0:
model = "gpt-4"
else:
model = "gpt-3.5-turbo" # Cheaper fallback
# Truncate prompt if needed
max_tokens = token_budget.remaining()
    if token_budget.count_tokens(prompt) > max_tokens:
        # Truncate to fit (truncate_to_tokens: see the helper sketch below)
        prompt = truncate_to_tokens(prompt, max_tokens - 100)  # Leave room for response
response = openai_client.chat.completions.create(
model=model,
messages=[{"role": "user", "content": prompt}],
max_tokens=min(500, token_budget.remaining()) # Limit response length
)
input_tokens = response.usage.prompt_tokens
output_tokens = response.usage.completion_tokens
token_budget.add_tokens(input_tokens + output_tokens)
cost_budget.add_cost(model, input_tokens, output_tokens)
return response
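The soft-cap snippet assumes a truncate_to_tokens helper. A minimal sketch with tiktoken (hard-coding the gpt-4 encoding, as elsewhere in this article):

import tiktoken

def truncate_to_tokens(text: str, max_tokens: int) -> str:
    """Encode, cut to the token limit, and decode back to text."""
    if max_tokens <= 0:
        return ""
    encoding = tiktoken.encoding_for_model("gpt-4")
    tokens = encoding.encode(text)
    return encoding.decode(tokens[:max_tokens])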
Predictable Costs
Design workflows with predictable costs. “This workflow costs between $0.10 and $0.50.”
def estimate_workflow_cost(workflow_steps: list) -> dict:
"""Estimate cost range for workflow"""
min_cost = 0.0
max_cost = 0.0
for step in workflow_steps:
# Estimate based on step type
if step["type"] == "llm_call":
min_cost += 0.01 # Minimal prompt
max_cost += 0.10 # Large context
elif step["type"] == "tool_call":
min_cost += 0.0 # Free tool
max_cost += 0.05 # Expensive API
return {
"min_cost": min_cost,
"max_cost": max_cost,
"estimated_cost": (min_cost + max_cost) / 2
}
Users can see cost estimates before running. They can set budgets accordingly.
Tool and Permission Scoping
Agents shouldn’t see tools they don’t need. A read-only agent shouldn’t have delete permissions. A user-facing agent shouldn’t have admin tools.
Principle: Minimum Required Access
Give agents the minimum tools they need. Nothing more.
Role-Based Tool Sets
Different roles get different tools.
TOOL_PERMISSIONS = {
"user": ["search_kb", "create_ticket", "get_user_info"],
"admin": ["search_kb", "create_ticket", "get_user_info", "delete_ticket", "modify_user"],
"readonly": ["search_kb", "get_user_info"]
}
def get_tools_for_role(role: str) -> List[str]:
"""Get allowed tools for role"""
return TOOL_PERMISSIONS.get(role, [])
Environment-Based Tool Sets
Different environments get different tools. Production has restrictions. Development has more access.
ENV_TOOLS = {
"production": ["search_kb", "create_ticket"],
"staging": ["search_kb", "create_ticket", "delete_ticket"],
"development": ["search_kb", "create_ticket", "delete_ticket", "reset_database"]
}
def get_tools_for_env(env: str) -> List[str]:
"""Get allowed tools for environment"""
return ENV_TOOLS.get(env, [])
Data-Scoped Tools
Tools are scoped to data. Per-tenant. Per-user.
def get_tools_for_context(user_id: str, tenant_id: str) -> List[str]:
"""Get tools scoped to user and tenant"""
base_tools = ["search_kb", "create_ticket"]
# Add tenant-specific tools
if tenant_id == "enterprise":
base_tools.append("advanced_analytics")
# Add user-specific tools
if is_admin(user_id):
base_tools.append("delete_ticket")
return base_tools
Tool Capability Matrix
Document who can call what and when.
TOOL_MATRIX = {
"search_kb": {
"roles": ["user", "admin", "readonly"],
"environments": ["production", "staging", "development"],
"rate_limit": "100/hour"
},
"delete_ticket": {
"roles": ["admin"],
"environments": ["staging", "development"],
"rate_limit": "10/hour",
"requires_approval": True
},
"reset_database": {
"roles": ["admin"],
"environments": ["development"],
"rate_limit": "1/day",
"requires_approval": True
}
}
def can_call_tool(tool_name: str, role: str, env: str) -> bool:
"""Check if tool can be called"""
if tool_name not in TOOL_MATRIX:
return False
tool = TOOL_MATRIX[tool_name]
return role in tool["roles"] and env in tool["environments"]
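The matrix documents rate limits like "100/hour", but can_call_tool doesn't enforce them. A minimal in-memory sliding-window sketch (the "N/period" parsing and per-user keying are assumptions; production systems would typically use Redis or similar):

import time
from collections import defaultdict, deque

WINDOW_SECONDS = {"hour": 3600, "day": 86400}
_recent_calls = defaultdict(deque)  # (tool_name, user_id) -> call timestamps

def check_rate_limit(tool_name: str, user_id: str) -> bool:
    """Enforce the matrix's 'N/period' strings with a sliding window."""
    limit = TOOL_MATRIX.get(tool_name, {}).get("rate_limit")
    if not limit:
        return True
    count, period = limit.split("/")
    window = WINDOW_SECONDS[period]
    now = time.time()
    calls = _recent_calls[(tool_name, user_id)]
    while calls and calls[0] < now - window:  # drop calls outside the window
        calls.popleft()
    if len(calls) >= int(count):
        return False
    calls.append(now)
    return True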
Logging Tool Calls
Log every tool call with context. User. Tenant. Role. Time.
from datetime import datetime

def log_tool_call(tool_name: str, params: dict, user_context: dict):
    """Log tool call for audit"""
log_entry = {
"timestamp": datetime.utcnow().isoformat(),
"tool": tool_name,
"params": sanitize_params(params), # Remove sensitive data
"user_id": user_context.get("user_id"),
"tenant_id": user_context.get("tenant_id"),
"role": user_context.get("role"),
"environment": user_context.get("environment")
}
# Store in audit log
audit_log.append(log_entry)
# Alert on suspicious patterns
if is_suspicious_pattern(tool_name, user_context):
alert_security_team(log_entry)
This creates an audit trail. You can see who called what. You can detect abuse.
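The logging snippet assumes an is_suspicious_pattern helper; what counts as suspicious is your call. A minimal sketch that flags sensitive tools used by non-admin roles (the tool list is an assumption):

SENSITIVE_TOOLS = {"delete_ticket", "modify_user", "reset_database"}  # assumed list

def is_suspicious_pattern(tool_name: str, user_context: dict) -> bool:
    """Flag sensitive tools invoked outside the admin role."""
    return tool_name in SENSITIVE_TOOLS and user_context.get("role") != "admin"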
Data Boundaries and PII Handling
Agents process user data. They shouldn’t see everything. They shouldn’t keep everything.
Limit What Reaches the Agent
Only send data the agent needs. Not full user profiles. Not complete histories.
def prepare_agent_context(user_id: str, task: str) -> dict:
"""Prepare minimal context for agent"""
# Get only what's needed
if task == "support":
return {
"user_id": user_id,
"recent_tickets": get_recent_tickets(user_id, limit=5),
"account_tier": get_account_tier(user_id)
}
elif task == "billing":
return {
"user_id": user_id,
"current_plan": get_current_plan(user_id),
"billing_history": get_billing_history(user_id, limit=3)
}
else:
return {"user_id": user_id} # Minimal context
Redaction and Masking
Remove PII before sending to the model.
import json
import re
def redact_pii(text: str) -> str:
"""Remove PII from text"""
    # Email
    text = re.sub(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', '[EMAIL]', text)
# Phone
text = re.sub(r'\b\d{3}-\d{3}-\d{4}\b', '[PHONE]', text)
# Credit card (simple pattern)
text = re.sub(r'\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b', '[CARD]', text)
# SSN
text = re.sub(r'\b\d{3}-\d{2}-\d{4}\b', '[SSN]', text)
return text
def prepare_safe_input(user_input: str, context: dict) -> str:
"""Prepare input with PII removed"""
# Redact user input
safe_input = redact_pii(user_input)
# Redact context
safe_context = {}
for key, value in context.items():
if isinstance(value, str):
safe_context[key] = redact_pii(value)
else:
safe_context[key] = value
return f"User: {safe_input}\nContext: {json.dumps(safe_context)}"
Separate Identity from Task Context
Keep identity separate. Don’t send it to the model unless needed.
import json

class AgentRequest:
    """Request with separated identity and context"""
def __init__(self, user_id: str, task_context: dict):
self.user_id = user_id # Kept separate
self.task_context = task_context # Sent to model
def to_prompt(self) -> str:
"""Convert to prompt without identity"""
# Don't include user_id in prompt
return json.dumps(self.task_context)
def get_identity(self) -> str:
"""Get identity separately"""
return self.user_id
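A usage sketch (the context values are illustrative):

request = AgentRequest("user_123", {"account_tier": "pro", "topic": "billing"})
prompt = request.to_prompt()       # goes to the model; contains no user_id
user_id = request.get_identity()   # stays server-side, e.g. for tool calls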
Retention Rules
Set how long you keep agent logs. Delete old data.
from datetime import datetime, timedelta

def cleanup_old_logs(retention_days: int = 30):
    """Delete logs older than retention period"""
    cutoff_date = datetime.utcnow() - timedelta(days=retention_days)
# Delete from database
delete_query = """
DELETE FROM agent_logs
WHERE timestamp < %s
"""
execute_query(delete_query, (cutoff_date,))
# Delete from file storage
delete_old_files(cutoff_date)
Automate this. Run it daily. Don’t keep data forever.
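One way to automate it, as a minimal in-process sketch (in production, cron or a systemd timer is the more usual choice):

import threading

def schedule_daily_cleanup():
    """Run cleanup now, then re-schedule itself every 24 hours."""
    cleanup_old_logs(retention_days=30)
    threading.Timer(24 * 60 * 60, schedule_daily_cleanup).start()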
Putting It Together: A Bounded Agent Wrapper
Combine all boundaries in one place. A wrapper that enforces limits. A wrapper that logs everything.
The BoundedAgent Class
from datetime import datetime
from typing import Any, Dict, List
import time
import json
class BoundedAgent:
"""Agent with all boundaries enforced"""
def __init__(
self,
agent_core, # Your agent implementation
contract: AgentContract,
user_context: dict
):
self.agent_core = agent_core
self.contract = contract
self.user_context = user_context
# Initialize budgets
self.step_budget = StepBudget(contract.max_steps)
self.token_budget = TokenBudget(contract.max_tokens)
self.cost_budget = CostBudget(contract.max_cost_dollars)
self.start_time = time.time()
# Filter tools by permissions
self.available_tools = self._filter_tools()
# Log contract
self._log_contract()
def _filter_tools(self) -> List[str]:
"""Filter tools based on permissions"""
role = self.user_context.get("role", "user")
env = self.user_context.get("environment", "production")
# Get tools for role and environment
role_tools = get_tools_for_role(role)
env_tools = get_tools_for_env(env)
# Intersection: must be in both
allowed = set(role_tools) & set(env_tools)
# Also check contract
allowed = allowed & set(self.contract.allowed_tools)
return list(allowed)
def _log_contract(self):
"""Log the contract for this request"""
request_id = self.user_context.get("request_id", "unknown")
log_contract(self.contract, request_id)
def _check_timeout(self):
"""Check if runtime limit exceeded"""
elapsed = time.time() - self.start_time
if elapsed > self.contract.max_runtime_seconds:
raise TimeoutError(f"Runtime limit exceeded: {elapsed:.2f}s")
def _check_budgets(self):
"""Check if any budget exceeded"""
if not self.step_budget.check():
raise StepLimitExceeded("Step limit exceeded")
if self.token_budget.remaining() < 100:
raise TokenBudgetExceeded("Token budget nearly exhausted")
if self.cost_budget.remaining() < 0.01:
raise CostBudgetExceeded("Cost budget nearly exhausted")
def _log_tool_call(self, tool_name: str, params: dict, result: dict):
"""Log tool call for audit"""
log_tool_call(tool_name, params, self.user_context)
def run(self, user_input: str) -> Dict[str, Any]:
"""Run agent with all boundaries enforced"""
request_id = self.user_context.get("request_id", f"req_{int(time.time())}")
try:
# Prepare safe input
safe_input = prepare_safe_input(user_input, self.user_context.get("context", {}))
# Run agent loop
results = []
while True:
# Check limits
self._check_timeout()
self._check_budgets()
if not self.step_budget.check():
break
# Agent chooses tool
tool_choice = self.agent_core.choose_tool(safe_input, self.available_tools)
if tool_choice is None:
# Agent is done
break
tool_name = tool_choice["name"]
tool_params = tool_choice["params"]
# Validate tool permission
if tool_name not in self.available_tools:
return {
"error": f"Tool {tool_name} not allowed",
"request_id": request_id,
"partial": True,
"results": results
}
# Use step
self.step_budget.use_step()
# Call tool with timeout
try:
tool_result = call_tool_with_timeout(tool_name, tool_params, timeout_seconds=5)
except TimeoutError:
tool_result = {"error": f"Tool {tool_name} timed out"}
# Log tool call
self._log_tool_call(tool_name, tool_params, tool_result)
results.append({
"tool": tool_name,
"result": tool_result
})
# Update agent
self.agent_core.update(tool_result)
# Check if done
if self.agent_core.is_done():
break
# Generate final response
final_response = self.agent_core.generate_response()
# Log final response
self._log_response(request_id, final_response, results)
return {
"response": final_response,
"request_id": request_id,
"complete": True,
"results": results,
"budgets": {
"steps_used": self.step_budget.steps_taken,
"tokens_used": self.token_budget.tokens_used,
"cost_used": self.cost_budget.cost_used
}
}
except TimeoutError as e:
return self._handle_error("timeout", str(e), results, request_id)
except StepLimitExceeded as e:
return self._handle_error("step_limit", str(e), results, request_id)
except TokenBudgetExceeded as e:
return self._handle_error("token_budget", str(e), results, request_id)
except CostBudgetExceeded as e:
return self._handle_error("cost_budget", str(e), results, request_id)
except Exception as e:
return self._handle_error("unknown", str(e), results, request_id)
def _handle_error(self, error_type: str, error_msg: str, results: list, request_id: str):
"""Handle errors gracefully"""
self._log_error(request_id, error_type, error_msg)
return {
"error": error_msg,
"error_type": error_type,
"request_id": request_id,
"partial": True,
"results": results,
"budgets": {
"steps_used": self.step_budget.steps_taken,
"tokens_used": self.token_budget.tokens_used,
"cost_used": self.cost_budget.cost_used
}
}
def _log_response(self, request_id: str, response: str, results: list):
"""Log final response"""
log_entry = {
"request_id": request_id,
"response": response,
"results_count": len(results),
"budgets": {
"steps_used": self.step_budget.steps_taken,
"tokens_used": self.token_budget.tokens_used,
"cost_used": self.cost_budget.cost_used
},
"timestamp": datetime.utcnow().isoformat()
}
print(f"RESPONSE: {json.dumps(log_entry)}")
def _log_error(self, request_id: str, error_type: str, error_msg: str):
"""Log error"""
log_entry = {
"request_id": request_id,
"error_type": error_type,
"error": error_msg,
"timestamp": datetime.utcnow().isoformat()
}
print(f"ERROR: {json.dumps(log_entry)}")
Usage Example
# Define contract
contract = AgentContract(
name="support_bot",
allowed_tools=["search_kb", "create_ticket"],
max_runtime_seconds=30,
max_steps=10,
max_tokens=10000,
max_cost_dollars=0.50,
required_output="text"
)
# User context
user_context = {
"user_id": "user_123",
"tenant_id": "tenant_456",
"role": "user",
"environment": "production",
"request_id": "req_789"
}
# Create bounded agent
agent = BoundedAgent(
agent_core=MyAgentCore(),
contract=contract,
user_context=user_context
)
# Run
result = agent.run("How do I reset my password?")
# Check result
if result.get("error"):
print(f"Error: {result['error']}")
if result.get("partial"):
print(f"Partial results: {result['results']}")
else:
print(f"Response: {result['response']}")
What Happens When Limits Are Hit
When any limit is exceeded, the agent stops. It returns partial results. It logs the error. It doesn’t crash.
Users see a clear error message. They can retry with adjusted parameters. They can contact support.
The system stays stable. One bad request doesn’t break everything.
Checklist: Is Your Agent Safely Bounded?
Use this checklist to verify your agent has proper boundaries.
Time and Steps
- Every agent has a max runtime (wall-clock time)
- Every agent has a max step count (tool calls)
- Timeouts are enforced in code, not just prompts
- Partial results are returned when limits are hit
- Errors are logged when limits are exceeded
Tokens and Cost
- Token usage is tracked for every LLM call
- Cost is tracked for every LLM call
- Hard caps prevent exceeding budgets
- Soft caps downgrade models when budgets are low
- Cost estimates are provided before running workflows
Tools and Permissions
- Agents only see tools they’re allowed to use
- Tool permissions are role-based
- Tool permissions are environment-based
- Tool calls are logged with user and tenant context
- Suspicious tool usage patterns trigger alerts
Data and PII
- PII is redacted before sending to models
- Only necessary data is sent to agents
- Identity is kept separate from task context
- Log retention policies are set and automated
- Old logs are deleted automatically
Contracts and Visibility
- Every agent has a defined contract
- Contracts are visible in code
- Contracts are logged for each request
- Limits are documented
- Errors reference the contract that was violated
Putting It Together
- All boundaries are enforced in one wrapper
- Boundaries are tested
- Boundaries can be adjusted without changing agent code
- Monitoring shows when boundaries are hit
- Alerts notify when boundaries are frequently exceeded
Introducing Boundaries to Existing Systems
If you have an existing agent system, add boundaries gradually:
- Start with logging: Add logging first. See what’s happening.
- Add time limits: Set reasonable timeouts. See if anything breaks.
- Add step limits: Limit tool calls. See if agents complete tasks.
- Add token budgets: Track tokens. Set caps. See actual costs.
- Add tool permissions: Restrict tools. See what’s actually needed.
- Add data boundaries: Redact PII. Scope data. See what breaks.
Don’t add everything at once. Add one boundary at a time. Test. Adjust. Repeat.
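Adjusting boundaries without changing agent code falls out naturally if contracts load from config. A sketch (the file format and field names are assumptions):

import json

def load_contract(path: str) -> AgentContract:
    """Build an AgentContract from a JSON config file."""
    with open(path) as f:
        cfg = json.load(f)
    return AgentContract(
        name=cfg["name"],
        allowed_tools=cfg["allowed_tools"],
        max_runtime_seconds=cfg["max_runtime_seconds"],
        max_steps=cfg["max_steps"],
        max_tokens=cfg["max_tokens"],
        max_cost_dollars=cfg["max_cost_dollars"],
        required_output=cfg["required_output"],
    )

Raising a limit becomes a config change and a deploy, not an agent rewrite.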
Conclusion
Hard boundaries make runaway agent behavior impossible. They enforce limits at the system level. They don’t rely on prompts or agent intelligence.
Start with contracts. Define what agents can do. Then enforce limits. Time. Steps. Tokens. Costs. Tools. Data.
Put it all in a wrapper. One place that enforces everything. One place that logs everything.
When limits are hit, fail gracefully. Return partial results. Log errors. Don’t crash.
This isn’t optional. It’s a production requirement. Agents without boundaries will cause problems. They’ll cost too much. They’ll run too long. They’ll access things they shouldn’t.
Add boundaries. Test them. Monitor them. Adjust them. Your agents will be safer. Your costs will be predictable. Your users will be protected.