Hard Boundaries for AI Agents: Time, Budget, and Permission Controls in Production
You build an AI agent. It works in testing. You deploy it. Then it runs for hours. Or it calls expensive APIs thousands of times. Or it accesses data it shouldn’t.
Agents are good at chaining actions. They’re bad at knowing when to stop.
This article shows how to add hard boundaries that make runaway behavior impossible. Time limits. Token budgets. Tool permissions. Data scoping. These aren’t optional features. They’re production requirements.
Why “Hard Boundaries” Matter for Agents
Agents make decisions. They choose tools. They call APIs. They access data. Without limits, they can:
- Run forever in loops
- Spend unlimited money on API calls
- Access tools or data they don’t need
- Expose user data to models that shouldn’t see it
These aren’t edge cases. They happen in production.
Real Risks
Infinite loops:
An agent tries to solve a problem. It calls a tool. The tool returns an error. The agent tries again. Same error. It keeps trying. The request runs for hours. Your server locks up.
Cost blow-ups:
An agent needs to search a database. It doesn’t find results. It searches again with different parameters. And again. And again. Each search costs money. One request can cost hundreds of dollars.
Over-permissioned tools:
An agent has access to all your tools. It only needs to read data. But it can also delete data. It makes a mistake. It deletes something important. You can’t undo it.
Data leaks:
An agent processes user requests. It sends full user data to the model. The model shouldn’t see PII. But it does. You’ve exposed sensitive information.
The Goal
Make runaway behavior impossible by design. Not “unlikely.” Not “usually prevented.” Impossible.
Hard boundaries enforce limits at the system level. They don’t rely on the agent being smart. They don’t rely on prompts being perfect. They enforce limits in code.
If an agent tries to exceed a limit, the system stops it. Not the agent. The system.
Define Your Agent’s Contract
Before you add boundaries, define what your agent does. Turn vague “autonomy” into a clear contract.
The Contract
Input: What the agent receives. User query. Context. Parameters.
Allowed tools: What the agent can touch. Not “all tools.” Specific tools for this agent.
Outputs: What the agent must return. Final answer. Structured data. Error message.
Limits: Time. Steps. Tokens. Budget.
Example Contract
from typing import List, Dict, Any
from dataclasses import dataclass
@dataclass
class AgentContract:
"""Defines what an agent can do"""
name: str
allowed_tools: List[str]
max_runtime_seconds: int
max_steps: int
max_tokens: int
max_cost_dollars: float
required_output: str # "text", "json", "structured"
def validate_tool(self, tool_name: str) -> bool:
"""Check if tool is allowed"""
return tool_name in self.allowed_tools
Why Contracts Matter
Contracts make limits visible. They’re in code. They’re in logs. They’re in documentation.
When an agent fails, you know why. It exceeded a limit. The limit is defined. You can adjust it.
Without contracts, limits are hidden. They’re in prompts. They’re in comments. They’re forgotten.
Making Contracts Visible
Put contracts in code:
# Agent contract for support bot
SUPPORT_BOT_CONTRACT = AgentContract(
name="support_bot",
allowed_tools=["search_kb", "create_ticket", "get_user_info"],
max_runtime_seconds=30,
max_steps=10,
max_tokens=10000,
max_cost_dollars=0.50,
required_output="text"
)
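A quick sanity check against this contract (a usage sketch with the dataclass above):

SUPPORT_BOT_CONTRACT.validate_tool("search_kb")      # True
SUPPORT_BOT_CONTRACT.validate_tool("delete_ticket")  # False: not in allowed_tools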
Log contracts:
import json

def log_contract(contract: AgentContract, request_id: str):
    """Log the agent contract for this request"""
log_entry = {
"request_id": request_id,
"agent_name": contract.name,
"allowed_tools": contract.allowed_tools,
"limits": {
"max_runtime_seconds": contract.max_runtime_seconds,
"max_steps": contract.max_steps,
"max_tokens": contract.max_tokens,
"max_cost_dollars": contract.max_cost_dollars
}
}
print(f"CONTRACT: {json.dumps(log_entry)}")
When something goes wrong, you can see what limits were set. You can see if they were too strict or too loose.
Time and Step Limits
Agents can run forever. They can call tools in loops. They can retry failed operations indefinitely.
Time limits stop this. Step limits stop this.
Max Wall-Clock Time
Set a maximum runtime per request. If the agent runs longer, kill it.
import signal
import time
from contextlib import contextmanager
# Note: signal.alarm is Unix-only and must run in the main thread.
class TimeoutError(Exception):  # shadows the builtin TimeoutError; harmless here
    pass

@contextmanager
def timeout(seconds: int):
    """Kill operation after timeout"""
    def timeout_handler(signum, frame):
        raise TimeoutError(f"Operation exceeded {seconds} seconds")
    signal.signal(signal.SIGALRM, timeout_handler)
    signal.alarm(seconds)
try:
yield
finally:
signal.alarm(0)
# Usage
try:
with timeout(30):
result = run_agent(user_input)
except TimeoutError as e:
result = {"error": str(e), "partial": True}
Max Planning Depth
Limit how many tool calls an agent can make. After N calls, force it to return.
class StepLimitExceeded(Exception):
    """Raised when an agent exceeds its step budget"""
    pass

class StepBudget:
    """Track and enforce step limits"""
    def __init__(self, max_steps: int):
        self.max_steps = max_steps
        self.steps_taken = 0
def check(self) -> bool:
"""Check if more steps are allowed"""
return self.steps_taken < self.max_steps
    def use_step(self):
        """Record a step; raise if the budget is already spent"""
        if not self.check():
            raise StepLimitExceeded(f"Exceeded {self.max_steps} steps")
        self.steps_taken += 1
# In the agent loop (inside a request handler, so the return below is valid)
budget = StepBudget(max_steps=10)
while budget.check():
tool_call = agent.choose_tool()
budget.use_step()
result = call_tool(tool_call)
agent.update(result)
if agent.is_done():
break
if not budget.check():
return {"error": "Step limit exceeded", "partial": True}
Per-Tool Timeouts
Some tools are slow. Set timeouts per tool.
def call_tool_with_timeout(tool_name: str, params: dict, timeout_seconds: int):
"""Call tool with timeout"""
try:
with timeout(timeout_seconds):
return call_tool(tool_name, params)
except TimeoutError:
return {"error": f"Tool {tool_name} timed out after {timeout_seconds}s"}
Failing Gracefully
When limits are hit, return partial results. Don’t crash.
def run_agent_with_limits(user_input: str, contract: AgentContract):
"""Run agent with all limits enforced"""
start_time = time.time()
step_budget = StepBudget(contract.max_steps)
results = []
try:
with timeout(contract.max_runtime_seconds):
while step_budget.check():
tool_call = agent.choose_tool()
# Check tool permission
if not contract.validate_tool(tool_call.name):
return {
"error": f"Tool {tool_call.name} not allowed",
"partial": True,
"results": results
}
step_budget.use_step()
# Call tool with timeout
tool_result = call_tool_with_timeout(
tool_call.name,
tool_call.params,
timeout_seconds=5
)
results.append(tool_result)
agent.update(tool_result)
if agent.is_done():
break
return {"results": results, "complete": True}
except TimeoutError:
return {
"error": "Runtime limit exceeded",
"partial": True,
"results": results
}
except StepLimitExceeded:
return {
"error": "Step limit exceeded",
"partial": True,
"results": results
}
Partial results are better than nothing. Users can see what was done. They can retry if needed.
Token and Cost Budgets
LLM calls cost money. Agents can make many calls. One request can cost hundreds of dollars.
Track tokens. Track costs. Enforce budgets.
What to Track
Prompt tokens: Input to the model. User query. System prompt. Context.
Response tokens: Output from the model. Generated text.
Tool call tokens: When tools call LLMs. Nested calls add up.
Simple Token Counter
import tiktoken

class TokenBudgetExceeded(Exception):
    """Raised when token usage exceeds the budget"""
    pass

class TokenBudget:
    """Track and enforce token limits"""
def __init__(self, max_tokens: int):
self.max_tokens = max_tokens
self.tokens_used = 0
self.encoding = tiktoken.encoding_for_model("gpt-4")
def count_tokens(self, text: str) -> int:
"""Count tokens in text"""
return len(self.encoding.encode(text))
def add_tokens(self, tokens: int):
"""Add tokens to budget"""
self.tokens_used += tokens
if self.tokens_used > self.max_tokens:
raise TokenBudgetExceeded(
f"Exceeded {self.max_tokens} tokens (used {self.tokens_used})"
)
def check(self, text: str) -> bool:
"""Check if text would exceed budget"""
tokens_needed = self.count_tokens(text)
return (self.tokens_used + tokens_needed) <= self.max_tokens
def remaining(self) -> int:
"""Get remaining tokens"""
return max(0, self.max_tokens - self.tokens_used)
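A usage sketch: create one budget per request and share it with any tool that calls an LLM, so nested calls draw down the same pool (the prompt text is just an example):

budget = TokenBudget(max_tokens=10_000)

prompt = "Summarize the user's last five support tickets."
if budget.check(prompt):
    budget.add_tokens(budget.count_tokens(prompt))

print(f"Remaining: {budget.remaining()} tokens")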
Cost Tracking
# Pricing per 1K tokens (example)
PRICING = {
"gpt-4": {"input": 0.03, "output": 0.06},
"gpt-3.5-turbo": {"input": 0.0015, "output": 0.002}
}
class CostBudgetExceeded(Exception):
    """Raised when spend exceeds the dollar budget"""
    pass

class CostBudget:
    """Track and enforce cost limits"""
def __init__(self, max_cost_dollars: float):
self.max_cost_dollars = max_cost_dollars
self.cost_used = 0.0
def add_cost(self, model: str, input_tokens: int, output_tokens: int):
"""Add cost to budget"""
if model not in PRICING:
raise ValueError(f"Unknown model: {model}")
input_cost = (input_tokens / 1000) * PRICING[model]["input"]
output_cost = (output_tokens / 1000) * PRICING[model]["output"]
total_cost = input_cost + output_cost
self.cost_used += total_cost
if self.cost_used > self.max_cost_dollars:
raise CostBudgetExceeded(
f"Exceeded ${self.max_cost_dollars} budget (used ${self.cost_used:.2f})"
)
def remaining(self) -> float:
"""Get remaining budget"""
return max(0, self.max_cost_dollars - self.cost_used)
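A quick worked example against the pricing table above: 1,000 input tokens and 500 output tokens on gpt-4 cost (1000/1000) × $0.03 + (500/1000) × $0.06 = $0.06.

budget = CostBudget(max_cost_dollars=0.50)
budget.add_cost("gpt-4", input_tokens=1000, output_tokens=500)
print(f"Remaining: ${budget.remaining():.2f}")  # $0.44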
Hard Caps vs Soft Caps
Hard caps: Refuse to add more context or call more tools. Stop immediately.
def call_llm_with_budget(prompt: str, token_budget: TokenBudget, cost_budget: CostBudget):
    """Call LLM if budget allows"""
    if not token_budget.check(prompt):
        raise TokenBudgetExceeded("Not enough tokens remaining")
    # Make call (openai_client: a configured OpenAI client)
response = openai_client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": prompt}]
)
input_tokens = response.usage.prompt_tokens
output_tokens = response.usage.completion_tokens
# Update budgets
token_budget.add_tokens(input_tokens + output_tokens)
cost_budget.add_cost("gpt-4", input_tokens, output_tokens)
return response
Soft caps: Switch to cheaper models or shorter responses. Continue with constraints.
def call_llm_with_soft_cap(prompt: str, token_budget: TokenBudget, cost_budget: CostBudget):
"""Call LLM, downgrade if needed"""
# Try expensive model first
if cost_budget.remaining() > 1.0:
model = "gpt-4"
else:
model = "gpt-3.5-turbo" # Cheaper fallback
# Truncate prompt if needed
max_tokens = token_budget.remaining()
    if token_budget.count_tokens(prompt) > max_tokens:
        # Truncate to fit (truncate_to_tokens: see the helper sketch below)
        prompt = truncate_to_tokens(prompt, max_tokens - 100)  # Leave room for response
response = openai_client.chat.completions.create(
model=model,
messages=[{"role": "user", "content": prompt}],
max_tokens=min(500, token_budget.remaining()) # Limit response length
)
input_tokens = response.usage.prompt_tokens
output_tokens = response.usage.completion_tokens
token_budget.add_tokens(input_tokens + output_tokens)
cost_budget.add_cost(model, input_tokens, output_tokens)
return response
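The soft-cap snippet assumes a truncate_to_tokens helper. A minimal sketch with tiktoken (hard-coding the gpt-4 encoding, as elsewhere in this article):

import tiktoken

def truncate_to_tokens(text: str, max_tokens: int) -> str:
    """Encode, cut to the token limit, and decode back to text."""
    if max_tokens <= 0:
        return ""
    encoding = tiktoken.encoding_for_model("gpt-4")
    tokens = encoding.encode(text)
    return encoding.decode(tokens[:max_tokens])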
Predictable Costs
Design workflows with predictable costs. “This workflow costs between $0.10 and $0.50.”
def estimate_workflow_cost(workflow_steps: list) -> dict:
"""Estimate cost range for workflow"""
min_cost = 0.0
max_cost = 0.0
for step in workflow_steps:
# Estimate based on step type
if step["type"] == "llm_call":
min_cost += 0.01 # Minimal prompt
max_cost += 0.10 # Large context
elif step["type"] == "tool_call":
min_cost += 0.0 # Free tool
max_cost += 0.05 # Expensive API
return {
"min_cost": min_cost,
"max_cost": max_cost,
"estimated_cost": (min_cost + max_cost) / 2
}
Users can see cost estimates before running. They can set budgets accordingly.
Tool and Permission Scoping
Agents shouldn’t see tools they don’t need. A read-only agent shouldn’t have delete permissions. A user-facing agent shouldn’t have admin tools.
Principle: Minimum Required Access
Give agents the minimum tools they need. Nothing more.
Role-Based Tool Sets
Different roles get different tools.
TOOL_PERMISSIONS = {
"user": ["search_kb", "create_ticket", "get_user_info"],
"admin": ["search_kb", "create_ticket", "get_user_info", "delete_ticket", "modify_user"],
"readonly": ["search_kb", "get_user_info"]
}
def get_tools_for_role(role: str) -> List[str]:
"""Get allowed tools for role"""
return TOOL_PERMISSIONS.get(role, [])
Environment-Based Tool Sets
Different environments get different tools. Production has restrictions. Development has more access.
ENV_TOOLS = {
"production": ["search_kb", "create_ticket"],
"staging": ["search_kb", "create_ticket", "delete_ticket"],
"development": ["search_kb", "create_ticket", "delete_ticket", "reset_database"]
}
def get_tools_for_env(env: str) -> List[str]:
"""Get allowed tools for environment"""
return ENV_TOOLS.get(env, [])
Data-Scoped Tools
Tools are scoped to data. Per-tenant. Per-user.
def get_tools_for_context(user_id: str, tenant_id: str) -> List[str]:
"""Get tools scoped to user and tenant"""
base_tools = ["search_kb", "create_ticket"]
# Add tenant-specific tools
if tenant_id == "enterprise":
base_tools.append("advanced_analytics")
# Add user-specific tools
if is_admin(user_id):
base_tools.append("delete_ticket")
return base_tools
Tool Capability Matrix
Document who can call what and when.
TOOL_MATRIX = {
"search_kb": {
"roles": ["user", "admin", "readonly"],
"environments": ["production", "staging", "development"],
"rate_limit": "100/hour"
},
"delete_ticket": {
"roles": ["admin"],
"environments": ["staging", "development"],
"rate_limit": "10/hour",
"requires_approval": True
},
"reset_database": {
"roles": ["admin"],
"environments": ["development"],
"rate_limit": "1/day",
"requires_approval": True
}
}
def can_call_tool(tool_name: str, role: str, env: str) -> bool:
"""Check if tool can be called"""
if tool_name not in TOOL_MATRIX:
return False
tool = TOOL_MATRIX[tool_name]
return role in tool["roles"] and env in tool["environments"]
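The matrix documents rate limits like "100/hour", but can_call_tool doesn't enforce them. A minimal in-memory sliding-window sketch (the "N/period" parsing and per-user keying are assumptions; production systems would typically use Redis or similar):

import time
from collections import defaultdict, deque

WINDOW_SECONDS = {"hour": 3600, "day": 86400}
_recent_calls = defaultdict(deque)  # (tool_name, user_id) -> call timestamps

def check_rate_limit(tool_name: str, user_id: str) -> bool:
    """Enforce the matrix's 'N/period' strings with a sliding window."""
    limit = TOOL_MATRIX.get(tool_name, {}).get("rate_limit")
    if not limit:
        return True
    count, period = limit.split("/")
    window = WINDOW_SECONDS[period]
    now = time.time()
    calls = _recent_calls[(tool_name, user_id)]
    while calls and calls[0] < now - window:  # drop calls outside the window
        calls.popleft()
    if len(calls) >= int(count):
        return False
    calls.append(now)
    return True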
Logging Tool Calls
Log every tool call with context. User. Tenant. Role. Time.
from datetime import datetime

def log_tool_call(tool_name: str, params: dict, user_context: dict):
    """Log tool call for audit"""
log_entry = {
"timestamp": datetime.utcnow().isoformat(),
"tool": tool_name,
"params": sanitize_params(params), # Remove sensitive data
"user_id": user_context.get("user_id"),
"tenant_id": user_context.get("tenant_id"),
"role": user_context.get("role"),
"environment": user_context.get("environment")
}
# Store in audit log
audit_log.append(log_entry)
# Alert on suspicious patterns
if is_suspicious_pattern(tool_name, user_context):
alert_security_team(log_entry)
This creates an audit trail. You can see who called what. You can detect abuse.
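The logging snippet assumes an is_suspicious_pattern helper; what counts as suspicious is your call. A minimal sketch that flags sensitive tools used by non-admin roles (the tool list is an assumption):

SENSITIVE_TOOLS = {"delete_ticket", "modify_user", "reset_database"}  # assumed list

def is_suspicious_pattern(tool_name: str, user_context: dict) -> bool:
    """Flag sensitive tools invoked outside the admin role."""
    return tool_name in SENSITIVE_TOOLS and user_context.get("role") != "admin"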
Data Boundaries and PII Handling
Agents process user data. They shouldn’t see everything. They shouldn’t keep everything.
Limit What Reaches the Agent
Only send data the agent needs. Not full user profiles. Not complete histories.
def prepare_agent_context(user_id: str, task: str) -> dict:
"""Prepare minimal context for agent"""
# Get only what's needed
if task == "support":
return {
"user_id": user_id,
"recent_tickets": get_recent_tickets(user_id, limit=5),
"account_tier": get_account_tier(user_id)
}
elif task == "billing":
return {
"user_id": user_id,
"current_plan": get_current_plan(user_id),
"billing_history": get_billing_history(user_id, limit=3)
}
else:
return {"user_id": user_id} # Minimal context
Redaction and Masking
Remove PII before sending to the model.
import json
import re
def redact_pii(text: str) -> str:
"""Remove PII from text"""
    # Email
    text = re.sub(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', '[EMAIL]', text)
# Phone
text = re.sub(r'\b\d{3}-\d{3}-\d{4}\b', '[PHONE]', text)
# Credit card (simple pattern)
text = re.sub(r'\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b', '[CARD]', text)
# SSN
text = re.sub(r'\b\d{3}-\d{2}-\d{4}\b', '[SSN]', text)
return text
def prepare_safe_input(user_input: str, context: dict) -> str:
"""Prepare input with PII removed"""
# Redact user input
safe_input = redact_pii(user_input)
# Redact context
safe_context = {}
for key, value in context.items():
if isinstance(value, str):
safe_context[key] = redact_pii(value)
else:
safe_context[key] = value
return f"User: {safe_input}\nContext: {json.dumps(safe_context)}"
Separate Identity from Task Context
Keep identity separate. Don’t send it to the model unless needed.
import json

class AgentRequest:
    """Request with separated identity and context"""
def __init__(self, user_id: str, task_context: dict):
self.user_id = user_id # Kept separate
self.task_context = task_context # Sent to model
def to_prompt(self) -> str:
"""Convert to prompt without identity"""
# Don't include user_id in prompt
return json.dumps(self.task_context)
def get_identity(self) -> str:
"""Get identity separately"""
return self.user_id
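A usage sketch (the context values are illustrative):

request = AgentRequest("user_123", {"account_tier": "pro", "topic": "billing"})
prompt = request.to_prompt()       # goes to the model; contains no user_id
user_id = request.get_identity()   # stays server-side, e.g. for tool calls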
Retention Rules
Set how long you keep agent logs. Delete old data.
from datetime import datetime, timedelta

def cleanup_old_logs(retention_days: int = 30):
    """Delete logs older than retention period"""
    cutoff_date = datetime.utcnow() - timedelta(days=retention_days)
# Delete from database
delete_query = """
DELETE FROM agent_logs
WHERE timestamp < %s
"""
execute_query(delete_query, (cutoff_date,))
# Delete from file storage
delete_old_files(cutoff_date)
Automate this. Run it daily. Don’t keep data forever.
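One way to automate it, as a minimal in-process sketch (in production, cron or a systemd timer is the more usual choice):

import threading

def schedule_daily_cleanup():
    """Run cleanup now, then re-schedule itself every 24 hours."""
    cleanup_old_logs(retention_days=30)
    threading.Timer(24 * 60 * 60, schedule_daily_cleanup).start()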
Putting It Together: A Bounded Agent Wrapper
Combine all boundaries in one place. A wrapper that enforces limits. A wrapper that logs everything.
The BoundedAgent Class
from datetime import datetime
from typing import Any, Dict, List
import time
import json
class BoundedAgent:
"""Agent with all boundaries enforced"""
def __init__(
self,
agent_core, # Your agent implementation
contract: AgentContract,
user_context: dict
):
self.agent_core = agent_core
self.contract = contract
self.user_context = user_context
# Initialize budgets
self.step_budget = StepBudget(contract.max_steps)
self.token_budget = TokenBudget(contract.max_tokens)
self.cost_budget = CostBudget(contract.max_cost_dollars)
self.start_time = time.time()
# Filter tools by permissions
self.available_tools = self._filter_tools()
# Log contract
self._log_contract()
def _filter_tools(self) -> List[str]:
"""Filter tools based on permissions"""
role = self.user_context.get("role", "user")
env = self.user_context.get("environment", "production")
# Get tools for role and environment
role_tools = get_tools_for_role(role)
env_tools = get_tools_for_env(env)
# Intersection: must be in both
allowed = set(role_tools) & set(env_tools)
# Also check contract
allowed = allowed & set(self.contract.allowed_tools)
return list(allowed)
def _log_contract(self):
"""Log the contract for this request"""
request_id = self.user_context.get("request_id", "unknown")
log_contract(self.contract, request_id)
def _check_timeout(self):
"""Check if runtime limit exceeded"""
elapsed = time.time() - self.start_time
if elapsed > self.contract.max_runtime_seconds:
raise TimeoutError(f"Runtime limit exceeded: {elapsed:.2f}s")
def _check_budgets(self):
"""Check if any budget exceeded"""
if not self.step_budget.check():
raise StepLimitExceeded("Step limit exceeded")
if self.token_budget.remaining() < 100:
raise TokenBudgetExceeded("Token budget nearly exhausted")
if self.cost_budget.remaining() < 0.01:
raise CostBudgetExceeded("Cost budget nearly exhausted")
def _log_tool_call(self, tool_name: str, params: dict, result: dict):
"""Log tool call for audit"""
log_tool_call(tool_name, params, self.user_context)
def run(self, user_input: str) -> Dict[str, Any]:
"""Run agent with all boundaries enforced"""
request_id = self.user_context.get("request_id", f"req_{int(time.time())}")
try:
# Prepare safe input
safe_input = prepare_safe_input(user_input, self.user_context.get("context", {}))
# Run agent loop
results = []
while True:
# Check limits
self._check_timeout()
self._check_budgets()
if not self.step_budget.check():
break
# Agent chooses tool
tool_choice = self.agent_core.choose_tool(safe_input, self.available_tools)
if tool_choice is None:
# Agent is done
break
tool_name = tool_choice["name"]
tool_params = tool_choice["params"]
# Validate tool permission
if tool_name not in self.available_tools:
return {
"error": f"Tool {tool_name} not allowed",
"request_id": request_id,
"partial": True,
"results": results
}
# Use step
self.step_budget.use_step()
# Call tool with timeout
try:
tool_result = call_tool_with_timeout(tool_name, tool_params, timeout_seconds=5)
except TimeoutError:
tool_result = {"error": f"Tool {tool_name} timed out"}
# Log tool call
self._log_tool_call(tool_name, tool_params, tool_result)
results.append({
"tool": tool_name,
"result": tool_result
})
# Update agent
self.agent_core.update(tool_result)
# Check if done
if self.agent_core.is_done():
break
# Generate final response
final_response = self.agent_core.generate_response()
# Log final response
self._log_response(request_id, final_response, results)
return {
"response": final_response,
"request_id": request_id,
"complete": True,
"results": results,
"budgets": {
"steps_used": self.step_budget.steps_taken,
"tokens_used": self.token_budget.tokens_used,
"cost_used": self.cost_budget.cost_used
}
}
except TimeoutError as e:
return self._handle_error("timeout", str(e), results, request_id)
except StepLimitExceeded as e:
return self._handle_error("step_limit", str(e), results, request_id)
except TokenBudgetExceeded as e:
return self._handle_error("token_budget", str(e), results, request_id)
except CostBudgetExceeded as e:
return self._handle_error("cost_budget", str(e), results, request_id)
except Exception as e:
return self._handle_error("unknown", str(e), results, request_id)
def _handle_error(self, error_type: str, error_msg: str, results: list, request_id: str):
"""Handle errors gracefully"""
self._log_error(request_id, error_type, error_msg)
return {
"error": error_msg,
"error_type": error_type,
"request_id": request_id,
"partial": True,
"results": results,
"budgets": {
"steps_used": self.step_budget.steps_taken,
"tokens_used": self.token_budget.tokens_used,
"cost_used": self.cost_budget.cost_used
}
}
def _log_response(self, request_id: str, response: str, results: list):
"""Log final response"""
log_entry = {
"request_id": request_id,
"response": response,
"results_count": len(results),
"budgets": {
"steps_used": self.step_budget.steps_taken,
"tokens_used": self.token_budget.tokens_used,
"cost_used": self.cost_budget.cost_used
},
"timestamp": datetime.utcnow().isoformat()
}
print(f"RESPONSE: {json.dumps(log_entry)}")
def _log_error(self, request_id: str, error_type: str, error_msg: str):
"""Log error"""
log_entry = {
"request_id": request_id,
"error_type": error_type,
"error": error_msg,
"timestamp": datetime.utcnow().isoformat()
}
print(f"ERROR: {json.dumps(log_entry)}")
Usage Example
# Define contract
contract = AgentContract(
name="support_bot",
allowed_tools=["search_kb", "create_ticket"],
max_runtime_seconds=30,
max_steps=10,
max_tokens=10000,
max_cost_dollars=0.50,
required_output="text"
)
# User context
user_context = {
"user_id": "user_123",
"tenant_id": "tenant_456",
"role": "user",
"environment": "production",
"request_id": "req_789"
}
# Create bounded agent
agent = BoundedAgent(
agent_core=MyAgentCore(),
contract=contract,
user_context=user_context
)
# Run
result = agent.run("How do I reset my password?")
# Check result
if result.get("error"):
print(f"Error: {result['error']}")
if result.get("partial"):
print(f"Partial results: {result['results']}")
else:
print(f"Response: {result['response']}")
What Happens When Limits Are Hit
When any limit is exceeded, the agent stops. It returns partial results. It logs the error. It doesn’t crash.
Users see a clear error message. They can retry with adjusted parameters. They can contact support.
The system stays stable. One bad request doesn’t break everything.
Checklist: Is Your Agent Safely Bounded?
Use this checklist to verify your agent has proper boundaries.
Time and Steps
- Every agent has a max runtime (wall-clock time)
- Every agent has a max step count (tool calls)
- Timeouts are enforced in code, not just prompts
- Partial results are returned when limits are hit
- Errors are logged when limits are exceeded
Tokens and Cost
- Token usage is tracked for every LLM call
- Cost is tracked for every LLM call
- Hard caps prevent exceeding budgets
- Soft caps downgrade models when budgets are low
- Cost estimates are provided before running workflows
Tools and Permissions
- Agents only see tools they’re allowed to use
- Tool permissions are role-based
- Tool permissions are environment-based
- Tool calls are logged with user and tenant context
- Suspicious tool usage patterns trigger alerts
Data and PII
- PII is redacted before sending to models
- Only necessary data is sent to agents
- Identity is kept separate from task context
- Log retention policies are set and automated
- Old logs are deleted automatically
Contracts and Visibility
- Every agent has a defined contract
- Contracts are visible in code
- Contracts are logged for each request
- Limits are documented
- Errors reference the contract that was violated
Putting It Together
- All boundaries are enforced in one wrapper
- Boundaries are tested
- Boundaries can be adjusted without changing agent code
- Monitoring shows when boundaries are hit
- Alerts notify when boundaries are frequently exceeded
Introducing Boundaries to Existing Systems
If you have an existing agent system, add boundaries gradually:
- Start with logging: Add logging first. See what’s happening.
- Add time limits: Set reasonable timeouts. See if anything breaks.
- Add step limits: Limit tool calls. See if agents complete tasks.
- Add token budgets: Track tokens. Set caps. See actual costs.
- Add tool permissions: Restrict tools. See what’s actually needed.
- Add data boundaries: Redact PII. Scope data. See what breaks.
Don’t add everything at once. Add one boundary at a time. Test. Adjust. Repeat.
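Adjusting boundaries without changing agent code falls out naturally if contracts load from config. A sketch (the file format and field names are assumptions):

import json

def load_contract(path: str) -> AgentContract:
    """Build an AgentContract from a JSON config file."""
    with open(path) as f:
        cfg = json.load(f)
    return AgentContract(
        name=cfg["name"],
        allowed_tools=cfg["allowed_tools"],
        max_runtime_seconds=cfg["max_runtime_seconds"],
        max_steps=cfg["max_steps"],
        max_tokens=cfg["max_tokens"],
        max_cost_dollars=cfg["max_cost_dollars"],
        required_output=cfg["required_output"],
    )

Raising a limit becomes a config change and a deploy, not an agent rewrite.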
Conclusion
Hard boundaries make runaway agent behavior impossible. They enforce limits at the system level. They don’t rely on prompts or agent intelligence.
Start with contracts. Define what agents can do. Then enforce limits. Time. Steps. Tokens. Costs. Tools. Data.
Put it all in a wrapper. One place that enforces everything. One place that logs everything.
When limits are hit, fail gracefully. Return partial results. Log errors. Don’t crash.
This isn’t optional. It’s a production requirement. Agents without boundaries will cause problems. They’ll cost too much. They’ll run too long. They’ll access things they shouldn’t.
Add boundaries. Test them. Monitor them. Adjust them. Your agents will be safer. Your costs will be predictable. Your users will be protected.