By Appropri8 Team

Designing Secure and Composable AI Agents with Function-Level Sandboxing

Tags: ai, security, architecture, agents, sandboxing

AI agents are getting smarter. They can now call APIs, read databases, and even interact with your file system. But here’s the thing - giving an AI agent full access to your tools is like handing your house keys to a stranger. You need boundaries.

This article shows you how to build secure AI agents using function-level sandboxing. We’ll cover the real problems you’ll face, practical solutions, and code you can actually use.

The Problem We’re Solving

Let me tell you about a few disasters I’ve seen:

Case 1: The Over-Permissive Toolkit

A team built an AI agent that could “help with data analysis.” They gave it access to their entire database. The agent was supposed to generate reports, but instead it started deleting tables when it got confused about a query. They lost three months of customer data.

Case 2: The API Key Leak

Another team let their agent make HTTP calls to external services. The agent was supposed to fetch weather data, but it ended up posting internal API keys to a public GitHub repository. The keys had admin access to their production systems.

Case 3: The File System Nightmare

One developer gave their agent read-write access to the entire file system. The agent was supposed to organize documents, but it started moving files around randomly. It took them two weeks to figure out where everything went.

These aren’t edge cases. They happen because we’re giving AI agents too much power without proper controls.

Why Traditional Security Doesn’t Work

API keys and user permissions work great for humans. Humans understand context. They know not to delete the production database at 3 AM.

AI agents don’t have that context. They’ll do exactly what you tell them to do, even if it’s dangerous. They can’t distinguish between “read the customer data” and “delete the customer data” unless you explicitly tell them.

Traditional security assumes the user is making conscious decisions. With AI agents, the “user” is a language model that might hallucinate or get confused. You need different rules.

Function-Level Sandboxing: The Solution

Function-level sandboxing means giving each agent only the tools it actually needs, with strict limits on what it can do with them.

Think of it like this: instead of giving an agent a master key to your entire system, you give it specific keys for specific doors, and those keys expire after a certain time.

Here’s how it works:

  1. Capability Tokens: Each agent gets a token that lists exactly what it can do
  2. Policy Enforcement: Every action goes through a middleware that checks if it’s allowed
  3. Time and Scope Limits: Access expires automatically, and agents can only touch specific resources

The Architecture

Function-Level Sandboxing Architecture

The diagram above shows the basic flow. The AI agent requests access to a tool, the sandbox middleware checks the policy, and if authorized, the tool access is granted with strict limits.

Implementation Blueprint

Let’s build this step by step. I’ll show you the core components you need.

1. The Capability Token System

First, we need a way to define what each agent can do:

from dataclasses import dataclass
from typing import List, Dict, Any
from datetime import datetime, timedelta
import json

@dataclass
class CapabilityToken:
    agent_id: str
    allowed_tools: List[str]
    time_limit: int  # seconds
    scope: str  # resource scope like "/data/analytics/*"
    expires_at: datetime
    
    def is_valid(self) -> bool:
        return datetime.now() < self.expires_at
    
    def can_access_tool(self, tool_name: str) -> bool:
        return tool_name in self.allowed_tools and self.is_valid()
    
    def to_dict(self) -> Dict[str, Any]:
        return {
            "agent_id": self.agent_id,
            "allowed_tools": self.allowed_tools,
            "time_limit": self.time_limit,
            "scope": self.scope,
            "expires_at": self.expires_at.isoformat()
        }
    
    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> 'CapabilityToken':
        return cls(
            agent_id=data["agent_id"],
            allowed_tools=data["allowed_tools"],
            time_limit=data["time_limit"],
            scope=data["scope"],
            expires_at=datetime.fromisoformat(data["expires_at"])
        )
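
To make this concrete, here’s a quick sketch of minting a token from a policy entry and checking it. The issue_token helper is just for illustration; it isn’t part of any framework:

def issue_token(agent_id: str, policy: Dict[str, Any]) -> CapabilityToken:
    """Illustrative helper: mint a short-lived token from a policy entry."""
    return CapabilityToken(
        agent_id=agent_id,
        allowed_tools=policy["allowed_tools"],
        time_limit=policy["time_limit"],
        scope=policy["scope"],
        expires_at=datetime.now() + timedelta(seconds=policy["time_limit"])
    )

token = issue_token("data_analyst", {
    "allowed_tools": ["file_read", "db_query"],
    "time_limit": 3600,
    "scope": "/data/analytics/*"
})
print(token.can_access_tool("file_read"))   # True, until the token expires
print(token.can_access_tool("file_write"))  # False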

2. The Sandbox Middleware

This is where the magic happens. Every tool call goes through here:

import logging
import json
import re
from datetime import datetime
from functools import wraps
from typing import Callable, Any, Dict

class SandboxMiddleware:
    def __init__(self):
        self.logger = logging.getLogger(__name__)
        self.policies = {}
    
    def register_policy(self, agent_id: str, policy: Dict[str, Any]):
        """Register a policy for an agent"""
        self.policies[agent_id] = policy
        self.logger.info(f"Registered policy for agent {agent_id}")
    
    def check_permission(self, agent_id: str, tool_name: str, resource: str = None) -> bool:
        """Check if an agent can access a tool"""
        if agent_id not in self.policies:
            self.logger.warning(f"No policy found for agent {agent_id}")
            return False
        
        policy = self.policies[agent_id]
        
        # Check if tool is allowed
        if tool_name not in policy.get("allowed_tools", []):
            self.logger.warning(f"Tool {tool_name} not allowed for agent {agent_id}")
            return False
        
        # Check scope if resource is specified
        if resource and "scope" in policy:
            # Escape the scope before turning "*" into a regex wildcard,
            # then require the whole resource to match, not just a prefix
            scope_pattern = re.escape(policy["scope"]).replace(r"\*", ".*")
            if not re.fullmatch(scope_pattern, resource):
                self.logger.warning(f"Resource {resource} not in scope for agent {agent_id}")
                return False
        
        return True
    
    def log_action(self, agent_id: str, tool_name: str, resource: str, success: bool):
        """Log all agent actions for audit"""
        action = {
            "timestamp": datetime.now().isoformat(),
            "agent_id": agent_id,
            "tool_name": tool_name,
            "resource": resource,
            "success": success
        }
        self.logger.info(f"Agent action: {json.dumps(action)}")
    
    def sandboxed_tool(self, tool_name: str):
        """Decorator to sandbox a tool function"""
        def decorator(func: Callable) -> Callable:
            @wraps(func)
            def wrapper(*args, **kwargs):
                # Extract agent_id and the target resource from the call.
                # The tools below pass the resource as file_path, directory_path,
                # or the second positional argument.
                agent_id = kwargs.get('agent_id') or (args[0] if args else None)
                resource = (kwargs.get('resource')
                            or kwargs.get('file_path')
                            or kwargs.get('directory_path')
                            or (args[1] if len(args) > 1 else None))
                
                if not agent_id:
                    raise ValueError("agent_id is required for sandboxed tools")
                
                # Check permissions
                if not self.check_permission(agent_id, tool_name, resource):
                    self.log_action(agent_id, tool_name, resource, False)
                    raise PermissionError(f"Agent {agent_id} not authorized to use {tool_name}")
                
                # Execute the function
                try:
                    result = func(*args, **kwargs)
                    self.log_action(agent_id, tool_name, resource, True)
                    return result
                except Exception as e:
                    self.log_action(agent_id, tool_name, resource, False)
                    raise
            
            return wrapper
        return decorator

# Global middleware instance
sandbox = SandboxMiddleware()

3. Sandboxed Tool Implementations

Now let’s create some safe tool wrappers:

import os
import requests
from typing import Any, List

class SandboxedFileSystem:
    @staticmethod
    @sandbox.sandboxed_tool("file_read")
    def read_file(agent_id: str, file_path: str) -> str:
        """Read a file with sandboxing"""
        try:
            with open(file_path, 'r') as f:
                return f.read()
        except Exception as e:
            raise IOError(f"Failed to read file {file_path}: {str(e)}")
    
    @staticmethod
    @sandbox.sandboxed_tool("file_list")
    def list_directory(agent_id: str, directory_path: str) -> List[str]:
        """List directory contents with sandboxing"""
        try:
            return os.listdir(directory_path)
        except Exception as e:
            raise IOError(f"Failed to list directory {directory_path}: {str(e)}")

class SandboxedHTTPClient:
    @staticmethod
    @sandbox.sandboxed_tool("http_get")
    def get(agent_id: str, url: str, **kwargs) -> requests.Response:
        """Make HTTP GET request with sandboxing"""
        try:
            response = requests.get(url, **kwargs)
            return response
        except Exception as e:
            raise IOError(f"HTTP GET failed for {url}: {str(e)}")
    
    @staticmethod
    @sandbox.sandboxed_tool("http_post")
    def post(agent_id: str, url: str, **kwargs) -> requests.Response:
        """Make HTTP POST request with sandboxing"""
        try:
            response = requests.post(url, **kwargs)
            return response
        except Exception as e:
            raise IOError(f"HTTP POST failed for {url}: {str(e)}")

class SandboxedDatabase:
    @staticmethod
    @sandbox.sandboxed_tool("db_query")
    def execute_query(agent_id: str, query: str, connection) -> Any:
        """Execute database query with sandboxing"""
        # Only allow SELECT queries
        if not query.strip().upper().startswith('SELECT'):
            raise PermissionError("Only SELECT queries are allowed")
        
        try:
            cursor = connection.cursor()
            cursor.execute(query)
            return cursor.fetchall()
        except Exception as e:
            raise IOError(f"Database query failed: {str(e)}")

4. Policy Configuration

Here’s how you define policies for different agents:

{
  "data_analyst": {
    "allowed_tools": ["file_read", "file_list", "db_query"],
    "time_limit": 3600,
    "scope": "/data/analytics/*",
    "description": "Can read analytics data and query database"
  },
  "content_writer": {
    "allowed_tools": ["file_read", "http_get"],
    "time_limit": 1800,
    "scope": "/content/*",
    "description": "Can read content files and fetch external data"
  },
  "system_monitor": {
    "allowed_tools": ["file_list", "http_get"],
    "time_limit": 300,
    "scope": "/logs/*",
    "description": "Can monitor system logs and health endpoints"
  }
}

5. Putting It All Together

Here’s how you’d use this in practice:

def main():
    # Set up logging
    logging.basicConfig(level=logging.INFO)
    
    # Load policies
    with open('agent_policies.json', 'r') as f:
        policies = json.load(f)
    
    # Register policies
    for agent_id, policy in policies.items():
        sandbox.register_policy(agent_id, policy)
    
    # Example: Data analyst agent
    agent_id = "data_analyst"
    
    try:
        # This will work - file is in allowed scope
        content = SandboxedFileSystem.read_file(
            agent_id=agent_id, 
            file_path="/data/analytics/sales_report.csv"
        )
        print("Successfully read file")
        
        # This will fail - file is outside allowed scope
        try:
            SandboxedFileSystem.read_file(
                agent_id=agent_id, 
                file_path="/etc/passwd"
            )
        except PermissionError as e:
            print(f"Access denied: {e}")
        
        # This will fail - http_get is not in the data analyst's policy
        try:
            SandboxedHTTPClient.get(
                agent_id=agent_id,
                url="https://api.example.com/weather"
            )
        except PermissionError as e:
            print(f"Access denied: {e}")
        
    except Exception as e:
        print(f"Error: {e}")

if __name__ == "__main__":
    main()

Framework Integration

You can integrate this sandboxing approach with existing AI agent frameworks:

OpenDevin Integration

from opendevin import Agent

class SandboxedOpenDevinAgent(Agent):
    def __init__(self, agent_id: str, policy: Dict[str, Any]):
        super().__init__()
        self.agent_id = agent_id
        sandbox.register_policy(agent_id, policy)
    
    def execute_tool(self, tool_name: str, **kwargs):
        # All tool executions go through sandbox
        kwargs['agent_id'] = self.agent_id
        return super().execute_tool(tool_name, **kwargs)

LangGraph Integration

from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class AgentState(TypedDict):
    messages: list

def create_sandboxed_langgraph_agent(agent_id: str, policy: Dict[str, Any]):
    # Register policy
    sandbox.register_policy(agent_id, policy)
    
    def sandboxed_node(state: AgentState) -> AgentState:
        # Your agent logic here
        # All tool calls will be sandboxed
        return state
    
    # Build your graph with sandboxed nodes
    workflow = StateGraph(AgentState)
    workflow.add_node("agent", sandboxed_node)
    workflow.add_edge(START, "agent")
    workflow.add_edge("agent", END)
    return workflow.compile()

Best Practices

Here are the key principles to follow:

1. Principle of Least Privilege

Give agents only what they need. If an agent is supposed to read files, don’t give it write access. If it needs to query a database, only give it SELECT permissions.

2. Time-Limited Access

Capability tokens should expire. Don’t give agents permanent access to anything. Set reasonable time limits based on the task.

3. Comprehensive Logging

Log everything. Every tool call, every permission check, every failure. You need to know what your agents are doing.
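
For example, here’s a minimal sketch that sends the sandbox’s audit records to a dedicated append-only file as well as the console. The file name is just a placeholder:

import logging

# A dedicated file handler for audit records
audit_handler = logging.FileHandler("agent_audit.log")
audit_handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

# Attaching it to the root logger captures SandboxMiddleware's records too,
# since its logger propagates to the root by default
logging.getLogger().addHandler(audit_handler)
logging.getLogger().setLevel(logging.INFO)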

4. Scope Restrictions

Use file paths, database schemas, and API endpoints to limit what agents can access. Don’t give them access to everything.

5. Regular Policy Reviews

Review and update your policies regularly. As your agents evolve, their permissions should evolve too.

Advanced Features

Integration with Casbin

For complex permission systems, you can integrate with Casbin:

import casbin

class CasbinPolicyEngine:
    def __init__(self, model_path: str, policy_path: str):
        self.enforcer = casbin.Enforcer(model_path, policy_path)
    
    def check_permission(self, agent_id: str, tool_name: str, resource: str) -> bool:
        return self.enforcer.enforce(agent_id, resource, tool_name)
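
Here’s a lightweight sketch of plugging the engine into the middleware, assuming you’ve written a Casbin model and policy file. The file names and the "*" wildcard resource are placeholders for whatever your policy actually defines:

# Hypothetical file names - supply your own Casbin model and policy
engine = CasbinPolicyEngine("rbac_model.conf", "agent_policies.csv")

def casbin_check(agent_id: str, tool_name: str, resource: str = None) -> bool:
    # Fall back to a wildcard resource when a tool has no resource argument
    return engine.check_permission(agent_id, tool_name, resource or "*")

# Swap the middleware's built-in check for the Casbin-backed one
sandbox.check_permission = casbin_check

With this in place, the allow/deny decision comes from Casbin’s policy file instead of the in-memory policies dict.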

Open Policy Agent (OPA) Integration

For even more sophisticated policies:

import requests

class OPAPolicyEngine:
    def __init__(self, opa_url: str):
        self.opa_url = opa_url
    
    def check_permission(self, agent_id: str, tool_name: str, resource: str) -> bool:
        policy_input = {
            "agent_id": agent_id,
            "tool_name": tool_name,
            "resource": resource
        }
        
        # Query a boolean rule (assumed here to be named "allow") inside the
        # agent_permissions package, rather than the whole package document
        response = requests.post(
            f"{self.opa_url}/v1/data/agent_permissions/allow",
            json={"input": policy_input},
            timeout=5
        )
        
        return response.json().get("result", False)
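
Usage has the same shape as the Casbin engine. OPA’s REST API listens on port 8181 by default; the package and rule names must match whatever Rego policy you’ve loaded into the server:

engine = OPAPolicyEngine("http://localhost:8181")
if engine.check_permission("data_analyst", "file_read", "/data/analytics/sales_report.csv"):
    print("allowed")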

Security Considerations

1. Token Management

Store capability tokens securely. Use proper encryption and rotation. Don’t hardcode them in your source code.
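
One option is to sign serialized tokens so they can’t be tampered with in transit. Here’s a minimal sketch using the standard library’s hmac module; the secret would come from your secret manager, never from source code:

import hmac
import hashlib
import json

def sign_token(token: CapabilityToken, secret: bytes) -> str:
    """Attach an HMAC signature to a serialized token."""
    payload = json.dumps(token.to_dict(), sort_keys=True)
    signature = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{signature}"

def verify_token(signed: str, secret: bytes) -> CapabilityToken:
    """Reject any token whose signature doesn't match."""
    payload, signature = signed.rsplit(".", 1)
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise PermissionError("Invalid token signature")
    return CapabilityToken.from_dict(json.loads(payload))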

2. Network Security

If your agents make network calls, use proper TLS/SSL. Validate certificates and use secure protocols.
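
A small extra guard you can layer onto the HTTP tools is requiring HTTPS and restricting outbound calls to an allowlist of hosts. A sketch, with an example host:

from urllib.parse import urlparse

# Example allowlist - replace with the hosts your agents actually need
ALLOWED_HOSTS = {"api.example.com"}

def is_url_allowed(url: str) -> bool:
    """Allow only HTTPS requests to approved hosts."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in ALLOWED_HOSTS

You would call this inside SandboxedHTTPClient.get and post before the request goes out.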

3. Input Validation

Always validate inputs before passing them to tools. Don’t trust user input or agent-generated content.
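
Path traversal is the classic example: an agent-supplied path like /data/analytics/../../etc/passwd can slip past a naive prefix check. Here’s a minimal sketch (Python 3.9+) that resolves the path before comparing it to the allowed root:

from pathlib import Path

def validate_path(file_path: str, allowed_root: str) -> Path:
    """Resolve the path and make sure it stays inside the allowed root."""
    resolved = Path(file_path).resolve()
    root = Path(allowed_root).resolve()
    if not resolved.is_relative_to(root):
        raise PermissionError(f"{file_path} is outside {allowed_root}")
    return resolved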

4. Resource Limits

Set limits on memory usage, CPU time, and network bandwidth. Agents shouldn’t be able to consume unlimited resources.
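
On Unix systems, the standard library’s resource module gives you a blunt but effective version of this when you run tool calls in a worker process. The limits below are arbitrary examples:

import resource

def apply_resource_limits():
    """Cap address space and CPU time for the current (worker) process."""
    # Roughly 512 MB of address space
    resource.setrlimit(resource.RLIMIT_AS, (512 * 1024 * 1024, 512 * 1024 * 1024))
    # 30 seconds of CPU time
    resource.setrlimit(resource.RLIMIT_CPU, (30, 30))

# Typically passed as preexec_fn when launching a tool worker via subprocess,
# so the limits apply to the child process rather than your main service.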

Moving Toward Zero-Trust AI

The future of AI agent security is zero-trust. Every action should be verified, every permission should be explicit, and every access should be logged.

This isn’t just about preventing disasters. It’s about building AI systems that you can actually trust in production. Systems that can work with your data without putting it at risk.

The sandboxing approach I’ve shown you is a starting point. As AI agents become more capable, we’ll need even more sophisticated security controls. But the principles remain the same: least privilege, explicit permissions, and comprehensive monitoring.

Start with these basics. Build your security layer first, then add the AI capabilities. Your future self will thank you.

Conclusion

AI agents are powerful tools, but they need proper boundaries. Function-level sandboxing gives you the control you need to deploy them safely.

The key is to think of security as a first-class citizen in your agent architecture. Don’t bolt it on later. Build it in from the start.

Use capability tokens to define what agents can do. Use middleware to enforce those limits. Use logging to monitor what’s happening. And always follow the principle of least privilege.

Your agents will be more secure, your data will be safer, and you’ll sleep better at night knowing that your AI systems can’t accidentally destroy your infrastructure.

The code examples in this article are a solid starting point rather than a finished product. Start with them, adapt and harden them for your environment, and build the secure AI agent system your organization deserves.
