By Yusuf Elborey

Put a Gateway in Front of Your Agents: Tool Access Control, Secrets, and Audit Trails

Agents that call tools directly are fast to ship. They’re also hard to govern. Credentials sprawl, rate limiting is inconsistent, and “who did what, and why?” becomes hard to answer. The fix is a single outbound path for tool calls: an Agent Gateway.

This article is for platform and DevOps folks who need governance without killing velocity. You get one control plane: auth, policy, budgets, logging, and redaction. Think of it as an execution firewall for tool calls.

What Goes Wrong When Agents Call Tools Directly

Three things bite teams first.

Credential sprawl. Every agent or service that calls a tool needs credentials. API keys, OAuth tokens, database URLs. They end up in env vars, config files, or worse—hardcoded. Rotating them is a mess. One leak and you’re revoking everything.

No consistent rate limiting. One agent can hammer an API. Another can spin up 100 tool calls in a loop. Without a central choke point, you can’t enforce per-agent or per-team budgets. Cost and abuse get out of hand.

Hard to audit. When something goes wrong, you need to know: which agent called which tool, with what inputs, and what happened. With direct tool calls, you’re piecing together logs from multiple services. There’s no single place that says “this call was allowed or denied, and here’s why.”

An Agent Gateway fixes this by being the only path agents use to call tools. All control lives there.

Define the Agent Gateway

The Agent Gateway is one outbound path for tool calls. Agents don’t talk to tools directly. They send a request to the gateway. The gateway handles:

  • AuthN/AuthZ — Who is this agent? What role does it have?
  • Policy — Is this tool allowed for this role? Are the inputs valid?
  • Budgets — Rate limits, cost caps, timeouts per agent or team
  • Logging — Every request and decision, with redaction for secrets and PII
  • Secrets — Agents never see raw credentials; the gateway injects them at execution time

You get a clean boundary: agents send signed, typed requests; the gateway enforces policy, resolves secrets, calls the tool, and writes an audit record.

A Clean Tool Contract

Tools should be explicit and strict. That means:

Tool schemas and strict validation. Each tool has a JSON schema. The gateway validates every request against it. Unknown or malformed arguments are rejected before any execution.

Typed inputs, bounded outputs, safe defaults. Inputs are typed (string length, number ranges, enums). Outputs can be size-limited or truncated. Defaults are safe (e.g., read-only unless explicitly allowed).

Deny-by-default tool registry. Only tools that are registered and allowed for a given role can be called. If a tool isn’t in the registry, the request is denied. No “allow everything” mode.

Example contract:

# Tool registry entry: name, schema, allowed_roles, requires_approval
TOOL_REGISTRY = {
    "read_file": {
        "schema": {
            "type": "object",
            "properties": {"path": {"type": "string", "maxLength": 512}},
            "required": ["path"],
            "additionalProperties": False,
        },
        "allowed_roles": ["read_only_agent", "action_agent"],
        "requires_approval": False,
    },
    "delete_file": {
        "schema": {
            "type": "object",
            "properties": {"path": {"type": "string", "maxLength": 512}},
            "required": ["path"],
            "additionalProperties": False,
        },
        "allowed_roles": ["action_agent"],
        "requires_approval": True,
    },
}

The gateway checks the agent’s role, validates the payload against the schema, and checks whether the tool requires human approval.
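Those three checks can be sketched in a few lines. This is a minimal illustration, not the repository's implementation: `validate_args` and `check_request` are hypothetical names, and the hand-rolled validator only understands the schema keywords used in this article. A real gateway would more likely delegate to a full JSON Schema validator.

```python
# Registry shaped like the example contract above.
_PATH_SCHEMA = {
    "type": "object",
    "properties": {"path": {"type": "string", "maxLength": 512}},
    "required": ["path"],
    "additionalProperties": False,
}
TOOL_REGISTRY = {
    "read_file": {
        "schema": _PATH_SCHEMA,
        "allowed_roles": ["read_only_agent", "action_agent"],
        "requires_approval": False,
    },
    "delete_file": {
        "schema": _PATH_SCHEMA,
        "allowed_roles": ["action_agent"],
        "requires_approval": True,
    },
}


def validate_args(schema: dict, args) -> list:
    """Return a list of validation errors; an empty list means valid.

    Assumption: only 'required', 'additionalProperties', and string
    'maxLength' are supported, which is enough for the schemas above.
    """
    if not isinstance(args, dict):
        return ["arguments must be an object"]
    errors = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            errors.append(f"missing required field: {name}")
    for name, value in args.items():
        if name not in props:
            if schema.get("additionalProperties", True) is False:
                errors.append(f"unknown field: {name}")
            continue
        spec = props[name]
        if spec.get("type") == "string":
            if not isinstance(value, str):
                errors.append(f"{name} must be a string")
            elif len(value) > spec.get("maxLength", float("inf")):
                errors.append(f"{name} exceeds maxLength")
    return errors


def check_request(role: str, tool_name: str, args) -> dict:
    """Deny by default: unknown tools, wrong roles, and bad payloads all fail."""
    entry = TOOL_REGISTRY.get(tool_name)
    if entry is None:
        return {"decision": "deny", "reason": "unknown tool"}
    if role not in entry["allowed_roles"]:
        return {"decision": "deny", "reason": "role not allowed"}
    errors = validate_args(entry["schema"], args)
    if errors:
        return {"decision": "deny", "reason": "; ".join(errors)}
    if entry["requires_approval"]:
        return {"decision": "pending_approval", "reason": "human approval required"}
    return {"decision": "allow", "reason": "ok"}
```

Every branch returns a reason alongside the decision, so the same value can go straight into the audit record.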

Policy Enforcement That’s Actually Usable

Policy should be easy to reason about and change.

Allowlist per agent role. Define roles like read_only_agent and action_agent. Each tool lists which roles can call it. A read-only agent can call read_file but not delete_file. Simple and auditable.

Data rules: block PII exfiltration, redact logs. Policy can block tool calls that might exfiltrate PII (e.g., a send_email call whose body matches SSN patterns). All audit logs redact known secret and PII patterns before being written.

Environment rules: staging vs prod. Tools can be tagged by environment. The gateway only allows staging tools in staging and prod tools in prod. Prevents “oops, the agent just ran a delete in prod.”

Example policy (conceptual), written here as the inverse mapping from role to tools:

# Allowlist: role -> list of tools
ROLE_ALLOWLIST = {
    "read_only_agent": ["read_file", "search"],
    "action_agent": ["read_file", "search", "write_file", "delete_file"],
}

# Forbidden fields in request body (deny request if present)
FORBIDDEN_FIELDS = ["password", "api_key", "token"]

You can implement this in code or drive it from a config file or Open Policy Agent (OPA). The important part is having one place that says “this role can call these tools, and requests containing these fields are denied.”
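As a rough sketch of how the allowlist, forbidden fields, and environment rules might combine into one decision function: `TOOL_ENVIRONMENTS`, `find_forbidden`, and `evaluate_policy` are hypothetical names, and the environment tags are made up for illustration. The forbidden-field check walks nested payloads, on the assumption that a secret is just as unwelcome three levels deep as at the top.

```python
ROLE_ALLOWLIST = {
    "read_only_agent": ["read_file", "search"],
    "action_agent": ["read_file", "search", "write_file", "delete_file"],
}
FORBIDDEN_FIELDS = {"password", "api_key", "token"}

# Assumed tagging scheme: the environments each tool may run in.
TOOL_ENVIRONMENTS = {
    "read_file": {"staging", "prod"},
    "search": {"staging", "prod"},
    "write_file": {"staging", "prod"},
    "delete_file": {"staging"},
}


def find_forbidden(value, path=""):
    """Yield dotted paths of forbidden keys anywhere in a nested payload."""
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else str(key)
            if str(key).lower() in FORBIDDEN_FIELDS:
                yield child_path
            yield from find_forbidden(child, child_path)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from find_forbidden(child, f"{path}[{index}]")


def evaluate_policy(role, tool, args, environment):
    """Return (allowed, reason). Anything not explicitly allowed is denied."""
    if tool not in ROLE_ALLOWLIST.get(role, []):
        return False, f"tool '{tool}' not on allowlist for role '{role}'"
    if environment not in TOOL_ENVIRONMENTS.get(tool, set()):
        return False, f"tool '{tool}' not enabled in {environment}"
    forbidden = list(find_forbidden(args))
    if forbidden:
        return False, "forbidden fields present: " + ", ".join(forbidden)
    return True, "ok"
```

Returning the reason string with every denial keeps the policy auditable: the log can record why a call was rejected, not just that it was.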

Secrets and Identity

Credentials live on the gateway side of the boundary, not in the agent.

Agents never see raw credentials. The agent sends a tool name and arguments. It does not send API keys or passwords. The gateway holds the mapping from “this tool in this context” to the right credential.

Gateway uses workload identity or short-lived tokens. Where possible, the gateway uses workload identity (e.g., service account, IAM role) to call downstream services. When it needs a token, it fetches a short-lived one from a secret manager and injects it only for the duration of the call.

Secret injection only at execution time. Right before calling the tool adapter, the gateway resolves secrets and passes them to the adapter. They are never logged or returned to the agent.

# Agent sends only tool name + args; gateway resolves secrets at execution time.
# allow_tool, PolicyDenied, secret_store, and tool_adapter are gateway components defined elsewhere.
def execute_tool(tool_name: str, args: dict, agent_id: str, role: str):
    if not allow_tool(role, tool_name):
        raise PolicyDenied("Tool not allowed for role")
    secrets = secret_store.get_for_tool(tool_name, agent_id)  # short-lived
    try:
        return tool_adapter.call(tool_name, args, secrets)
    finally:
        # Invalidate even if the adapter raises, so a failed call leaves no live credential behind
        secret_store.revoke(secrets)

This keeps credentials out of agent memory and logs.
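To make the "short-lived, injected only at execution time" idea concrete, here is a toy in-memory stand-in for a secret manager. Everything in it is an assumption for illustration: `InMemorySecretStore`, `execute_with_secret`, the 30-second TTL, and the adapter signature. A production gateway would fetch tokens from a real secret manager or rely on workload identity instead.

```python
import secrets
import time


class InMemorySecretStore:
    """Toy secret store that issues random, short-lived, per-tool tokens."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._live = {}  # token -> (tool_name, agent_id, expires_at)

    def get_for_tool(self, tool_name, agent_id):
        token = secrets.token_urlsafe(16)
        self._live[token] = (tool_name, agent_id, self._clock() + self._ttl)
        return token

    def is_valid(self, token, tool_name):
        record = self._live.get(token)
        if record is None:
            return False
        issued_for, _agent_id, expires_at = record
        return issued_for == tool_name and self._clock() < expires_at

    def revoke(self, token):
        # Revoking an unknown or already-revoked token is a no-op.
        self._live.pop(token, None)


def execute_with_secret(store, adapter, tool_name, args, agent_id):
    """Issue a token, hand it to the adapter, and revoke it afterwards.

    The token exists only inside this function: it is never returned to
    the caller and is revoked even when the adapter raises.
    """
    token = store.get_for_tool(tool_name, agent_id)
    try:
        return adapter(tool_name, args, token)
    finally:
        store.revoke(token)
```

The injectable `clock` is there so expiry can be exercised in tests without sleeping.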

Human-in-the-Loop Where It Matters

Not every tool call needs a human. High-risk actions should.

Approval only for high-risk actions. Money movement, deletes, privilege changes—these go into a pending state and require a human approval (or a break-glass flow). Low-risk reads and searches can be automatic.

Queue, timeout, fallback. When approval is required, the request is queued. It has a timeout; if no one approves in time, it can be rejected or escalated. You can define a fallback (e.g., “reject” or “notify and hold”).

Example flow:

  1. Agent sends delete_file request.
  2. Gateway validates and sees requires_approval: True. It writes the request to an approval queue and returns pending_approval with an approval_id.
  3. A human (or automated policy) approves the request using approval_id.
  4. Gateway executes the tool call and returns the result to the agent (or notifies the agent that the action was completed).

So: automatic path for safe tools, approval path for dangerous ones.
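The queue, timeout, and fallback behaviour can be sketched with an in-memory queue. This is an illustration under stated assumptions, not the repository's approval flow: `ApprovalQueue` and its methods are hypothetical, the fallback is hard-coded to "reject", and a real deployment would persist pending requests and authenticate the approver.

```python
import time
import uuid


class ApprovalQueue:
    """Holds high-risk requests until a human approves them or they time out."""

    def __init__(self, timeout_seconds=300.0, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock
        self._pending = {}  # approval_id -> (request, deadline)

    def submit(self, request):
        """Queue a request and return the response sent back to the agent."""
        approval_id = f"approv_{uuid.uuid4().hex[:12]}"
        self._pending[approval_id] = (request, self._clock() + self._timeout)
        return {"status": "pending_approval", "approval_id": approval_id}

    def approve(self, approval_id, execute):
        """Run the queued request via execute(request) if it is still valid."""
        entry = self._pending.pop(approval_id, None)
        if entry is None:
            return {"status": "rejected", "reason": "unknown or already resolved"}
        request, deadline = entry
        if self._clock() > deadline:
            # Fallback policy assumed here: late approvals are rejected.
            return {"status": "rejected", "reason": "approval timed out"}
        return {"status": "completed", "result": execute(request)}

    def expire_stale(self):
        """Drop requests past their deadline and return their approval ids."""
        now = self._clock()
        stale = [aid for aid, (_, deadline) in self._pending.items() if now > deadline]
        for aid in stale:
            del self._pending[aid]
        return stale
```

Popping the entry on approval means each `approval_id` can be used at most once, which avoids executing a delete twice if the approve call is retried.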

Observability and Audit

Every tool call should be traceable and accountable.

Trace every tool call end-to-end. From the moment the gateway receives the request to the moment it returns a result (or error), you have one trace. Agent ID, request ID, tool name, policy decision, and outcome.

Immutable audit logs. Logs include: agent version, policy version, redacted inputs, decision (allow/deny/pending_approval), and outcome. They’re append-only so they can support compliance and forensics.

Cost and rate budgets per agent or team. The gateway can enforce “max N calls per minute” or “max $X per day” per agent or team. When a budget is exceeded, requests are rejected and logged. That gives you both safety and clear visibility.
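A "max N calls per minute" budget can be enforced with a sliding-window counter keyed by agent or team. The sketch below is one possible approach, not necessarily what the sample repository does; `SlidingWindowLimiter` is a hypothetical name, and state is kept in process memory, so a multi-instance gateway would need a shared store instead.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_calls per window_seconds for each key (agent or team)."""

    def __init__(self, max_calls, window_seconds=60.0, clock=time.monotonic):
        self._max = max_calls
        self._window = window_seconds
        self._clock = clock
        self._calls = defaultdict(deque)  # key -> timestamps of accepted calls

    def allow(self, key):
        now = self._clock()
        calls = self._calls[key]
        # Forget calls that have aged out of the window.
        while calls and now - calls[0] >= self._window:
            calls.popleft()
        if len(calls) >= self._max:
            return False  # over budget: the gateway rejects and logs the request
        calls.append(now)
        return True
```

Rejected calls are not recorded, so an agent stuck in a retry loop does not extend its own lockout. A per-day cost cap could follow the same shape, summing estimated cost per call instead of counting calls.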

# Audit log entry (redacted)
{
    "timestamp": "2026-01-30T12:00:00Z",
    "request_id": "req_abc123",
    "agent_id": "agent_xyz",
    "agent_version": "1.2.0",
    "tool": "read_file",
    "policy_decision": "allow",
    "inputs_redacted": {"path": "/workspace/doc.txt"},
    "outcome": "success",
    "duration_ms": 45,
}
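One way to produce an entry like this is to redact inputs recursively before serializing. The sketch below makes several assumptions: `SENSITIVE_KEYS`, `SSN_PATTERN`, `redact`, and `build_audit_entry` are hypothetical, the key list is deliberately short, and the only PII pattern covered is a US Social Security number in the form 123-45-6789. Real redaction rules would be broader and tested against your own data.

```python
import json
import re

# Assumed set of keys whose values are never written to the log.
SENSITIVE_KEYS = {"password", "api_key", "token", "secret", "authorization"}
# Assumed PII pattern: US SSNs written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(value):
    """Return a copy of value with sensitive keys and SSN-like strings masked."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if str(key).lower() in SENSITIVE_KEYS else redact(child)
            for key, child in value.items()
        }
    if isinstance(value, list):
        return [redact(child) for child in value]
    if isinstance(value, str):
        return SSN_PATTERN.sub("[REDACTED-SSN]", value)
    return value


def build_audit_entry(request_id, agent_id, tool, decision, inputs, outcome):
    """Serialize one audit record as a JSON line with inputs already redacted."""
    entry = {
        "request_id": request_id,
        "agent_id": agent_id,
        "tool": tool,
        "policy_decision": decision,
        "inputs_redacted": redact(inputs),
        "outcome": outcome,
    }
    return json.dumps(entry, sort_keys=True)
```

Redacting before serialization, rather than scrubbing log lines afterwards, means the raw values never reach the log writer at all.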

Reference Architecture You Can Copy

A minimal layout:

  • Agent → sends signed tool-call envelope (agent_id, version, request_id, tool, args) to the gateway.
  • Gateway → validates envelope, checks policy, applies rate/cost limits, resolves secrets, and either executes directly or queues for approval. It logs every step.
  • Tool adapters → gateway calls adapters (one per tool or per family). Adapters talk to the real systems; they receive secrets only at invocation time.
  • Optional: LLM gateway — If you also want consistent logging and routing for LLM calls, you can put an LLM gateway in front of your provider and have the agent use both: LLM gateway for model calls, Agent Gateway for tool calls.

The sample repo implements a small Tool Gateway in Python/FastAPI with a tool registry, JSON schema validation, per-tool allowlist by agent role, rate limiting, timeouts, and structured audit logs with redaction.

Checklist

Minimal controls to launch:

  • Single entry point: all tool calls go through the gateway
  • Tool registry with JSON schema validation and deny-by-default
  • Role-based allowlist (which roles can call which tools)
  • Secrets resolved by gateway; never passed from agent
  • Structured audit log (request_id, agent_id, tool, decision, outcome) with redaction
  • Rate limiting and timeouts per agent or team

Controls to add after your first incidents (you will learn fast):

  • Human approval for high-risk tools (deletes, money, privilege changes)
  • Cost budgets per agent/team
  • Staging vs prod tool separation
  • PII/sensitive-field blocking and redaction rules
  • Signed envelope (agent_id, version, request_id) so you can verify and trace

Code Samples

The GitHub repository contains a complete, runnable Agent Gateway in Python/FastAPI. It includes:

  1. Tool Gateway service — Tool registry with JSON schema validation, per-tool allowlist by agent role, rate limiting, timeouts, and structured audit logs with redaction.
  2. Policy rule example — Deny tools not on allowlist, deny requests containing forbidden fields, require approval for write actions (see policies/tool_policy.rego or the inline Python policy).
  3. Signed tool-call envelope — Agent sends agent_id, version, request_id; gateway verifies and logs them.
  4. Approval flow — Tool call enters pending state, human approves, gateway executes and returns result.

Signed envelope example

Agents send a signed envelope so the gateway can verify and log who is calling:

# Agent sends this payload (the signature itself is not shown); gateway verifies and logs
{
    "agent_id": "agent_deploy_001",
    "agent_version": "1.2.0",
    "request_id": "req_uuid_123",
    "tool": "read_file",
    "arguments": {"path": "/workspace/readme.md"},
}

The gateway checks that agent_id is allowed, looks up the role for that agent, and then runs policy and validation.
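The article does not say how the envelope is signed. One common choice is an HMAC over a canonical serialization, sketched below with a shared key per agent. Treat it as an assumption: `canonical_bytes`, `sign_envelope`, and `verify_envelope` are hypothetical helpers, and the sketch ignores key distribution and replay protection (which would need something like a timestamp or nonce check on `request_id`).

```python
import hashlib
import hmac
import json


def canonical_bytes(envelope: dict) -> bytes:
    """Serialize deterministically so signer and verifier hash the same bytes."""
    return json.dumps(envelope, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_envelope(envelope: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over the canonical envelope."""
    return hmac.new(key, canonical_bytes(envelope), hashlib.sha256).hexdigest()


def verify_envelope(envelope: dict, signature: str, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign_envelope(envelope, key)
    return hmac.compare_digest(expected, signature)
```

On the agent side, the signature would travel next to the envelope (for example in a header); the gateway looks up the key for `agent_id`, calls `verify_envelope`, and rejects the request before any policy check if verification fails. Sorting keys in `canonical_bytes` matters because two JSON encodings of the same envelope can otherwise differ byte for byte.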

Approval-required flow example

For tools marked requires_approval:

# 1. Agent sends delete_file request
# 2. Gateway returns 202 with approval_id
{"status": "pending_approval", "approval_id": "approv_xyz", "message": "Human approval required"}

# 3. Human or system POSTs to /approve with approval_id
# 4. Gateway executes tool, writes audit log, returns result to caller

You get governance without blocking every call. Automatic path for safe tools, approval path for risky ones.

Summary

Put a gateway in front of your agents. One outbound path for tool calls gives you:

  • Policy — Allowlists per role, deny-by-default registry, optional OPA-style rules
  • Secrets — Agents never see credentials; gateway injects at execution time
  • Audit — Every call traced and logged with redaction; immutable logs for compliance

Start with the minimal checklist: single entry point, registry, role allowlist, secret resolution, audit log, rate limits and timeouts. Add approval flows and budgets when you hit your first incident. You’ll learn fast—and the gateway will be the place where you enforce what you learned.
