Your AI agent should never hold its own keys

OBTO Team · Insights from the Glass Box

Give an AI agent a capability and you have handed it a credential. The agent that reads your Stripe charges needs a Stripe key. The one that files tickets needs a Jira token. The one that answers questions from your database needs a connection string. Every useful agent is, underneath, a small pile of secrets — and the moment one of them lands somewhere the model can see, no amount of prompt engineering will put it back.

Secrets management is the unglamorous half of agent security. It rarely makes the demo. But it is the line between an agent you can point at real systems and one that is a leak waiting to happen. Here is where credentials go wrong, and the single pattern that keeps them out of reach.

Where agent secrets leak

The leaks are rarely dramatic. They are ordinary shortcuts that feel fine right up until the key is somewhere you cannot pull it back from:

None of these need an attacker. A teammate sharing a trace, a screenshot pasted into a ticket, a log forwarded to a third-party dashboard — the key is out, and now you are rotating under pressure.

The one rule: the model never sees the secret

Everything good follows from a single separation. The model decides what to do; your runtime holds the authority to do it. The agent asks for a capability by name. Your server-side code attaches the credential out of band, makes the call, and returns the result. The key and the model never share a room.

Concretely, a tool is server-side code that reads its secret at runtime — not a value baked into the prompt or the client:

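// issue_refund: the model picks the tool; the runtime holds the key
const Stripe = require('stripe'); // assumes the official stripe Node package

async function issueRefund({ charge_id, amount }) {
  // `secrets` is the platform's server-side store; the value never enters the model's context
  const key = secrets.get('STRIPE_SECRET_KEY');
  // amount is in the currency's smallest unit (cents for USD)
  return new Stripe(key).refunds.create({ charge: charge_id, amount });
}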

The model emits { tool: "issue_refund", charge_id: "ch_123", amount: 500 } and nothing more. It never sees STRIPE_SECRET_KEY, cannot print it, and cannot be talked into leaking it — because it was never holding it. This is the same discipline behind giving an agent scoped database tools instead of raw SQL: the capability is real, the credential stays yours.
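
To make the separation concrete, here is a minimal sketch of the runtime side. It is an illustration under stated assumptions, not OBTO's API: `TOOLS`, `dispatch`, the in-memory `secrets` map, and the stubbed refund handler are all hypothetical names. The model's message is treated as untrusted data. The runtime looks the tool up by name, resolves the credential itself, and hands back only the result.

```javascript
// Hypothetical in-memory secret store; a real one would be an encrypted vault.
const secrets = new Map([['STRIPE_SECRET_KEY', 'sk_test_placeholder']]);

// Registry of tools the model may request by name.
// Each entry declares which secret it needs and how to run with it.
const TOOLS = {
  issue_refund: {
    secretName: 'STRIPE_SECRET_KEY',
    // Stub handler: stands in for the real payment API call.
    run: (key, { charge_id, amount }) => {
      if (!key) throw new Error('missing credential');
      return { refunded: charge_id, amount };
    },
  },
};

// The model's message carries a tool name and arguments, never a credential.
function dispatch(modelMessage) {
  const { tool, ...args } = modelMessage;
  const known = Object.prototype.hasOwnProperty.call(TOOLS, tool);
  if (!known) throw new Error(`unknown tool: ${tool}`);
  const entry = TOOLS[tool];
  const key = secrets.get(entry.secretName); // resolved server-side
  return entry.run(key, args);               // only the result goes back
}

const result = dispatch({ tool: 'issue_refund', charge_id: 'ch_123', amount: 500 });
console.log(JSON.stringify(result));
```

Nothing the model can emit changes which key is used, because the mapping from tool to secret lives in the registry rather than in the conversation. A tool name outside the registry is rejected before any credential is read.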

Four practices that keep keys out of reach

  1. Store secrets in a server-side vault. Encrypted at rest, read at runtime by the tool that needs them. Not in the repo, not in the client build, not in an environment variable that ends up in a screenshot. The code references a name; the platform resolves the value.
  2. Scope each key to one job. The refund tool's key should issue refunds, not read your whole customer list. Provision a narrow credential per tool so a single leaked key has a small blast radius — the same least-privilege logic as a read-only database role.
  3. Rotate without redeploying. Keys get exposed: a departing contractor, a logged header, a vendor breach. If rotating means a code change and a deploy, you will do it late. Keep the secret as a stored value you can swap in seconds; because the tool reads it at call time, the next call picks up the new key.
  4. Log the use, not the value. You want a record that issue_refund ran with these arguments and cost this much. You never want the key in that record. Redact credentials at the boundary so they cannot reach your logs, your traces, or a teammate's screen.
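
The practices above can be sketched together in a few lines. This is a toy under stated assumptions: `SecretStore`, its `set`, `get`, and `redact` methods, and `callPaymentApi` are hypothetical names, and the values live in memory rather than in an encrypted vault. It shows a secret granted to a single tool, a rotation that needs no redeploy because the tool reads the value on every call, and redaction applied before anything is logged.

```javascript
// Hypothetical secret store: tools hold a name, the store holds the value.
class SecretStore {
  constructor() {
    this.values = new Map(); // secret name -> current value
    this.grants = new Map(); // secret name -> the one tool allowed to read it
  }

  // Used for first storage and for rotation; a rotation keeps the existing grant.
  set(name, value, tool) {
    if (!value) throw new Error('secret value must be non-empty');
    this.values.set(name, value);
    if (tool) this.grants.set(name, tool);
  }

  // Only the granted tool may read the secret (one key, one job).
  get(name, tool) {
    if (!this.values.has(name) || this.grants.get(name) !== tool) {
      throw new Error(`${tool} may not read ${name}`);
    }
    return this.values.get(name);
  }

  // Replace every currently stored secret value in a string before it is logged.
  redact(text) {
    let out = text;
    for (const [name, value] of this.values) {
      out = out.split(value).join(`[${name} redacted]`);
    }
    return out;
  }
}

const store = new SecretStore();
store.set('STRIPE_SECRET_KEY', 'sk_test_old', 'issue_refund');

// The tool reads the secret on every call instead of caching it at startup.
function callPaymentApi() {
  const key = store.get('STRIPE_SECRET_KEY', 'issue_refund');
  return `Authorization: Bearer ${key}`; // stands in for a real request header
}

// Redact at the boundary, at the moment the line is written.
const line1 = store.redact(callPaymentApi());

store.set('STRIPE_SECRET_KEY', 'sk_test_new'); // rotate in place, no redeploy
const headerAfter = callPaymentApi();          // already uses the new key
const line2 = store.redact(headerAfter);

console.log(line1);
console.log(line2);
```

One caveat the sketch makes visible: `redact` only knows the values currently stored, so redaction has to happen when the line is written. A header captured before a rotation and logged afterward would slip through unless the store also remembers retired values.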

The detail most tutorials skip

Ask where the key actually lives and most guides go quiet. If your framework expects STRIPE_SECRET_KEY in an environment variable sitting next to the model loop, you have already coupled your credentials to your inference. Prying them back apart later is real work.

On OBTO a tool is server-side code with its own secret store. The credential is encrypted at rest and resolved at runtime by the tool, not the model. The agent calls issue_refund; the platform injects the key, runs the call, and returns the result. The secret never travels with the conversation, never reaches the client, and never enters the model's context. Because OBTO is model-agnostic, that handling holds whether the agent runs on Claude, GPT, or whatever your team prefers, so you are not re-solving secrets every time you switch models. If you are wiring your first tool, the guide to building an MCP tool shows the tool shape these secrets plug into.

What you can see afterward

Because every call runs through a tool, every call leaves a trace — without the secret in it. OBTO's Glass Receipt records each one as a line you can query:

{
  "tool": "issue_refund",
  "params": { "charge_id": "ch_123", "amount": 500 },
  "secret": "STRIPE_SECRET_KEY (redacted)",
  "ms": 240,
  "cost": { "total": 0.0011 }
}

You can prove which tool used which credential, when, and to what effect — without ever storing the credential itself. That trail is the same plumbing we cover in our piece on agent observability: the system that keeps secrets safe is the one that makes their use accountable.
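
As a rough illustration of what "a line you can query" buys you, the sketch below builds records in the shape of the sample receipt and answers one question: which tools used a given credential, how often, and at what cost. How OBTO actually stores or queries Glass Receipts is not described here, so `makeReceipt`, `usageBySecret`, the plain array, and every record other than the first are made-up examples.

```javascript
// Build a record in the shape of the sample receipt: the secret appears by name only.
function makeReceipt({ tool, params, secretName, ms, cost }) {
  return { tool, params, secret: `${secretName} (redacted)`, ms, cost: { total: cost } };
}

// Illustrative data; only the first record mirrors the sample receipt.
const receipts = [
  makeReceipt({ tool: 'issue_refund', params: { charge_id: 'ch_123', amount: 500 },
                secretName: 'STRIPE_SECRET_KEY', ms: 240, cost: 0.0011 }),
  makeReceipt({ tool: 'issue_refund', params: { charge_id: 'ch_456', amount: 1200 },
                secretName: 'STRIPE_SECRET_KEY', ms: 310, cost: 0.0011 }),
  makeReceipt({ tool: 'lookup_customer', params: { customer_id: 'cus_789' },
                secretName: 'CRM_READ_TOKEN', ms: 95, cost: 0.0004 }),
];

// Summarize calls and cost per tool for one credential, using its name only.
function usageBySecret(records, secretName) {
  const summary = {};
  for (const r of records) {
    if (r.secret !== `${secretName} (redacted)`) continue;
    if (!summary[r.tool]) summary[r.tool] = { calls: 0, total: 0 };
    summary[r.tool].calls += 1;
    summary[r.tool].total += r.cost.total;
  }
  return summary;
}

const stripeUsage = usageBySecret(receipts, 'STRIPE_SECRET_KEY');
console.log(stripeUsage.issue_refund.calls, stripeUsage.issue_refund.total.toFixed(4));
```

The audit works entirely on secret names. At no point does the query, or the data it reads, need the credential's value.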

Doing it on OBTO

You describe the tool, ship it, and own it — the code, the secret store, and the receipt all stay on infrastructure you control, whether that is our cloud or your own cluster. And because pricing is per application rather than per seat, adding teammates to an agent that touches a paid API does not inflate the bill.

The getting-started guide takes about ten minutes, and the free Builder tier includes Glass Box tracing — enough to wire a real key into a real tool and watch the receipts come back with the secret redacted. An agent does not need to hold the keys to do the job. It just needs to know which door to ask you to open.

Frequently asked questions

Where should I store API keys for an AI agent?

Server-side, in an encrypted secret store the tool reads at runtime — never in the prompt, the client bundle, or an environment variable baked next to the model loop. The agent references the secret by name; the platform resolves the value when the tool runs.

Should an AI agent ever see its own API keys?

No. The model should decide which tool to call; your server-side runtime should hold the credential and attach it out of band. If the model never holds the key, it cannot leak it — in a prompt, a log, or a reply.

How do I stop an AI agent from leaking secrets in its logs or prompts?

Keep credentials out of the model's context entirely, and redact them at the logging boundary so headers and connection strings never reach your traces. A secret resolved server-side and stripped from logs has no path into a transcript.

How often should I rotate secrets used by AI agents?

On a regular schedule, and immediately after any suspected exposure. The practical enabler is rotation that needs no redeploy: if you can swap the stored value in seconds while the tool keeps reading the current one, you will rotate when it actually matters.

Can I give an AI agent access to a paid API without risking my bill?

Yes, by scoping the key to a narrow capability, capping what the tool can do, and logging every call with its cost. A per-call receipt lets you see spend as it happens rather than discovering it on the invoice.
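
One way to cap what a tool can spend is a small budget guard around it. The sketch below is an assumption-laden example, not a platform feature: `withBudget`, the per-call cost estimate, and the cap are hypothetical, and costs are tracked as integer micro-dollars to avoid floating-point drift.

```javascript
// Wrap a tool so it refuses to run once a spend cap would be exceeded.
function withBudget(toolName, run, capMicros, estimateMicros) {
  let spentMicros = 0;
  const receipts = [];

  function call(params) {
    const cost = estimateMicros(params);
    // Check before running, so a call that would break the cap never happens.
    if (spentMicros + cost > capMicros) {
      throw new Error(`${toolName}: budget of ${capMicros} micro-dollars exceeded`);
    }
    const result = run(params);
    spentMicros += cost;
    // Record the use and its cost, never a credential.
    receipts.push({ tool: toolName, params, cost: { total: cost / 1e6 } });
    return result;
  }

  return { call, receipts, spent: () => spentMicros };
}

// Stub tool and a flat hypothetical cost of 1100 micro-dollars per call.
const refunds = withBudget(
  'issue_refund',
  ({ charge_id }) => ({ refunded: charge_id }),
  2500,
  () => 1100
);

refunds.call({ charge_id: 'ch_123', amount: 500 });
refunds.call({ charge_id: 'ch_456', amount: 1200 });

let blocked = false;
try {
  refunds.call({ charge_id: 'ch_789', amount: 300 }); // would bring spend to 3300
} catch (err) {
  blocked = true;
}
console.log(refunds.spent(), refunds.receipts.length, blocked);
```

The third call is refused before it reaches the paid API, and the receipts show exactly which calls consumed the budget.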

Keep the keys server-side

Build a scoped tool, store its secret in the vault, and get a receipt for every call — with the credential redacted, on day one.
