The spend firewall for LLM agents

Reserve budget before the provider is called, sign every decision, and stop runaway agents with a p50 decision latency of ≤10 ms.

How it works

Reserve

Atomic per-tenant ledger debit before the provider call.

Commit

Read response.usage, commit the real spend, and refund the unused part of the reservation.

Audit

KMS-signed CloudEvent for every reserve / commit / reject.

Features

Pre-call reservations

Atomic budget debit before the provider is hit. Fail-closed when exhausted.

Signed audit

Every decision is a KMS-signed CloudEvent landing in your SIEM.

Multi-tenant isolation

Per-tenant ledgers. One runaway agent cannot drain another tenant.

Stripe-style auth/capture

Reserve the worst case, commit the real spend, and refund the unused part of the reservation.

p50 ≤10ms decisions

Measured per SLO contract NF1, not aspirational.

Framework-agnostic

Adapters for LiteLLM, OpenAI Agents SDK, LangChain, LangGraph, Pydantic-AI, Microsoft AGT.
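To make the signed-audit feature concrete, here is a minimal sketch of a tamper-evident decision event. It assumes a local HMAC key as a stand-in for a KMS-held key, and the source and type values are hypothetical; only the attribute names (specversion, id, source, type, time, datacontenttype, data) come from the CloudEvents 1.0 specification. The product's actual event schema and signing scheme may differ.

```python
import hashlib
import hmac
import json
import uuid
from datetime import datetime, timezone

# Assumption: a local demo key stands in for a key held in a KMS.
SIGNING_KEY = b"demo-key-not-for-production"


def make_event(decision, tenant, amount):
    """Build a CloudEvents-1.0-shaped dict for a reserve/commit/reject decision."""
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": "/spendguard/demo",             # hypothetical source
        "type": f"com.example.spend.{decision}",  # hypothetical type name
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": {"tenant": tenant, "amount": amount},
    }


def sign(event):
    # Serialize canonically so the same event always yields the same bytes.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()


def verify(event, signature):
    # Constant-time comparison avoids leaking signature bytes via timing.
    return hmac.compare_digest(sign(event), signature)


event = make_event("reserve", "acme", 400)
signature = sign(event)
```

A consumer such as a SIEM pipeline would call verify(event, signature) before trusting the record; any change to the event after signing makes verification fail.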

Drop it into LiteLLM

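import asyncio

import litellm
from spendguard.litellm import enforce_budget

# Register the budget check as a LiteLLM callback for tenant "acme".
litellm.callbacks = [enforce_budget(tenant="acme")]


async def main() -> None:
    response = await litellm.acompletion(
        model="gpt-4o",
        messages=[{"role": "user", "content": "..."}],
    )
    # If the budget is exhausted, the call is rejected with
    # HTTP 403 BUDGET_EXHAUSTED and the provider is never called.


asyncio.run(main())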

Use cases

Multi-tenant
Compliance
SLO
Audit
Cost
Egress

Deployments

Your logo

Working on a deployment? Tell us.