Pydantic-AI budget control with SpendGuard
Your Pydantic-AI agent calls agent.run("..."), and the run loop dispatches Model.request() repeatedly: once per step, once per retry, and once per iteration of a multi-step tool loop. Without a gate, every iteration is a free shot at the provider. SpendGuard wraps the model so each request() reserves against a budget before the upstream LLM call ships.
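
The reserve-before-call idea can be sketched without the SDK. This is a minimal in-memory illustration under stated assumptions, not SpendGuard's implementation: BudgetGate, BudgetExceeded, and guarded_call are hypothetical names, and in the real setup the reservation is made against the sidecar rather than a local counter.

```python
from dataclasses import dataclass
from typing import Callable, TypeVar

T = TypeVar("T")


class BudgetExceeded(Exception):
    """Hypothetical error: the reservation would exceed the budget."""


@dataclass
class BudgetGate:
    """Hypothetical in-memory budget; amounts are in USD micros."""

    limit_micros: int
    reserved_micros: int = 0

    def reserve(self, amount_micros: int) -> None:
        # Refuse the reservation before any money can be spent.
        if self.reserved_micros + amount_micros > self.limit_micros:
            raise BudgetExceeded(
                f"need {amount_micros}, "
                f"only {self.limit_micros - self.reserved_micros} left"
            )
        self.reserved_micros += amount_micros


def guarded_call(gate: BudgetGate, amount_micros: int, upstream: Callable[[], T]) -> T:
    # Reserve first; if this raises, the upstream callable is never invoked.
    gate.reserve(amount_micros)
    return upstream()
```

With a 2 USD limit and 1 USD reservations, the first two calls go through and the third raises before the upstream callable runs.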
Why you’d want this
- Pre-call enforcement, not post-hoc dashboards. Reservation happens before the OpenAI/Anthropic call. Over-budget calls raise DecisionStopped and the upstream request never goes out.
- Retry-safe idempotency. Pydantic-AI re-enters request() on transient errors. SpendGuard derives a stable idempotency_key from messages + settings + run_id, so the retry collapses onto the original reservation instead of allocating a new one.
- Tool loops stay budgeted. Multi-step tool-using agents are gated on every model call, including steps spawned by tool output.
- Audit trail. Every decision (allow / stop / require_approval / degrade) is signed and chained for post-hoc analysis.
- Human-in-the-loop approval. Pause and resume with await e.resume(client) when a contract fires REQUIRE_APPROVAL.
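
To see why a key derived from messages + settings + run_id makes retries collapse, here is a minimal sketch. It assumes SHA-256 over canonical JSON; the SDK's actual derivation is not described here, and the function name and input shapes are hypothetical.

```python
import hashlib
import json


def idempotency_key(run_id: str, messages: list[dict], settings: dict) -> str:
    """Hypothetical derivation: identical inputs always yield the same key."""
    # Canonical JSON (sorted keys, fixed separators) so that dict ordering
    # differences between the first attempt and a retry do not change the hash.
    payload = json.dumps(
        {"run_id": run_id, "messages": messages, "settings": settings},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

A retry of the same request produces the same key and can be matched to the existing reservation; a different run_id or a changed message produces a different key.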
Setup (60 seconds)
Install the SDK with the Pydantic-AI extra:

    pip install 'spendguard-sdk[pydantic-ai]'

You also need a running SpendGuard sidecar reachable on a Unix Domain Socket. The fastest path is the demo stack:

    git clone https://github.com/m24927605/agentic-spendguard.git
    cd agentic-spendguard && make demo-up

The demo binds the sidecar UDS at deploy/demo/runtime/uds/adapter.sock.
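
Before wiring the agent, it can help to confirm that something is listening on the socket. The check below is a standard-library sketch, not part of the SpendGuard SDK; uds_reachable is a hypothetical helper, and it assumes the sidecar listens on a stream-type Unix socket.

```python
import socket


def uds_reachable(path: str, timeout: float = 1.0) -> bool:
    """Return True if a listener accepts connections on the Unix socket at path."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(path)
        except OSError:
            # Missing file, refused connection, and timeout all land here.
            return False
    return True
```

For the demo stack you would call it with deploy/demo/runtime/uds/adapter.sock, relative to the repository root.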
Wire it up
    import asyncio

    from pydantic_ai import Agent
    from pydantic_ai.models.openai import OpenAIModel

    from spendguard import SpendGuardClient, new_uuid7
    from spendguard.integrations.pydantic_ai import (
        RunContext,
        SpendGuardModel,
        run_context,
    )
    from spendguard._proto.spendguard.common.v1 import common_pb2


    async def main() -> None:
        client = SpendGuardClient(
            socket_path="/var/run/spendguard/adapter.sock",
            tenant_id="00000000-0000-4000-8000-000000000001",
        )
        await client.connect()
        await client.handshake()

        guarded = SpendGuardModel(
            inner=OpenAIModel("gpt-4o-mini"),
            client=client,
            budget_id="my-budget",
            window_instance_id="my-window",
            unit=common_pb2.UnitRef(
                unit_id="usd_micros",
                token_kind="usd_micros",
                model_family="gpt-4",
            ),
            pricing=common_pb2.PricingFreeze(pricing_version="2025-q4"),
            claim_estimator=lambda messages, settings: [
                common_pb2.BudgetClaim(
                    budget_id="my-budget",
                    window_instance_id="my-window",
                    amount_micros=1_000_000,  # 1 USD reservation per call
                )
            ],
        )

        agent = Agent(model=guarded)
        async with run_context(RunContext(run_id=str(new_uuid7()))):
            result = await agent.run("Hello")
            print(result.output)


    asyncio.run(main())

What you get
- Pre-call budget reservation. The wrapped model raises DecisionStopped instead of calling the LLM when the reservation would exceed the budget.
- Signed audit chain. Every decision is recorded in the ledger with a cryptographic signature; replay-safe via the audit_outbox transactional pattern.
- Approval continuation. When a contract fires REQUIRE_APPROVAL, the exception carries e.resume(client); call it after an operator approves in the dashboard.
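
The "signed and chained" property can be illustrated with a small hash chain. This sketch substitutes HMAC-SHA256 with a shared key as the signature, which is an assumption: the page does not say which signature scheme or ledger layout SpendGuard uses, and append_entry, verify_chain, and GENESIS are hypothetical names.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous signature" for the first entry


def _sign(prev: str, decision: dict, key: bytes) -> str:
    # The signature covers the previous signature, which links the entries.
    body = json.dumps({"prev": prev, "decision": decision}, sort_keys=True)
    return hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()


def append_entry(chain: list[dict], decision: dict, key: bytes) -> dict:
    prev = chain[-1]["signature"] if chain else GENESIS
    entry = {"prev": prev, "decision": decision, "signature": _sign(prev, decision, key)}
    chain.append(entry)
    return entry


def verify_chain(chain: list[dict], key: bytes) -> bool:
    prev = GENESIS
    for entry in chain:
        expected = _sign(prev, entry["decision"], key)
        if entry["prev"] != prev or not hmac.compare_digest(entry["signature"], expected):
            return False
        prev = entry["signature"]
    return True
```

Because each signature covers the one before it, editing or removing an earlier decision invalidates every later entry during verification.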
Common patterns
Per-tenant budgets
Pass distinct budget_id / window_instance_id values per tenant.
The control plane API (POST /v1/budgets) provisions budgets
without restarting the agent.
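
One way to keep the per-tenant values in one place is a small lookup that your agent factory consults. This is a sketch under assumptions: TenantBudget, TENANT_BUDGETS, budget_for, the tenant names, and the ID naming scheme are all hypothetical, and in practice the IDs would match budgets already provisioned through the control plane.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantBudget:
    budget_id: str
    window_instance_id: str


# Hypothetical registry; real IDs come from whatever you provisioned.
TENANT_BUDGETS = {
    "acme": TenantBudget("acme-budget", "acme-window"),
    "globex": TenantBudget("globex-budget", "globex-window"),
}


def budget_for(tenant: str) -> TenantBudget:
    """Look up the budget pair for a tenant, failing loudly if none exists."""
    try:
        return TENANT_BUDGETS[tenant]
    except KeyError:
        raise LookupError(f"no budget provisioned for tenant {tenant!r}") from None
```

The returned budget_id and window_instance_id are the values you would pass to SpendGuardModel and to each BudgetClaim when building that tenant's agent.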
Handling approvals
    from spendguard import ApprovalRequired

    try:
        result = await agent.run(prompt)
    except ApprovalRequired as e:
        # wait_for_operator_approval is an application-defined helper (not shown).
        await wait_for_operator_approval(e.decision_id)
        result = await e.resume(client)

Testing without burning tokens
Replace OpenAIModel with pydantic_ai.models.test.TestModel. The
SpendGuard wrapper still records reservations and decisions, so you
can unit-test budget logic without provider keys.
Related
- Quickstart: full stack up in 5 minutes
- Contract DSL reference: author allow/stop rules
- Other integrations: LangChain & LangGraph · OpenAI Agents SDK · Microsoft AGT