LangChain & LangGraph budget control with SpendGuard¶
Your LangChain agent's tool loop just hit the retry path on a flaky OpenAI response. Each ChatOpenAI.invoke() ships another request to the provider. Without a gate, you find out the cost on next month's invoice. SpendGuard wraps the BaseChatModel so every invocation reserves against a budget before the upstream call goes out. The same wrapper works for LangGraph, since LangGraph's prebuilt agents accept any BaseChatModel.
Why you'd want this¶
- One wrapper, both frameworks. SpendGuardChatModel is a drop-in BaseChatModel subclass: pass it to create_react_agent, any RunnableSequence, or a custom LangGraph node without other changes.
- Tool calls and structured output preserved. bind_tools() and with_structured_output() forward to the inner model, so Pydantic-typed outputs and function calling continue to work.
- Pre-call refusal, not post-hoc accounting. Over-budget calls raise inside invoke()/ainvoke(), so the chain halts before any token is spent.
- Audit + approval pipeline shared with every other framework. The wrapper writes to the same SpendGuard ledger as the Pydantic-AI and OpenAI-Agents integrations, so a multi-framework agent fleet gets a single decision log.
Setup (60 seconds)¶
Bring up a sidecar via the demo stack:
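The exact command depends on how you run the demo stack; as a sketch, assuming it ships a docker-compose file (the demo/docker-compose.yml path here is an assumption, not confirmed by the SpendGuard docs):

```shell
# Assumption: the demo stack is a docker-compose file at demo/docker-compose.yml.
docker compose -f demo/docker-compose.yml up -d

# The sidecar should then expose the Unix domain socket the client connects to:
#   /var/run/spendguard/adapter.sock
```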
Wire it up¶
import asyncio

from langchain_core.messages import HumanMessage
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

from spendguard import SpendGuardClient, new_uuid7
from spendguard.integrations.langchain import (
    RunContext, SpendGuardChatModel, run_context,
)
from spendguard._proto.spendguard.common.v1 import common_pb2


@tool
def my_tool(query: str) -> str:
    """Placeholder tool so create_react_agent has something to bind."""
    return f"result for {query}"


async def main() -> None:
    client = SpendGuardClient(
        socket_path="/var/run/spendguard/adapter.sock",
        tenant_id="00000000-0000-4000-8000-000000000001",
    )
    await client.connect()
    await client.handshake()

    guarded = SpendGuardChatModel(
        inner=ChatOpenAI(model="gpt-4o-mini"),
        client=client,
        budget_id="my-budget",
        window_instance_id="my-window",
        unit=common_pb2.UnitRef(
            unit_id="usd_micros",
            token_kind="usd_micros",
            model_family="gpt-4",
        ),
        pricing=common_pb2.PricingFreeze(pricing_version="2025-q4"),
        claim_estimator=lambda messages: [
            common_pb2.BudgetClaim(
                budget_id="my-budget",
                window_instance_id="my-window",
                amount_micros=1_000_000,
            )
        ],
    )

    # LangGraph: works because SpendGuardChatModel forwards bind_tools()
    agent = create_react_agent(guarded, tools=[my_tool])

    async with run_context(RunContext(run_id=str(new_uuid7()))):
        result = await agent.ainvoke(
            {"messages": [HumanMessage(content="Hello")]}
        )
        print(result["messages"][-1].content)


asyncio.run(main())
What you get¶
- Pre-call budget reservation on every ChatOpenAI.invoke() / ainvoke() / streaming call.
- Same wrapper covers LangGraph. create_react_agent, custom StateGraph nodes, and tool loops all flow through one gate.
- Structured output and tool calls preserved. The wrapper's bind_tools() and with_structured_output() are passthroughs.
Common patterns¶
Per-tool budget granularity¶
Configure separate budget_ids per tool. The claim_estimator
inspects the inbound messages and picks the right budget claim —
a big-context summarization tool gets a different reservation than
a calculator tool.
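A minimal sketch of such an estimator, using plain dicts in place of common_pb2.BudgetClaim so the routing logic stands alone. The budget names and the character-count heuristic are illustrative assumptions, not SpendGuard API:

```python
# Hypothetical per-tool claim estimator. A real estimator would return
# common_pb2.BudgetClaim messages; dicts stand in for them here.
def estimate_claims(messages):
    # Rough proxy: a large inbound prompt means the big-context
    # summarization tool is about to run.
    total_chars = sum(len(getattr(m, "content", str(m))) for m in messages)
    if total_chars > 4_000:
        # Summarization calls reserve more, against their own budget.
        return [{"budget_id": "summarize-budget", "amount_micros": 5_000_000}]
    # Everything else (e.g. a calculator tool) gets a small reservation.
    return [{"budget_id": "calculator-budget", "amount_micros": 100_000}]
```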
LangGraph subgraphs and sub-agents¶
Each subgraph that constructs its own model instance needs its own
SpendGuardChatModel wrapper. Share one SpendGuardClient across
all of them — the client is thread-safe and pools the UDS
connection.
Streaming responses¶
Streamed token deltas don't change the reservation flow; the
reservation is made before the stream opens. Use the wrapper's
astream() for async streaming.
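The ordering can be sketched as a generic wrapper (guarded_stream and its arguments are illustrative, not SpendGuard API): the reservation runs before the first chunk is forwarded, so an over-budget stream fails with nothing sent.

```python
# Hypothetical sketch of the pre-stream reservation ordering: reserve()
# runs before any chunk is yielded, so an over-budget call raises before
# a single token streams.
async def guarded_stream(reserve, inner_stream):
    reserve()  # raises if the budget claim is refused; nothing streamed yet
    async for chunk in inner_stream:
        yield chunk
```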
Related¶
- Quickstart — full stack up in 5 minutes
- Contract DSL reference — author allow/stop rules
- Other integrations: Pydantic-AI · OpenAI Agents SDK · Microsoft AGT