
PydanticAI: Advanced Error Handling & Testing

Verified against pydantic-ai==1.101.0 — source modules: pydantic_ai.exceptions, pydantic_ai.usage, pydantic_ai.concurrency, pydantic_ai.models.function, pydantic_ai.agent.

PydanticAI exposes a small, predictable exception hierarchy. Understanding each exception type — when it is raised, how to catch it, and how to test that your agent handles it correctly — is essential for production-grade agents.


Exception
└── AgentRunError                # base for all errors that occur during agent.run*()
    ├── UsageLimitExceeded       # token / request / tool-call budget exhausted
    ├── UnexpectedModelBehavior  # model produced structurally invalid output
    └── ModelAPIError
        └── ModelHTTPError       # non-2xx HTTP from provider (has .status_code)
ModelRetry                       # not an error: a signal to retry (raised inside tools/validators)
UserError                        # programming error (bad arguments, unsupported combination)
ConcurrencyLimitExceeded         # queue depth exceeded when using ConcurrencyLimiter
ApprovalRequired                 # tool call needs HITL approval before executing
FallbackExceptionGroup           # all FallbackModel candidates failed

ModelRetry(message) is raised inside a tool or output validator to tell PydanticAI “the model’s output or arguments are wrong; send this feedback and retry”. Although it is implemented as an exception class, it works as a control-flow signal: the framework catches it before it reaches your caller, as long as the retry budget is not exhausted.

from pydantic_ai import Agent, ModelRetry, RunContext

agent = Agent('openai:gpt-4o')

@agent.tool
async def lookup_user(ctx: RunContext[None], user_id: int) -> dict:
    if user_id <= 0:
        raise ModelRetry(f'user_id must be positive, got {user_id!r}')
    return {'id': user_id, 'name': 'Alice'}

result = agent.run_sync('Look up user -1, then user 42.')
print(result.output)

If the model keeps producing bad arguments after the configured number of retries (the retries setting on the agent or the tool), PydanticAI raises UnexpectedModelBehavior.
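To make the retry budget concrete, here is a minimal standard-library sketch of the control flow, not PydanticAI's actual implementation. ToyModelRetry, ToyUnexpectedModelBehavior and call_with_retries are hypothetical stand-ins invented for illustration, and the list of candidate arguments plays the role of the model.

```python
class ToyModelRetry(Exception):
    """Hypothetical stand-in for pydantic_ai.ModelRetry."""


class ToyUnexpectedModelBehavior(Exception):
    """Hypothetical stand-in for UnexpectedModelBehavior."""


def call_with_retries(tool, candidate_args, retries):
    """Try each proposed argument in turn, allowing at most `retries` retries.

    Returns the tool result plus the feedback messages that would have been
    sent back to the model along the way.
    """
    feedback = []
    for attempt, arg in enumerate(candidate_args):
        try:
            return tool(arg), feedback
        except ToyModelRetry as exc:
            feedback.append(str(exc))  # the framework would send this to the model
            if attempt >= retries:
                raise ToyUnexpectedModelBehavior(
                    f'exceeded maximum retries ({retries})'
                ) from exc
    raise ToyUnexpectedModelBehavior('model stopped proposing arguments')


def lookup_user(user_id):
    if user_id <= 0:
        raise ToyModelRetry(f'user_id must be positive, got {user_id!r}')
    return {'id': user_id, 'name': 'Alice'}


# One bad attempt followed by a good one fits inside a budget of one retry.
result, feedback = call_with_retries(lookup_user, [-1, 42], retries=1)
print(result, feedback)
```

With two bad attempts in a row and retries=1, the same helper raises ToyUnexpectedModelBehavior, which is the situation the caller-facing exception describes.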

from pydantic import BaseModel
from pydantic_ai import Agent, ModelRetry, RunContext

class Summary(BaseModel):
    title: str
    bullets: list[str]
    word_count: int

agent = Agent('openai:gpt-4o', output_type=Summary)

@agent.output_validator
async def validate_bullets(ctx: RunContext[None], out: Summary) -> Summary:
    if len(out.bullets) < 3:
        raise ModelRetry(
            f'Need at least 3 bullet points, got {len(out.bullets)}. '
            'Please add more detail.'
        )
    if out.word_count <= 0:
        raise ModelRetry('word_count must be positive.')
    return out

result = agent.run_sync('Summarise the Pydantic AI framework.')
print(result.output)

UnexpectedModelBehavior — model misbehaved


Raised when:

  • The model returns output that cannot be parsed or validated after exhausting all retries.
  • The model emits an unrecognised response structure.
  • A tool-call loop exceeds the retry budget.

from pydantic import BaseModel
from pydantic_ai import Agent, capture_run_messages
from pydantic_ai.exceptions import UnexpectedModelBehavior
from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart
from pydantic_ai.models.function import FunctionModel, AgentInfo

# Simulate a model that always returns invalid JSON for a structured output
def broken_model(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
    return ModelResponse(parts=[TextPart(content='THIS IS NOT VALID JSON')])

class Answer(BaseModel):
    value: int

agent = Agent(FunctionModel(broken_model), output_type=Answer)

with capture_run_messages() as msgs:
    try:
        agent.run_sync('What is 2 + 2?')
    except UnexpectedModelBehavior as e:
        print('Model misbehaved:', e)
        print('Messages up to failure:', len(msgs))

Catching UnexpectedModelBehavior at the call site

import logging

from pydantic_ai import Agent
from pydantic_ai.exceptions import UnexpectedModelBehavior

logger = logging.getLogger(__name__)
agent = Agent('openai:gpt-4o')

async def safe_run(prompt: str) -> str | None:
    try:
        result = await agent.run(prompt)
        return result.output
    except UnexpectedModelBehavior as e:
        logger.error('agent_misbehaved', exc_info=e, extra={'prompt': prompt[:200]})
        return None  # or raise a domain-specific error

UsageLimits is a dataclass you pass to agent.run*(usage_limits=...). When any limit fires, UsageLimitExceeded (a subclass of AgentRunError) is raised.

from pydantic_ai import Agent, UsageLimits
from pydantic_ai.exceptions import UsageLimitExceeded

agent = Agent('openai:gpt-4o')

try:
    result = agent.run_sync(
        'Write a 10 000-word essay on the history of computing.',
        usage_limits=UsageLimits(
            output_tokens_limit=500,  # hard cap on output tokens
            request_limit=3,          # max LLM round-trips
        ),
    )
except UsageLimitExceeded as e:
    print('Limit hit:', e)

from pydantic_ai import UsageLimits

limits = UsageLimits(
    request_limit=50,                  # max number of model requests (default 50)
    tool_calls_limit=20,               # max successful tool executions
    input_tokens_limit=8_000,          # max prompt tokens per run
    output_tokens_limit=2_000,         # max generated tokens per run
    total_tokens_limit=10_000,         # combined input + output cap
    count_tokens_before_request=True,  # pre-check tokens (Anthropic, Google, Bedrock)
)

from pydantic_ai import Agent, RunUsage

agent = Agent('openai:gpt-4o')
result = agent.run_sync('Hello')
usage: RunUsage = result.usage()
print(f'requests={usage.requests} input={usage.input_tokens} output={usage.output_tokens} total={usage.total_tokens}')

from pydantic_ai import Agent, RunUsage

agent = Agent('openai:gpt-4o')
shared_usage = RunUsage()

# Passing the same RunUsage object to every run accumulates the counters
for prompt in ['One', 'Two', 'Three']:
    result = agent.run_sync(prompt, usage=shared_usage)

print('Grand total tokens:', shared_usage.total_tokens)

UserError means you (the developer) passed invalid arguments or combined options that are not supported together. It is typically raised at agent construction or at the start of a run rather than mid-stream.

from pydantic_ai import Agent, RunContext
from pydantic_ai.exceptions import UserError

try:
    agent = Agent('openai:gpt-4o', output_type=int)

    @agent.output_validator
    def validate_int(ctx: RunContext[None], out: int) -> int:
        return out

    # Overriding output_type when an output_validator is already registered
    # raises UserError because the validator's type would no longer match.
    agent.run_sync('hi', output_type=str)
except UserError as e:
    print('Developer error:', e)

In tests, UserError is an expected result of misconfigured agents — assert on it rather than catching it silently.


ConcurrencyLimitExceeded — queue depth exceeded


When you use ConcurrencyLimiter(max_running=N, max_queued=M) and more than N + M tasks arrive simultaneously, ConcurrencyLimitExceeded is raised.

import asyncio

from pydantic_ai import Agent, ConcurrencyLimiter
from pydantic_ai.exceptions import ConcurrencyLimitExceeded

agent = Agent(
    'openai:gpt-4o',
    max_concurrency=ConcurrencyLimiter(max_running=2, max_queued=3, name='demo'),
)

async def main():
    tasks = [agent.run(f'Task {i}') for i in range(10)]
    results = await asyncio.gather(*tasks, return_exceptions=True)
    for i, r in enumerate(results):
        if isinstance(r, ConcurrencyLimitExceeded):
            print(f'Task {i}: queue full, rejected')
        else:
            print(f'Task {i}: ok')

asyncio.run(main())
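The N + M arithmetic can be checked without calling a model. The sketch below is a toy limiter built only on asyncio; ToyLimiter and QueueFull are hypothetical stand-ins written for this illustration and are not how ConcurrencyLimiter is implemented. With max_running=2 and max_queued=3, ten simultaneous jobs should split into five accepted and five rejected.

```python
import asyncio


class QueueFull(Exception):
    """Hypothetical stand-in for ConcurrencyLimitExceeded."""


class ToyLimiter:
    def __init__(self, max_running, max_queued):
        self._slots = asyncio.Semaphore(max_running)
        self._max_queued = max_queued
        self._queued = 0

    async def run(self, job):
        if self._slots.locked():
            # All running slots are busy: either join the queue or reject.
            if self._queued >= self._max_queued:
                raise QueueFull('queue full')
            self._queued += 1
            try:
                await self._slots.acquire()
            finally:
                self._queued -= 1
        else:
            await self._slots.acquire()
        try:
            return await job()
        finally:
            self._slots.release()


async def main():
    # Created inside the running loop to stay compatible with older Pythons.
    limiter = ToyLimiter(max_running=2, max_queued=3)

    async def job():
        await asyncio.sleep(0.01)
        return 'ok'

    return await asyncio.gather(
        *(limiter.run(job) for _ in range(10)), return_exceptions=True
    )


results = asyncio.run(main())
accepted = sum(r == 'ok' for r in results)
rejected = sum(isinstance(r, QueueFull) for r in results)
print(f'accepted={accepted} rejected={rejected}')
```

Rejecting immediately rather than waiting is the design choice that keeps latency bounded under load; callers that prefer to wait can catch the rejection and resubmit later.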

from pydantic_ai import Agent
from pydantic_ai.exceptions import ModelHTTPError, AgentRunError

agent = Agent('openai:gpt-4o')

try:
    result = agent.run_sync('Hello')
except ModelHTTPError as e:
    print(f'Provider returned HTTP {e.status_code}: {e}')
    if e.status_code == 429:
        # Rate limited: implement backoff
        pass
    elif e.status_code >= 500:
        # Server error: retry or fail gracefully
        pass
except AgentRunError as e:
    # Catch the base class last so the more specific handler above wins
    print(f'Run failed: {e}')
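The 429 and 5xx branches above leave the backoff policy open. One common choice is capped exponential backoff with optional jitter; the helper below is a standard-library sketch under that assumption. The names backoff_delay and should_retry, and the default base and cap values, are hypothetical and not part of PydanticAI.

```python
import random


def backoff_delay(attempt, base=1.0, cap=30.0, jitter=0.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based).

    The delay doubles on each attempt and is capped at `cap`. `jitter` is the
    fraction of the delay that may be added at random to spread out clients.
    """
    delay = min(cap, base * (2 ** attempt))
    return delay + jitter * delay * rng()


def should_retry(status_code):
    """Retry on rate limiting (429) and on server errors (5xx) only."""
    return status_code == 429 or status_code >= 500


# Without jitter the schedule is deterministic.
delays = [backoff_delay(n) for n in range(6)]
print(delays)
```

In an async handler the computed delay would be passed to asyncio.sleep rather than time.sleep, so other tasks keep running while this one waits.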

FallbackExceptionGroup — multi-model fallback failures

from pydantic_ai import Agent
from pydantic_ai.exceptions import FallbackExceptionGroup
from pydantic_ai.models.fallback import FallbackModel

agent = Agent(
    FallbackModel('openai:gpt-4o', 'anthropic:claude-opus-4-5', 'google:gemini-2.0-flash')
)

try:
    result = agent.run_sync('Hello')
except FallbackExceptionGroup as eg:
    print(f'All {len(eg.exceptions)} models failed:')
    for exc in eg.exceptions:
        print(f'  {type(exc).__name__}: {exc}')

Error hooks — intercept errors without try/except


Use Hooks to handle errors at the capability level for cross-cutting concerns like rate-limit backoff:

import asyncio

from pydantic_ai import Agent
from pydantic_ai.capabilities import Hooks
from pydantic_ai.exceptions import ModelHTTPError

retry_hooks = Hooks()

@retry_hooks.on.model_request_error
async def retry_on_rate_limit(ctx, *, request_context, error: Exception):
    if isinstance(error, ModelHTTPError) and error.status_code == 429:
        retry_after = int(getattr(error, 'retry_after', 5))
        # Use asyncio.sleep, not time.sleep, so the event loop is not blocked
        await asyncio.sleep(retry_after)
    # Re-raise so the error still propagates after the back-off
    raise error

agent = Agent('openai:gpt-4o', capabilities=[retry_hooks])

Use FunctionModel to simulate a model that first produces bad output, then good output.

import pytest
from pydantic import BaseModel
from pydantic_ai import Agent, ModelRetry, RunContext
from pydantic_ai.messages import ModelMessage, ModelResponse, ToolCallPart
from pydantic_ai.models.function import FunctionModel, AgentInfo

class Price(BaseModel):
    amount: float
    currency: str

agent = Agent('openai:gpt-4o', output_type=Price)

@agent.output_validator
async def no_negative(ctx: RunContext[None], p: Price) -> Price:
    if p.amount < 0:
        raise ModelRetry('Price cannot be negative, please fix.')
    return p

def test_validator_retries():
    call_count = 0

    def model_fn(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
        nonlocal call_count
        call_count += 1
        # Structured output is returned by calling the agent's output tool;
        # a plain TextPart would not be parsed as a Price.
        output_tool = info.output_tools[0].name
        if call_count == 1:
            # Simulate bad output on first attempt
            return ModelResponse(parts=[ToolCallPart(output_tool, {'amount': -5, 'currency': 'USD'})])
        # Good output on retry
        return ModelResponse(parts=[ToolCallPart(output_tool, {'amount': 9.99, 'currency': 'USD'})])

    with agent.override(model=FunctionModel(model_fn)):
        result = agent.run_sync('Price of widget?')
    assert result.output.amount == 9.99
    assert call_count == 2  # one retry happened

import pytest
from pydantic_ai import Agent, UsageLimits
from pydantic_ai.exceptions import UsageLimitExceeded
from pydantic_ai.models.test import TestModel

agent = Agent('openai:gpt-4o')

def test_request_limit():
    with agent.override(model=TestModel()):
        with pytest.raises(UsageLimitExceeded, match='request'):
            agent.run_sync('hi', usage_limits=UsageLimits(request_limit=0))
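The call-counting pattern in the validator retry test generalises to a small helper that replays a scripted list of responses, one per model call. The sketch below is plain Python: scripted is a hypothetical helper, the two-argument signature mirrors the FunctionModel callbacks used in this guide, and strings stand in for ModelResponse objects so the example runs without the library installed.

```python
def scripted(responses):
    """Build a model function that returns `responses` in order, one per call."""
    remaining = list(responses)
    calls = []

    def model_fn(messages, info):
        calls.append(list(messages))  # record what the "model" was sent
        if not remaining:
            raise AssertionError('model called more times than scripted')
        return remaining.pop(0)

    model_fn.calls = calls  # exposed so tests can assert on the call history
    return model_fn


# Strings stand in for ModelResponse objects in this sketch.
fn = scripted(['bad', 'good'])
first = fn(['prompt'], None)
second = fn(['prompt', 'retry feedback'], None)
print(first, second, len(fn.calls))
```

Failing loudly on an unscripted extra call makes an unexpected additional retry show up as a test failure instead of passing silently.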

3. Verify UnexpectedModelBehavior is raised when retries are exhausted

import pytest
from pydantic_ai import Agent, ModelRetry, RunContext
from pydantic_ai.exceptions import UnexpectedModelBehavior
from pydantic_ai.models.test import TestModel

agent = Agent('openai:gpt-4o')

@agent.tool(retries=2)
def always_fails(ctx: RunContext[None]) -> str:
    raise ModelRetry('always broken')

def test_exhausted_retries():
    # TestModel calls every registered tool, so always_fails is invoked
    with agent.override(model=TestModel()):
        with pytest.raises(UnexpectedModelBehavior):
            agent.run_sync('go')

4. Inspect messages after a failure with capture_run_messages

import pytest
from pydantic import BaseModel
from pydantic_ai import Agent, capture_run_messages
from pydantic_ai.exceptions import UnexpectedModelBehavior
from pydantic_ai.models.test import TestModel

class Strict(BaseModel):
    required_field: str

agent = Agent('openai:gpt-4o', output_type=Strict, retries=1)

def test_messages_on_failure():
    with capture_run_messages() as messages:
        with pytest.raises(UnexpectedModelBehavior):
            with agent.override(model=TestModel(custom_output_text='not json')):
                agent.run_sync('go')
    # Messages contain the full conversation up to the point of failure
    assert len(messages) > 0
    print('Conversation had', len(messages), 'messages before failure')

5. Inject specific exceptions via FunctionModel

import pytest
from pydantic_ai import Agent
from pydantic_ai.exceptions import UnexpectedModelBehavior
from pydantic_ai.messages import ModelMessage, ModelResponse
from pydantic_ai.models.function import FunctionModel, AgentInfo

def model_that_raises(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
    # Simulate provider returning an unexpected response
    return ModelResponse(parts=[])  # empty parts triggers UnexpectedModelBehavior

agent = Agent(FunctionModel(model_that_raises))

def test_empty_response():
    with pytest.raises(UnexpectedModelBehavior):
        agent.run_sync('anything')

TestModel never calls a real LLM. Configure it with canned outputs and inspect every call it received:

import pytest
from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

class Answer(BaseModel):
    value: int
    reasoning: str

agent = Agent('openai:gpt-4o', output_type=Answer)

def test_structured_output():
    with agent.override(model=TestModel()):
        result = agent.run_sync('What is 2 + 2?')
    assert isinstance(result.output, Answer)
    assert isinstance(result.output.value, int)

from pydantic_ai.models.test import TestModel

# Return a specific text response
model = TestModel(custom_output_text='The answer is 42.')

# Return structured output as a dict
model = TestModel(custom_output_args={'value': 42, 'reasoning': 'Basic arithmetic'})

# Simulate a tool call before responding
model = TestModel(
    call_tools=['web_search'],  # tool names to call
    custom_output_text='Found it.',
)

from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

model = TestModel()
agent = Agent('openai:gpt-4o', output_type=str)

with agent.override(model=model):
    result = agent.run_sync('Hello')

print(model.agent_model_requests)   # list of ModelRequest objects
print(model.agent_model_responses)  # list of ModelResponse objects

FunctionModel — Python function as model

from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart
from pydantic_ai.models.function import FunctionModel, AgentInfo

def my_model(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
    # Find the most recent part that carries string content
    last_user = next(
        (p.content for m in reversed(messages)
         for p in m.parts if hasattr(p, 'content') and isinstance(p.content, str)),
        'no message'
    )
    return ModelResponse(parts=[TextPart(content=f'Echo: {last_user}')])

agent = Agent('openai:gpt-4o')

def test_echo_response():
    with agent.override(model=FunctionModel(my_model)):
        result = agent.run_sync('Hello there')
    assert result.output == 'Echo: Hello there'

import json

from pydantic_ai.messages import ModelResponse, TextPart, ToolCallPart
from pydantic_ai.models.function import FunctionModel, AgentInfo

def tool_calling_model(messages, info: AgentInfo) -> ModelResponse:
    if len(messages) == 1:
        # First turn: call a tool
        return ModelResponse(parts=[
            ToolCallPart(
                tool_name='get_weather',
                args=json.dumps({'city': 'Paris'}),
                tool_call_id='call_1',
            )
        ])
    # Second turn: respond with text
    return ModelResponse(parts=[TextPart(content='The weather in Paris is sunny.')])

capture_run_messages — inspect all messages after a run

from pydantic_ai import Agent, capture_run_messages

agent = Agent('openai:gpt-4o')

with capture_run_messages() as messages:
    result = agent.run_sync('What is 2 + 2?')

for msg in messages:
    print(type(msg).__name__, '->', [type(p).__name__ for p in msg.parts])

import asyncio

import httpx
from pydantic_ai import Agent, ModelRetry, RunContext

agent = Agent('openai:gpt-4o')

@agent.tool(retries=4)
async def fetch_api(ctx: RunContext[None], url: str) -> str:
    delay = 2 ** ctx.retry  # 1s, 2s, 4s, 8s
    try:
        async with httpx.AsyncClient() as client:
            r = await client.get(url, timeout=10.0)
            r.raise_for_status()
            return r.text[:2000]
    except (httpx.HTTPStatusError, httpx.TimeoutException) as e:
        await asyncio.sleep(delay)
        raise ModelRetry(f'Request failed ({e}); retrying (attempt {ctx.retry + 1})')

import json

from pydantic_ai import Agent, ModelRetry, RunContext

agent = Agent('openai:gpt-4o')

@agent.tool(retries=3)
def parse_structured(ctx: RunContext[None], raw: str) -> dict:
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ModelRetry(
            f'Invalid JSON at position {e.pos}: {e.msg}. '
            'Return only valid JSON, no markdown fences.'
        )
    if 'id' not in data:
        raise ModelRetry('Response JSON must include an "id" field.')
    return data

Return typed result objects so callers get structured information instead of raw exceptions (here a Pydantic model for success and a dataclass for failure):

from dataclasses import dataclass

from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_ai.exceptions import AgentRunError, UsageLimitExceeded, UnexpectedModelBehavior

class RunSuccess(BaseModel):
    output: str
    tokens_used: int

@dataclass
class RunFailure:
    reason: str
    kind: str

async def run_with_report(agent: Agent, prompt: str) -> RunSuccess | RunFailure:
    try:
        result = await agent.run(prompt)
        return RunSuccess(
            output=result.output,
            tokens_used=result.usage().total_tokens,
        )
    except UsageLimitExceeded as e:
        return RunFailure(reason=str(e), kind='budget_exceeded')
    except UnexpectedModelBehavior as e:
        return RunFailure(reason=str(e), kind='model_error')
    except AgentRunError as e:
        # Base class last: catches any remaining run-time failure
        return RunFailure(reason=str(e), kind='run_error')
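On the calling side, a success-or-failure result like this is consumed by branching on its type. The sketch below shows one way to do that with the standard library only: both result types are redefined as plain dataclasses so it runs on its own, and describe and RETRYABLE_KINDS are hypothetical names that encode an example policy, not a recommendation from the library.

```python
from dataclasses import dataclass


@dataclass
class RunSuccess:  # plain-dataclass mirror of the success type, for this sketch
    output: str
    tokens_used: int


@dataclass
class RunFailure:
    reason: str
    kind: str


# Hypothetical policy: only generic run errors are worth retrying.
RETRYABLE_KINDS = {'run_error'}


def describe(outcome):
    """Turn a RunSuccess or RunFailure into a one-line status string."""
    if isinstance(outcome, RunSuccess):
        return f'ok ({outcome.tokens_used} tokens): {outcome.output}'
    hint = 'retryable' if outcome.kind in RETRYABLE_KINDS else 'not retryable'
    return f'failed [{outcome.kind}, {hint}]: {outcome.reason}'


print(describe(RunSuccess(output='hi', tokens_used=12)))
print(describe(RunFailure(reason='limit', kind='budget_exceeded')))
```

Keeping the kind as a short string makes it easy to log and to route on, while the reason preserves the original exception message for debugging.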

pytest fixtures — agents pre-wired for error testing

import pytest
from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart
from pydantic_ai.models.function import FunctionModel, AgentInfo
from pydantic_ai.models.test import TestModel

@pytest.fixture
def deterministic_agent():
    agent = Agent('openai:gpt-4o', system_prompt='Be helpful.')
    with agent.override(model=TestModel(seed=42)):
        yield agent

@pytest.fixture
def failing_agent():
    agent = Agent('openai:gpt-4o')

    def always_empty(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
        return ModelResponse(parts=[TextPart('')])

    with agent.override(model=FunctionModel(always_empty)):
        yield agent

@pytest.fixture
def test_model():
    return TestModel(custom_output_text='mocked response')

def test_agent_returns_string(test_model):
    agent = Agent('openai:gpt-4o', output_type=str)
    with agent.override(model=test_model):
        result = agent.run_sync('Any prompt')
    assert isinstance(result.output, str)

import pytest
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

@pytest.mark.parametrize('prompt,expected', [
    ('Say hello to the world', 'Hello, world!'),
    ('Say goodbye to the world', 'Goodbye, world!'),
])
def test_snapshot(prompt, expected):
    agent = Agent('openai:gpt-4o')
    with agent.override(model=TestModel(custom_output_text=expected)):
        result = agent.run_sync(prompt)
    assert result.output == expected

| Symbol | Module | Notes |
| --- | --- | --- |
| ModelRetry | pydantic_ai.exceptions | Triggers a model retry from tool/validator |
| UnexpectedModelBehavior | pydantic_ai.exceptions | Terminal failure after retries exhausted |
| AgentRunError | pydantic_ai.exceptions | Base for all run-time failures |
| ModelHTTPError | pydantic_ai.exceptions | Non-2xx HTTP; has .status_code |
| UsageLimitExceeded | pydantic_ai.exceptions | Usage budget exceeded |
| UserError | pydantic_ai.exceptions | Developer configuration error |
| ConcurrencyLimitExceeded | pydantic_ai.exceptions | Queue depth exceeded |
| FallbackExceptionGroup | pydantic_ai.exceptions | All fallback candidates failed |
| ApprovalRequired | pydantic_ai.exceptions | HITL approval needed |
| UsageLimits | pydantic_ai | Budget configuration dataclass |
| RunUsage | pydantic_ai | Token/request counters |
| TestModel | pydantic_ai.models.test | Zero-LLM test double |
| FunctionModel | pydantic_ai.models.function | Python function as model |
| capture_run_messages | pydantic_ai | Collect messages from a run |
| agent.override() | pydantic_ai.agent.Agent | Context-manager model swap |