PydanticAI: Advanced Error Handling & Testing
Verified against pydantic-ai==1.101.0. Source modules: pydantic_ai.exceptions, pydantic_ai.usage, pydantic_ai.concurrency, pydantic_ai.models.function, pydantic_ai.agent.
PydanticAI exposes a small, predictable exception hierarchy. Understanding each exception type — when it is raised, how to catch it, and how to test that your agent handles it correctly — is essential for production-grade agents.
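Because these exception types form a tree (laid out in the next section), the order of `except` clauses matters: a handler for the base class placed first would swallow the more specific subclasses. The sketch below uses simplified stand-in classes that only mirror the names and parent/child relationships of the real ones in pydantic_ai.exceptions. The constructors and the `classify` helper are assumptions made for illustration, not library API.

```python
# Stand-in classes: same names and inheritance as the documented tree,
# but simplified constructors (the real ones take different arguments).
class AgentRunError(RuntimeError): ...
class UsageLimitExceeded(AgentRunError): ...
class UnexpectedModelBehavior(AgentRunError): ...
class ModelAPIError(AgentRunError): ...


class ModelHTTPError(ModelAPIError):
    def __init__(self, status_code: int) -> None:
        super().__init__(f"status_code: {status_code}")
        self.status_code = status_code


def classify(exc: Exception) -> str:
    """Hypothetical helper: map an exception to a coarse failure label."""
    try:
        raise exc
    except ModelHTTPError as e:  # most specific first
        return "rate_limited" if e.status_code == 429 else "provider_http"
    except UsageLimitExceeded:
        return "budget"
    except UnexpectedModelBehavior:
        return "model"
    except AgentRunError:  # base class last, catches everything else in the tree
        return "run"
    except Exception:
        return "other"
```

With the real classes the same ordering applies: catch `ModelHTTPError` before `AgentRunError`, otherwise the status-code branch is never reached.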
Exception hierarchy
    Exception
    └── AgentRunError                 # base for all errors that occur during agent.run*()
        ├── UsageLimitExceeded        # token / request / tool-call budget exhausted
        ├── UnexpectedModelBehavior   # model produced structurally invalid output
        └── ModelAPIError
            └── ModelHTTPError        # non-2xx HTTP from provider (has .status_code)

    ModelRetry                        # not an error: a signal to retry (raised inside tools/validators)
    UserError                         # programming error (bad arguments, unsupported combination)
    ConcurrencyLimitExceeded          # queue depth exceeded when using ConcurrencyLimiter
    ApprovalRequired                  # tool call needs HITL approval before executing
    FallbackExceptionGroup            # all FallbackModel candidates failed

ModelRetry — ask the model to try again
ModelRetry(message) is raised inside a tool or output validator to tell PydanticAI “the model’s output/args are wrong; send this feedback and retry”. It is an ordinary Python exception class, but it acts as a control signal rather than a failure: the framework catches it before it reaches your caller.

    from pydantic_ai import Agent, ModelRetry, RunContext

    agent = Agent('openai:gpt-4o')

    @agent.tool
    async def lookup_user(ctx: RunContext[None], user_id: int) -> dict:
        if user_id <= 0:
            raise ModelRetry(f'user_id must be positive, got {user_id!r}')
        return {'id': user_id, 'name': 'Alice'}

    result = agent.run_sync('Look up user -1, then user 42.')
    print(result.output)

If the model keeps producing bad args after the configured retries budget (the retries argument on the agent or the tool) is used up, PydanticAI raises UnexpectedModelBehavior.
ModelRetry in output validators
    from pydantic import BaseModel
    from pydantic_ai import Agent, ModelRetry, RunContext

    class Summary(BaseModel):
        title: str
        bullets: list[str]
        word_count: int

    agent = Agent('openai:gpt-4o', output_type=Summary)

    @agent.output_validator
    async def validate_bullets(ctx: RunContext[None], out: Summary) -> Summary:
        if len(out.bullets) < 3:
            raise ModelRetry(
                f'Need at least 3 bullet points, got {len(out.bullets)}. '
                'Please add more detail.'
            )
        if out.word_count <= 0:
            raise ModelRetry('word_count must be positive.')
        return out

    result = agent.run_sync('Summarise the Pydantic AI framework.')
    print(result.output)

UnexpectedModelBehavior — model misbehaved
Raised when:
- The model returns output that cannot be parsed or validated after exhausting all retries.
- The model emits an unrecognised response structure.
- A tool-call loop exceeds the retry budget.
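The retry budget behind these cases can be pictured as a bounded loop: each retry signal feeds its message back as feedback for the next attempt, and once the budget is spent the run ends with a terminal error. The following is a stdlib-only sketch of that idea, not PydanticAI's implementation; `RetrySignal`, `RetriesExhausted`, and `run_with_budget` are hypothetical stand-ins for ModelRetry, UnexpectedModelBehavior, and the agent loop, and it assumes a budget of `retries` means one initial attempt plus `retries` further attempts.

```python
class RetrySignal(Exception):
    """Stand-in for ModelRetry: carries feedback for the next attempt."""


class RetriesExhausted(Exception):
    """Stand-in for UnexpectedModelBehavior."""


def run_with_budget(attempt_fn, retries: int):
    """Call attempt_fn(feedback) until it succeeds or the budget is spent."""
    feedback = None
    for _ in range(retries + 1):  # first try plus `retries` retries
        try:
            return attempt_fn(feedback)
        except RetrySignal as signal:
            # In a real agent this message would be sent back to the model.
            feedback = str(signal)
    raise RetriesExhausted(f"exceeded {retries} retries; last feedback: {feedback}")
```

A function that fails twice and then succeeds passes with `retries=2` but raises `RetriesExhausted` with `retries=1`, which is the shape the tests later in this page assert on.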
    from pydantic import BaseModel
    from pydantic_ai import Agent, capture_run_messages
    from pydantic_ai.exceptions import UnexpectedModelBehavior
    from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart
    from pydantic_ai.models.function import AgentInfo, FunctionModel

    # Simulate a model that always answers with plain text that is not valid
    # structured output for the agent's output_type
    def broken_model(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
        return ModelResponse(parts=[TextPart(content='THIS IS NOT VALID JSON')])

    class Answer(BaseModel):
        value: int

    agent = Agent(FunctionModel(broken_model), output_type=Answer)

    with capture_run_messages() as msgs:
        try:
            agent.run_sync('What is 2 + 2?')
        except UnexpectedModelBehavior as e:
            print('Model misbehaved:', e)
            print('Messages up to failure:', len(msgs))

Catching UnexpectedModelBehavior at the call site
    import logging
    from pydantic_ai import Agent
    from pydantic_ai.exceptions import UnexpectedModelBehavior

    logger = logging.getLogger(__name__)
    agent = Agent('openai:gpt-4o')

    async def safe_run(prompt: str) -> str | None:
        try:
            result = await agent.run(prompt)
            return result.output
        except UnexpectedModelBehavior as e:
            logger.error('agent_misbehaved', exc_info=e, extra={'prompt': prompt[:200]})
            return None  # or raise a domain-specific error

UsageLimitExceeded — budget controls
UsageLimits is a dataclass you pass to agent.run*(usage_limits=...). When any limit is exceeded, UsageLimitExceeded (a subclass of AgentRunError) is raised.
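To make the budget idea concrete, here is a stdlib-only sketch of how such limits can be enforced around a loop of model requests. It is not the library's code: `Budget`, `Counters`, `BudgetExceeded`, and `simulate_run` are hypothetical stand-ins, and the sketch assumes the request count is checked before each request while token totals are checked once a response has been counted.

```python
from dataclasses import dataclass
from typing import Optional


class BudgetExceeded(Exception):
    """Stand-in for UsageLimitExceeded."""


@dataclass
class Budget:
    """Stand-in for UsageLimits, reduced to two limits."""
    request_limit: Optional[int] = 50
    total_tokens_limit: Optional[int] = None


@dataclass
class Counters:
    """Stand-in for the running usage totals of one run."""
    requests: int = 0
    total_tokens: int = 0


def simulate_run(budget: Budget, tokens_per_request: int, rounds: int) -> Counters:
    """Pretend to make `rounds` model requests, enforcing the budget as we go."""
    used = Counters()
    for _ in range(rounds):
        # The request count can be checked before the request is sent.
        if budget.request_limit is not None and used.requests + 1 > budget.request_limit:
            raise BudgetExceeded(f"next request would exceed request_limit of {budget.request_limit}")
        used.requests += 1
        used.total_tokens += tokens_per_request
        # Token totals are only known after the response has been counted.
        if budget.total_tokens_limit is not None and used.total_tokens > budget.total_tokens_limit:
            raise BudgetExceeded(f"exceeded total_tokens_limit of {budget.total_tokens_limit}")
    return used
```

Under these assumptions a `request_limit` of 0 rejects the very first request, which is why a zero limit is a convenient way to force the error in a test.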
    from pydantic_ai import Agent, UsageLimits
    from pydantic_ai.exceptions import UsageLimitExceeded

    agent = Agent('openai:gpt-4o')

    try:
        result = agent.run_sync(
            'Write a 10 000-word essay on the history of computing.',
            usage_limits=UsageLimits(
                output_tokens_limit=500,  # hard cap on output tokens
                request_limit=3,          # max LLM round-trips
            ),
        )
    except UsageLimitExceeded as e:
        print('Limit hit:', e)

All UsageLimits fields
    from pydantic_ai import UsageLimits

    limits = UsageLimits(
        request_limit=50,                  # max number of model requests (default 50)
        tool_calls_limit=20,               # max successful tool executions
        input_tokens_limit=8_000,          # max prompt tokens per run
        output_tokens_limit=2_000,         # max generated tokens per run
        total_tokens_limit=10_000,         # combined input + output cap
        count_tokens_before_request=True,  # pre-check tokens (Anthropic, Google, Bedrock)
    )

Tracking usage after a successful run
    from pydantic_ai import Agent, RunUsage

    agent = Agent('openai:gpt-4o')
    result = agent.run_sync('Hello')

    usage: RunUsage = result.usage()
    print(f'requests={usage.requests} input={usage.input_tokens} output={usage.output_tokens} total={usage.total_tokens}')

Accumulating usage across multiple runs
    from pydantic_ai import Agent, RunUsage

    agent = Agent('openai:gpt-4o')
    shared_usage = RunUsage()

    # Passing the same RunUsage object to every run accumulates the totals in it
    for prompt in ['One', 'Two', 'Three']:
        result = agent.run_sync(prompt, usage=shared_usage)

    print('Grand total tokens:', shared_usage.total_tokens)

UserError — programming mistakes
UserError means you (the developer) passed invalid arguments or combined features in an unsupported way. It is typically raised at agent construction or at the start of a run, before any model request is made.
    from pydantic_ai import Agent, RunContext
    from pydantic_ai.exceptions import UserError

    try:
        agent = Agent('openai:gpt-4o', output_type=int)

        @agent.output_validator
        def validate_int(ctx: RunContext[None], out: int) -> int:
            return out

        # Overriding output_type when an output_validator is already registered
        # raises UserError because the validator's type would no longer match.
        agent.run_sync('hi', output_type=str)
    except UserError as e:
        print('Developer error:', e)

In tests, UserError is an expected result of misconfigured agents; assert on it rather than catching it silently.
ConcurrencyLimitExceeded — queue depth exceeded
When you use ConcurrencyLimiter(max_running=N, max_queued=M) and more than N + M tasks arrive simultaneously, ConcurrencyLimitExceeded is raised.
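The N + M arithmetic can be checked with a small stdlib model of a bounded limiter. `TinyLimiter` and `QueueFull` below are hypothetical stand-ins written for this illustration; they are not how ConcurrencyLimiter is implemented, only a sketch of the admission rule described above (N slots running, M waiting, everything beyond that rejected).

```python
import asyncio


class QueueFull(Exception):
    """Stand-in for ConcurrencyLimitExceeded."""


class TinyLimiter:
    """Illustrative limiter: at most max_running active and max_queued waiting."""

    def __init__(self, max_running: int, max_queued: int) -> None:
        self._sem = asyncio.Semaphore(max_running)
        self._max_queued = max_queued
        self._waiting = 0

    async def run(self, coro_fn):
        if self._sem.locked():
            # No free slot: join the queue if there is room, otherwise reject.
            if self._waiting >= self._max_queued:
                raise QueueFull("queue depth exceeded")
            self._waiting += 1
            try:
                await self._sem.acquire()
            finally:
                self._waiting -= 1
        else:
            await self._sem.acquire()
        try:
            return await coro_fn()
        finally:
            self._sem.release()


async def demo(total: int = 10) -> tuple[int, int]:
    """Submit `total` tasks at once; return (accepted, rejected)."""
    limiter = TinyLimiter(max_running=2, max_queued=3)

    async def work() -> str:
        await asyncio.sleep(0.01)
        return "ok"

    results = await asyncio.gather(
        *(limiter.run(work) for _ in range(total)), return_exceptions=True
    )
    rejected = sum(isinstance(r, QueueFull) for r in results)
    return total - rejected, rejected
```

In this model, ten tasks submitted together against `max_running=2, max_queued=3` leave five accepted and five rejected. Whether the real limiter rejects exactly the same tasks depends on arrival timing, so treat the numbers as a property of the sketch.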
    import asyncio
    from pydantic_ai import Agent, ConcurrencyLimiter
    from pydantic_ai.exceptions import ConcurrencyLimitExceeded

    agent = Agent(
        'openai:gpt-4o',
        max_concurrency=ConcurrencyLimiter(max_running=2, max_queued=3, name='demo'),
    )

    async def main():
        tasks = [agent.run(f'Task {i}') for i in range(10)]
        # return_exceptions=True keeps one rejected task from cancelling the rest
        results = await asyncio.gather(*tasks, return_exceptions=True)
        for i, r in enumerate(results):
            if isinstance(r, ConcurrencyLimitExceeded):
                print(f'Task {i}: queue full, rejected')
            else:
                print(f'Task {i}: ok')

    asyncio.run(main())

ModelHTTPError — provider HTTP failures
    from pydantic_ai import Agent
    from pydantic_ai.exceptions import AgentRunError, ModelHTTPError

    agent = Agent('openai:gpt-4o')

    try:
        result = agent.run_sync('Hello')
    except ModelHTTPError as e:
        print(f'Provider returned HTTP {e.status_code}: {e}')
        if e.status_code == 429:
            # Rate limited: implement backoff
            pass
        elif e.status_code >= 500:
            # Server error: retry or fail gracefully
            pass
    except AgentRunError as e:
        # Catch the base class last so the more specific handler above runs first
        print(f'Run failed: {e}')

FallbackExceptionGroup — multi-model fallback failures
    from pydantic_ai import Agent
    from pydantic_ai.exceptions import FallbackExceptionGroup
    from pydantic_ai.models.fallback import FallbackModel

    agent = Agent(
        FallbackModel('openai:gpt-4o', 'anthropic:claude-opus-4-5', 'google:gemini-2.0-flash')
    )

    try:
        result = agent.run_sync('Hello')
    except FallbackExceptionGroup as eg:
        print(f'All {len(eg.exceptions)} models failed:')
        for exc in eg.exceptions:
            print(f'  {type(exc).__name__}: {exc}')

Error hooks — intercept errors without try/except
Use Hooks to handle errors at the capability level for cross-cutting concerns like rate-limit backoff:
    import asyncio
    from pydantic_ai import Agent
    from pydantic_ai.capabilities import Hooks
    from pydantic_ai.exceptions import ModelHTTPError

    retry_hooks = Hooks()

    @retry_hooks.on.model_request_error
    async def retry_on_rate_limit(ctx, *, request_context, error: Exception):
        if isinstance(error, ModelHTTPError) and error.status_code == 429:
            # Fall back to 5 seconds when the error carries no retry_after attribute
            retry_after = int(getattr(error, 'retry_after', 5))
            # Back off with asyncio.sleep so the event loop is not blocked,
            # then re-raise the original error
            await asyncio.sleep(retry_after)
        raise error

    agent = Agent('openai:gpt-4o', capabilities=[retry_hooks])

Testing error paths
1. Test that ModelRetry triggers properly
Use FunctionModel to simulate a model that first produces bad output, then good output.
    from pydantic import BaseModel
    from pydantic_ai import Agent, ModelRetry, RunContext
    from pydantic_ai.messages import ModelMessage, ModelResponse, ToolCallPart
    from pydantic_ai.models.function import AgentInfo, FunctionModel

    class Price(BaseModel):
        amount: float
        currency: str

    agent = Agent('openai:gpt-4o', output_type=Price)

    @agent.output_validator
    async def no_negative(ctx: RunContext[None], p: Price) -> Price:
        if p.amount < 0:
            raise ModelRetry('Price cannot be negative, please fix.')
        return p

    def test_validator_retries():
        call_count = 0

        def model_fn(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
            nonlocal call_count
            call_count += 1
            # Structured output is delivered by calling the agent's output tool,
            # so reply with a ToolCallPart rather than plain text.
            output_tool = info.output_tools[0].name
            if call_count == 1:
                # Bad output on the first attempt: the validator rejects it
                args = {'amount': -5, 'currency': 'USD'}
            else:
                # Good output on the retry
                args = {'amount': 9.99, 'currency': 'USD'}
            return ModelResponse(parts=[ToolCallPart(tool_name=output_tool, args=args)])

        with agent.override(model=FunctionModel(model_fn)):
            result = agent.run_sync('Price of widget?')

        assert result.output.amount == 9.99
        assert call_count == 2  # one retry happened

2. Assert UsageLimitExceeded is raised
    import pytest
    from pydantic_ai import Agent, UsageLimits
    from pydantic_ai.exceptions import UsageLimitExceeded
    from pydantic_ai.models.test import TestModel

    agent = Agent('openai:gpt-4o')

    def test_request_limit():
        with agent.override(model=TestModel()):
            # A limit of 0 rejects the very first model request
            with pytest.raises(UsageLimitExceeded, match='request'):
                agent.run_sync('hi', usage_limits=UsageLimits(request_limit=0))

3. Verify UnexpectedModelBehavior is raised when retries are exhausted
    import pytest
    from pydantic_ai import Agent, ModelRetry, RunContext
    from pydantic_ai.exceptions import UnexpectedModelBehavior
    from pydantic_ai.models.test import TestModel

    agent = Agent('openai:gpt-4o')

    @agent.tool(retries=2)
    def always_fails(ctx: RunContext[None]) -> str:
        raise ModelRetry('always broken')

    def test_exhausted_retries():
        # TestModel calls every registered tool, so always_fails is invoked
        # until its retry budget runs out
        with agent.override(model=TestModel()):
            with pytest.raises(UnexpectedModelBehavior):
                agent.run_sync('go')

4. Inspect messages after a failure with capture_run_messages
    import pytest
    from pydantic import BaseModel
    from pydantic_ai import Agent, capture_run_messages
    from pydantic_ai.exceptions import UnexpectedModelBehavior
    from pydantic_ai.models.test import TestModel

    class Strict(BaseModel):
        required_field: str

    agent = Agent('openai:gpt-4o', output_type=Strict, retries=1)

    def test_messages_on_failure():
        # Output args that never satisfy Strict, so validation fails on every attempt
        bad_model = TestModel(custom_output_args={'unexpected_field': 1})
        with capture_run_messages() as messages:
            with pytest.raises(UnexpectedModelBehavior):
                with agent.override(model=bad_model):
                    agent.run_sync('go')

        # Messages contain the full conversation up to the point of failure
        assert len(messages) > 0
        print('Conversation had', len(messages), 'messages before failure')

5. Inject specific exceptions via FunctionModel
    import pytest
    from pydantic_ai import Agent
    from pydantic_ai.exceptions import UnexpectedModelBehavior
    from pydantic_ai.messages import ModelMessage, ModelResponse
    from pydantic_ai.models.function import AgentInfo, FunctionModel

    def model_that_raises(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
        # Simulate provider returning an unexpected response
        return ModelResponse(parts=[])  # empty parts triggers UnexpectedModelBehavior

    agent = Agent(FunctionModel(model_that_raises))

    def test_empty_response():
        with pytest.raises(UnexpectedModelBehavior):
            agent.run_sync('anything')

TestModel — zero-LLM test double
TestModel never calls a real LLM. Configure it with canned outputs and inspect what the agent sent to it:
    from pydantic import BaseModel
    from pydantic_ai import Agent
    from pydantic_ai.models.test import TestModel

    class Answer(BaseModel):
        value: int
        reasoning: str

    agent = Agent('openai:gpt-4o', output_type=Answer)

    def test_structured_output():
        # TestModel generates schema-valid data for the output type
        with agent.override(model=TestModel()):
            result = agent.run_sync('What is 2 + 2?')
        assert isinstance(result.output, Answer)
        assert isinstance(result.output.value, int)

Configuring TestModel responses
    from pydantic_ai.models.test import TestModel

    # Return a specific text response
    model = TestModel(custom_output_text='The answer is 42.')

    # Return structured output as a dict
    model = TestModel(custom_output_args={'value': 42, 'reasoning': 'Basic arithmetic'})

    # Simulate a tool call before responding
    model = TestModel(
        call_tools=['web_search'],  # tool names to call
        custom_output_text='Found it.',
    )

Inspecting calls made to TestModel
    from pydantic_ai import Agent
    from pydantic_ai.models.test import TestModel

    model = TestModel()
    agent = Agent('openai:gpt-4o', output_type=str)

    with agent.override(model=model):
        result = agent.run_sync('Hello')

    print(model.last_model_request_parameters)  # ModelRequestParameters of the most recent request
    print(result.all_messages())                # ModelRequest / ModelResponse history of the run

FunctionModel — Python function as model
    from pydantic_ai import Agent
    from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart
    from pydantic_ai.models.function import AgentInfo, FunctionModel

    def my_model(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
        # Find the most recent part that carries string content
        last_user = next(
            (p.content for m in reversed(messages)
             for p in m.parts if hasattr(p, 'content') and isinstance(p.content, str)),
            'no message'
        )
        return ModelResponse(parts=[TextPart(content=f'Echo: {last_user}')])

    agent = Agent('openai:gpt-4o')

    def test_echo_response():
        with agent.override(model=FunctionModel(my_model)):
            result = agent.run_sync('Hello there')
        assert result.output == 'Echo: Hello there'

Simulating tool calls in FunctionModel
    import json
    from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart, ToolCallPart
    from pydantic_ai.models.function import AgentInfo, FunctionModel

    def tool_calling_model(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
        if len(messages) == 1:
            # First turn: call a tool
            return ModelResponse(parts=[
                ToolCallPart(
                    tool_name='get_weather',
                    args=json.dumps({'city': 'Paris'}),
                    tool_call_id='call_1',
                )
            ])
        # Second turn: respond with text
        return ModelResponse(parts=[TextPart(content='The weather in Paris is sunny.')])

capture_run_messages — inspect all messages after a run
    from pydantic_ai import Agent, capture_run_messages

    agent = Agent('openai:gpt-4o')

    with capture_run_messages() as messages:
        result = agent.run_sync('What is 2 + 2?')

    for msg in messages:
        print(type(msg).__name__, '-', [type(p).__name__ for p in msg.parts])

Retry patterns
Tool with exponential backoff
    import asyncio
    import httpx
    from pydantic_ai import Agent, ModelRetry, RunContext

    agent = Agent('openai:gpt-4o')

    @agent.tool(retries=4)
    async def fetch_api(ctx: RunContext[None], url: str) -> str:
        delay = 2 ** ctx.retry  # 1s, 2s, 4s, 8s
        try:
            async with httpx.AsyncClient() as client:
                r = await client.get(url, timeout=10.0)
                r.raise_for_status()
                return r.text[:2000]
        except (httpx.HTTPStatusError, httpx.TimeoutException) as e:
            # Wait before handing control back to the model for another attempt
            await asyncio.sleep(delay)
            raise ModelRetry(f'Request failed ({e}); retrying (attempt {ctx.retry + 1})')

Conditional retry based on error type
    import json
    from pydantic_ai import Agent, ModelRetry, RunContext

    agent = Agent('openai:gpt-4o')

    @agent.tool(retries=3)
    def parse_structured(ctx: RunContext[None], raw: str) -> dict:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as e:
            raise ModelRetry(
                f'Invalid JSON at position {e.pos}: {e.msg}. '
                'Return only valid JSON, no markdown fences.'
            )
        if 'id' not in data:
            raise ModelRetry('Response JSON must include an "id" field.')
        return data

Structured error reporting
Use typed result objects for the error surface so callers get structured information instead of raw exceptions:
    from dataclasses import dataclass
    from pydantic import BaseModel
    from pydantic_ai import Agent
    from pydantic_ai.exceptions import AgentRunError, UsageLimitExceeded, UnexpectedModelBehavior

    class RunSuccess(BaseModel):
        output: str
        tokens_used: int

    @dataclass
    class RunFailure:
        reason: str
        kind: str

    async def run_with_report(agent: Agent, prompt: str) -> RunSuccess | RunFailure:
        try:
            result = await agent.run(prompt)
            return RunSuccess(
                output=result.output,
                tokens_used=result.usage().total_tokens,
            )
        except UsageLimitExceeded as e:
            return RunFailure(reason=str(e), kind='budget_exceeded')
        except UnexpectedModelBehavior as e:
            return RunFailure(reason=str(e), kind='model_error')
        except AgentRunError as e:
            # Base class last: any other run-time failure
            return RunFailure(reason=str(e), kind='run_error')

pytest fixtures — agents pre-wired for error testing
    import pytest
    from pydantic_ai import Agent
    from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart
    from pydantic_ai.models.function import AgentInfo, FunctionModel
    from pydantic_ai.models.test import TestModel

    @pytest.fixture
    def deterministic_agent():
        agent = Agent('openai:gpt-4o', system_prompt='Be helpful.')
        with agent.override(model=TestModel(seed=42)):
            yield agent

    @pytest.fixture
    def failing_agent():
        agent = Agent('openai:gpt-4o')

        def always_empty(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
            return ModelResponse(parts=[TextPart('')])

        with agent.override(model=FunctionModel(always_empty)):
            yield agent

    @pytest.fixture
    def test_model():
        return TestModel(custom_output_text='mocked response')

    def test_agent_returns_string(test_model):
        agent = Agent('openai:gpt-4o', output_type=str)
        with agent.override(model=test_model):
            result = agent.run_sync('Any prompt')
        assert isinstance(result.output, str)

Parametrised snapshot testing
    import pytest
    from pydantic_ai import Agent
    from pydantic_ai.models.test import TestModel

    @pytest.mark.parametrize('prompt,expected', [
        ('Say hello to the world', 'Hello, world!'),
        ('Say goodbye to the world', 'Goodbye, world!'),
    ])
    def test_snapshot(prompt, expected):
        agent = Agent('openai:gpt-4o')
        with agent.override(model=TestModel(custom_output_text=expected)):
            result = agent.run_sync(prompt)
        assert result.output == expected

Reference
| Symbol | Module | Notes |
|---|---|---|
| ModelRetry | pydantic_ai.exceptions | Triggers a model retry from tool/validator |
| UnexpectedModelBehavior | pydantic_ai.exceptions | Terminal failure after retries exhausted |
| AgentRunError | pydantic_ai.exceptions | Base for all run-time failures |
| ModelHTTPError | pydantic_ai.exceptions | Non-2xx HTTP, has .status_code |
| UsageLimitExceeded | pydantic_ai.exceptions | Usage budget exceeded |
| UserError | pydantic_ai.exceptions | Developer configuration error |
| ConcurrencyLimitExceeded | pydantic_ai.exceptions | Queue depth exceeded |
| FallbackExceptionGroup | pydantic_ai.exceptions | All fallback candidates failed |
| ApprovalRequired | pydantic_ai.exceptions | HITL approval needed |
| UsageLimits | pydantic_ai | Budget configuration dataclass |
| RunUsage | pydantic_ai | Token/request counters |
| TestModel | pydantic_ai.models.test | Zero-LLM test double |
| FunctionModel | pydantic_ai.models.function | Python function as model |
| capture_run_messages | pydantic_ai | Collect messages from a run |
| agent.override() | pydantic_ai.agent.Agent | Context-manager model swap |