PydanticAI: Advanced Error Handling & Testing

Advanced Error Handling & Testing

Verified against pydantic-ai==1.101.0 — source modules: pydantic_ai.exceptions, pydantic_ai.usage, pydantic_ai.concurrency, pydantic_ai.models.function, pydantic_ai.agent.

PydanticAI exposes a small, predictable exception hierarchy. Understanding each exception type — when it is raised, how to catch it, and how to test that your agent handles it correctly — is essential for production-grade agents.

Exception hierarchy

Exception
└── AgentRunError            # base for all errors that occur during agent.run*()
    ├── UsageLimitExceeded   # token / request / tool-call budget exhausted
    ├── UnexpectedModelBehavior   # model produced structurally invalid output
    └── ModelAPIError
        └── ModelHTTPError   # non-2xx HTTP from provider (has .status_code)
ModelRetry                   # not an error — a signal to retry (raised inside tools/validators)
UserError                    # programming error (bad arguments, unsupported combination)
ConcurrencyLimitExceeded     # queue depth exceeded when using ConcurrencyLimiter
ApprovalRequired             # tool call needs HITL approval before executing
FallbackExceptionGroup       # all FallbackModel candidates failed

`ModelRetry` — ask the model to try again

ModelRetry(message) is raised inside a tool or output validator to tell PydanticAI “the model’s output/args are wrong; send this feedback and retry”. It is not a real exception — it is caught by the framework before it reaches your caller.

from pydantic_ai import Agent, ModelRetry, RunContext

agent = Agent('openai:gpt-4o')

@agent.tool
async def lookup_user(ctx: RunContext[None], user_id: int) -> dict:
    if user_id <= 0:
        raise ModelRetry(f'user_id must be positive, got {user_id!r}')
    return {'id': user_id, 'name': 'Alice'}

result = agent.run_sync('Look up user -1, then user 42.')
print(result.output)

After max_retries retries, if the model keeps producing bad args, PydanticAI raises UnexpectedModelBehavior.

`ModelRetry` in output validators

from pydantic import BaseModel
from pydantic_ai import Agent, ModelRetry, RunContext

class Summary(BaseModel):
    title: str
    bullets: list[str]
    word_count: int

agent = Agent('openai:gpt-4o', output_type=Summary)

@agent.output_validator
async def validate_bullets(ctx: RunContext[None], out: Summary) -> Summary:
    if len(out.bullets) < 3:
        raise ModelRetry(
            f'Need at least 3 bullet points, got {len(out.bullets)}. '
            'Please add more detail.'
        )
    if out.word_count <= 0:
        raise ModelRetry('word_count must be positive.')
    return out

result = agent.run_sync('Summarise the Pydantic AI framework.')
print(result.output)

`UnexpectedModelBehavior` — model misbehaved

Raised when:

The model returns output that cannot be parsed or validated after exhausting all retries.
The model emits an unrecognised response structure.
A tool-call loop exceeds the retry budget.

from pydantic_ai import Agent, capture_run_messages
from pydantic_ai.exceptions import UnexpectedModelBehavior
from pydantic_ai.models.function import FunctionModel, AgentInfo
from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart

# Simulate a model that always returns invalid JSON for a structured output
def broken_model(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
    return ModelResponse(parts=[TextPart(content='THIS IS NOT VALID JSON')])

from pydantic import BaseModel

class Answer(BaseModel):
    value: int

agent = Agent(FunctionModel(broken_model), output_type=Answer)

with capture_run_messages() as msgs:
    try:
        agent.run_sync('What is 2 + 2?')
    except UnexpectedModelBehavior as e:
        print('Model misbehaved:', e)
        print('Messages up to failure:', len(msgs))

Catching `UnexpectedModelBehavior` at the call site

import logging
from pydantic_ai import Agent
from pydantic_ai.exceptions import UnexpectedModelBehavior

logger = logging.getLogger(__name__)
agent = Agent('openai:gpt-4o')

async def safe_run(prompt: str) -> str | None:
    try:
        result = await agent.run(prompt)
        return result.output
    except UnexpectedModelBehavior as e:
        logger.error('agent_misbehaved', exc_info=e, extra={'prompt': prompt[:200]})
        return None   # or raise a domain-specific error

`UsageLimitExceeded` — budget controls

UsageLimits is a dataclass you pass to agent.run*(usage_limits=...). When any limit fires, UsageLimitExceeded (a subclass of AgentRunError) is raised.

from pydantic_ai import Agent
from pydantic_ai import UsageLimits
from pydantic_ai.exceptions import UsageLimitExceeded

agent = Agent('openai:gpt-4o')

try:
    result = agent.run_sync(
        'Write a 10 000-word essay on the history of computing.',
        usage_limits=UsageLimits(
            output_tokens_limit=500,    # hard cap on output tokens
            request_limit=3,            # max LLM round-trips
        ),
    )
except UsageLimitExceeded as e:
    print('Limit hit:', e)

All `UsageLimits` fields

from pydantic_ai import UsageLimits

limits = UsageLimits(
    request_limit=50,           # max number of model requests (default 50)
    tool_calls_limit=20,        # max successful tool executions
    input_tokens_limit=8_000,   # max prompt tokens per run
    output_tokens_limit=2_000,  # max generated tokens per run
    total_tokens_limit=10_000,  # combined input + output cap
    count_tokens_before_request=True,  # pre-check tokens (Anthropic, Google, Bedrock)
)

Tracking usage after a successful run

from pydantic_ai import Agent, RunUsage

agent = Agent('openai:gpt-4o')
result = agent.run_sync('Hello')

usage: RunUsage = result.usage()
print(f'requests={usage.requests}  input={usage.input_tokens}  output={usage.output_tokens}  total={usage.total_tokens}')

Accumulating usage across multiple runs

from pydantic_ai import Agent, RunUsage

agent = Agent('openai:gpt-4o')
shared_usage = RunUsage()

for prompt in ['One', 'Two', 'Three']:
    result = agent.run_sync(prompt, usage=shared_usage)

print('Grand total tokens:', shared_usage.total_tokens)

`UserError` — programming mistakes

UserError means you (the developer) passed invalid arguments. It is raised at agent construction or at the start of a run, never mid-stream.

from pydantic_ai import Agent, RunContext
from pydantic_ai.exceptions import UserError

try:
    agent = Agent('openai:gpt-4o', output_type=int)

    @agent.output_validator
    def validate_int(ctx: RunContext[None], out: int) -> int:
        return out

    # Overriding output_type when an output_validator is already registered
    # raises UserError because the validator's type would no longer match.
    agent.run_sync('hi', output_type=str)
except UserError as e:
    print('Developer error:', e)

In tests, UserError is an expected result of misconfigured agents — assert on it rather than catching it silently.

`ConcurrencyLimitExceeded` — queue depth exceeded

When you use ConcurrencyLimiter(max_running=N, max_queued=M) and more than N + M tasks arrive simultaneously, ConcurrencyLimitExceeded is raised.

import asyncio
from pydantic_ai import Agent, ConcurrencyLimiter
from pydantic_ai.exceptions import ConcurrencyLimitExceeded

agent = Agent(
    'openai:gpt-4o',
    max_concurrency=ConcurrencyLimiter(max_running=2, max_queued=3, name='demo'),
)

async def main():
    tasks = [agent.run(f'Task {i}') for i in range(10)]
    results = await asyncio.gather(*tasks, return_exceptions=True)
    for i, r in enumerate(results):
        if isinstance(r, ConcurrencyLimitExceeded):
            print(f'Task {i}: queue full, rejected')
        else:
            print(f'Task {i}: ok')

asyncio.run(main())

`ModelHTTPError` — provider HTTP failures

from pydantic_ai import Agent
from pydantic_ai.exceptions import ModelHTTPError, AgentRunError

agent = Agent('openai:gpt-4o')

try:
    result = agent.run_sync('Hello')
except ModelHTTPError as e:
    print(f'Provider returned HTTP {e.status_code}: {e}')
    if e.status_code == 429:
        # Rate limited — implement backoff
        pass
    elif e.status_code >= 500:
        # Server error — retry or fail gracefully
        pass
except AgentRunError as e:
    print(f'Run failed: {e}')

`FallbackExceptionGroup` — multi-model fallback failures

from pydantic_ai import Agent
from pydantic_ai.exceptions import FallbackExceptionGroup
from pydantic_ai.models.fallback import FallbackModel

agent = Agent(
    FallbackModel('openai:gpt-4o', 'anthropic:claude-opus-4-5', 'google:gemini-2.0-flash')
)

try:
    result = agent.run_sync('Hello')
except FallbackExceptionGroup as eg:
    print(f'All {len(eg.exceptions)} models failed:')
    for exc in eg.exceptions:
        print(f'  {type(exc).__name__}: {exc}')

Error hooks — intercept errors without try/except

Use Hooks to handle errors at the capability level for cross-cutting concerns like rate-limit backoff:

from pydantic_ai import Agent
from pydantic_ai.capabilities import Hooks
from pydantic_ai.exceptions import ModelHTTPError
import time

retry_hooks = Hooks()

@retry_hooks.on.model_request_error
async def retry_on_rate_limit(ctx, *, request_context, error: Exception):
    if isinstance(error, ModelHTTPError) and error.status_code == 429:
        retry_after = int(getattr(error, 'retry_after', 5))
        time.sleep(retry_after)   # back off before raising so the agent can retry
    raise error

agent = Agent('openai:gpt-4o', capabilities=[retry_hooks])

Testing error paths

1. Test that `ModelRetry` triggers properly

Use FunctionModel to simulate a model that first produces bad output, then good output.

import pytest
from pydantic import BaseModel
from pydantic_ai import Agent, ModelRetry, RunContext
from pydantic_ai.models.function import FunctionModel, AgentInfo
from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart
from pydantic_ai.models.test import TestModel

class Price(BaseModel):
    amount: float
    currency: str

agent = Agent('openai:gpt-4o', output_type=Price)

@agent.output_validator
async def no_negative(ctx: RunContext[None], p: Price) -> Price:
    if p.amount < 0:
        raise ModelRetry('Price cannot be negative, please fix.')
    return p

def test_validator_retries():
    call_count = 0

    def model_fn(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
        nonlocal call_count
        call_count += 1
        if call_count == 1:
            # Simulate bad output on first attempt
            return ModelResponse(parts=[TextPart('{"amount": -5, "currency": "USD"}')])
        # Good output on retry
        return ModelResponse(parts=[TextPart('{"amount": 9.99, "currency": "USD"}')])

    with agent.override(model=FunctionModel(model_fn)):
        result = agent.run_sync('Price of widget?')

    assert result.output.amount == 9.99
    assert call_count == 2  # one retry happened

2. Assert `UsageLimitExceeded` is raised

from pydantic_ai import Agent, UsageLimits
from pydantic_ai.exceptions import UsageLimitExceeded
from pydantic_ai.models.test import TestModel

agent = Agent('openai:gpt-4o')

def test_request_limit():
    with agent.override(model=TestModel()):
        with pytest.raises(UsageLimitExceeded, match='request'):
            agent.run_sync('hi', usage_limits=UsageLimits(request_limit=0))

3. Verify `UnexpectedModelBehavior` is raised when retries are exhausted

from pydantic_ai.exceptions import UnexpectedModelBehavior
from pydantic_ai.models.test import TestModel

agent = Agent('openai:gpt-4o')

@agent.tool(retries=2)
def always_fails(ctx: RunContext[None]) -> str:
    raise ModelRetry('always broken')

def test_exhausted_retries():
    with agent.override(model=TestModel()):
        with pytest.raises(UnexpectedModelBehavior):
            agent.run_sync('go')

4. Inspect messages after a failure with `capture_run_messages`

from pydantic_ai import Agent, capture_run_messages
from pydantic_ai.exceptions import UnexpectedModelBehavior
from pydantic_ai.models.test import TestModel
from pydantic import BaseModel

class Strict(BaseModel):
    required_field: str

agent = Agent('openai:gpt-4o', output_type=Strict, retries=1)

def test_messages_on_failure():
    with capture_run_messages() as messages:
        with pytest.raises(UnexpectedModelBehavior):
            with agent.override(model=TestModel(custom_output_text='not json')):
                agent.run_sync('go')

    # Messages contain the full conversation up to the point of failure
    assert len(messages) > 0
    print('Conversation had', len(messages), 'messages before failure')

5. Inject specific exceptions via `FunctionModel`

import pytest
from pydantic_ai import Agent
from pydantic_ai.exceptions import UnexpectedModelBehavior
from pydantic_ai.models.function import FunctionModel, AgentInfo
from pydantic_ai.messages import ModelMessage, ModelResponse

def model_that_raises(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
    # Simulate provider returning an unexpected response
    return ModelResponse(parts=[])   # empty parts triggers UnexpectedModelBehavior

agent = Agent(FunctionModel(model_that_raises))

def test_empty_response():
    with pytest.raises(UnexpectedModelBehavior):
        agent.run_sync('anything')

`TestModel` — zero-LLM test double

TestModel never calls a real LLM. Configure it with canned outputs and inspect every call it received:

import pytest
from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

class Answer(BaseModel):
    value: int
    reasoning: str

agent = Agent('openai:gpt-4o', output_type=Answer)

def test_structured_output():
    with agent.override(model=TestModel()):
        result = agent.run_sync('What is 2 + 2?')
        assert isinstance(result.output, Answer)
        assert isinstance(result.output.value, int)

Configuring `TestModel` responses

from pydantic_ai.models.test import TestModel

# Return a specific text response
model = TestModel(custom_result_text='The answer is 42.')

# Return structured output as a dict
model = TestModel(custom_result_args={'value': 42, 'reasoning': 'Basic arithmetic'})

# Simulate a tool call before responding
model = TestModel(
    call_tools=['web_search'],    # tool names to call
    custom_result_text='Found it.',
)

Inspecting calls made to `TestModel`

from pydantic_ai.models.test import TestModel

model = TestModel()
agent = Agent('openai:gpt-4o', output_type=str)

with agent.override(model=model):
    result = agent.run_sync('Hello')

print(model.agent_model_requests)   # list of ModelRequest objects
print(model.agent_model_responses)  # list of ModelResponse objects

`FunctionModel` — Python function as model

from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart
from pydantic_ai.models.function import FunctionModel, AgentInfo

def my_model(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
    last_user = next(
        (p.content for m in reversed(messages)
         for p in m.parts if hasattr(p, 'content') and isinstance(p.content, str)),
        'no message'
    )
    return ModelResponse(parts=[TextPart(content=f'Echo: {last_user}')])

agent = Agent('openai:gpt-4o')

def test_echo_response():
    with agent.override(model=FunctionModel(my_model)):
        result = agent.run_sync('Hello there')
        assert result.output == 'Echo: Hello there'

Simulating tool calls in `FunctionModel`

from pydantic_ai.messages import ModelResponse, ToolCallPart
from pydantic_ai.models.function import FunctionModel, AgentInfo
import json

def tool_calling_model(messages, info: AgentInfo) -> ModelResponse:
    if len(messages) == 1:
        # First turn — call a tool
        return ModelResponse(parts=[
            ToolCallPart(
                tool_name='get_weather',
                args=json.dumps({'city': 'Paris'}),
                tool_call_id='call_1',
            )
        ])
    # Second turn — respond with text
    return ModelResponse(parts=[TextPart(content='The weather in Paris is sunny.')])

`capture_run_messages` — inspect all messages after a run

from pydantic_ai import Agent, capture_run_messages

agent = Agent('openai:gpt-4o')

with capture_run_messages() as messages:
    result = agent.run_sync('What is 2 + 2?')

for msg in messages:
    print(type(msg).__name__, '—', [type(p).__name__ for p in msg.parts])

Retry patterns

Tool with exponential backoff

import asyncio
import httpx
from pydantic_ai import Agent, ModelRetry, RunContext

agent = Agent('openai:gpt-4o')

@agent.tool(retries=4)
async def fetch_api(ctx: RunContext[None], url: str) -> str:
    delay = 2 ** ctx.retry  # 1s, 2s, 4s, 8s
    try:
        async with httpx.AsyncClient() as client:
            r = await client.get(url, timeout=10.0)
            r.raise_for_status()
            return r.text[:2000]
    except (httpx.HTTPStatusError, httpx.TimeoutException) as e:
        await asyncio.sleep(delay)
        raise ModelRetry(f'Request failed ({e}); retrying (attempt {ctx.retry + 1})')

Conditional retry based on error type

from pydantic_ai import Agent, ModelRetry, RunContext
import json

agent = Agent('openai:gpt-4o')

@agent.tool(retries=3)
def parse_structured(ctx: RunContext[None], raw: str) -> dict:
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ModelRetry(
            f'Invalid JSON at position {e.pos}: {e.msg}. '
            'Return only valid JSON, no markdown fences.'
        )
    if 'id' not in data:
        raise ModelRetry('Response JSON must include an "id" field.')
    return data

Structured error reporting

Use a Pydantic model for the error surface so callers get typed information:

from dataclasses import dataclass
from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_ai.exceptions import AgentRunError, UsageLimitExceeded, UnexpectedModelBehavior

class RunSuccess(BaseModel):
    output: str
    tokens_used: int

@dataclass
class RunFailure:
    reason: str
    kind: str

async def run_with_report(agent: Agent, prompt: str) -> RunSuccess | RunFailure:
    try:
        result = await agent.run(prompt)
        return RunSuccess(
            output=result.output,
            tokens_used=result.usage().total_tokens,
        )
    except UsageLimitExceeded as e:
        return RunFailure(reason=str(e), kind='budget_exceeded')
    except UnexpectedModelBehavior as e:
        return RunFailure(reason=str(e), kind='model_error')
    except AgentRunError as e:
        return RunFailure(reason=str(e), kind='run_error')

pytest fixtures — agents pre-wired for error testing

import pytest
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel
from pydantic_ai.models.function import FunctionModel, AgentInfo
from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart

@pytest.fixture
def deterministic_agent():
    agent = Agent('openai:gpt-4o', system_prompt='Be helpful.')
    with agent.override(model=TestModel(seed=42)):
        yield agent

@pytest.fixture
def failing_agent():
    agent = Agent('openai:gpt-4o')
    def always_empty(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
        return ModelResponse(parts=[TextPart('')])
    with agent.override(model=FunctionModel(always_empty)):
        yield agent

@pytest.fixture
def test_model():
    return TestModel(custom_result_text='mocked response')

def test_agent_returns_string(test_model):
    agent = Agent('openai:gpt-4o', output_type=str)
    with agent.override(model=test_model):
        result = agent.run_sync('Any prompt')
        assert isinstance(result.output, str)

Parametrised snapshot testing

import pytest
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

@pytest.mark.parametrize('prompt,expected', [
    ('Say hello to the world', 'Hello, world!'),
    ('Say goodbye to the world', 'Goodbye, world!'),
])
def test_snapshot(prompt, expected):
    agent = Agent('openai:gpt-4o')
    with agent.override(model=TestModel(custom_result_text=expected)):
        result = agent.run_sync(prompt)
        assert result.output == expected

Reference

Symbol	Module	Notes
`ModelRetry`	`pydantic_ai.exceptions`	Triggers a model retry from tool/validator
`UnexpectedModelBehavior`	`pydantic_ai.exceptions`	Terminal failure after retries exhausted
`AgentRunError`	`pydantic_ai.exceptions`	Base for all run-time failures
`ModelHTTPError`	`pydantic_ai.exceptions`	Non-2xx HTTP — has `.status_code`
`UsageLimitExceeded`	`pydantic_ai.exceptions`	Usage budget exceeded
`UserError`	`pydantic_ai.exceptions`	Developer configuration error
`ConcurrencyLimitExceeded`	`pydantic_ai.exceptions`	Queue depth exceeded
`FallbackExceptionGroup`	`pydantic_ai.exceptions`	All fallback candidates failed
`ApprovalRequired`	`pydantic_ai.exceptions`	HITL approval needed
`UsageLimits`	`pydantic_ai`	Budget configuration dataclass
`RunUsage`	`pydantic_ai`	Token/request counters
`TestModel`	`pydantic_ai.models.test`	Zero-LLM test double
`FunctionModel`	`pydantic_ai.models.function`	Python function as model
`capture_run_messages`	`pydantic_ai`	Collect messages from a run
`agent.override()`	`pydantic_ai.agent.Agent`	Context-manager model swap

PydanticAI: Advanced Error Handling & Testing

Advanced Error Handling & Testing

Exception hierarchy

ModelRetry — ask the model to try again

ModelRetry in output validators

UnexpectedModelBehavior — model misbehaved

Catching UnexpectedModelBehavior at the call site

UsageLimitExceeded — budget controls

All UsageLimits fields

Tracking usage after a successful run

Accumulating usage across multiple runs

UserError — programming mistakes

ConcurrencyLimitExceeded — queue depth exceeded

ModelHTTPError — provider HTTP failures

FallbackExceptionGroup — multi-model fallback failures