PydanticAI: Testing with TestModel, FunctionModel & Overrides

Verified against pydantic-ai==1.85.1 — source modules: pydantic_ai.models.test, pydantic_ai.models.function, pydantic_ai.agent.

PydanticAI ships two model implementations built for tests: TestModel (auto-generates tool calls + a response from JSON schema) and FunctionModel (you write the response-generating function). Combined with agent.override(...) and capture_run_messages, you can unit-test agents hermetically — no network, no API keys, deterministic.

from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

agent = Agent('openai:gpt-5.2', system_prompt='Be terse.')

def test_greet():
    with agent.override(model=TestModel()):
        out = agent.run_sync('Hi')
    assert isinstance(out.output, str)

agent.override(...) returns a context manager — always use with (or hold a single instance and call __enter__ / __exit__ on that same object). The override reverts on exit.

TestModel

Lives at pydantic_ai.models.test.TestModel (models/test.py:60). Given the agent’s tool schemas, it:

  1. Calls every tool once (unless you restrict with call_tools=[...]).
  2. Synthesises tool arguments that match each tool’s JSON schema.
  3. Produces a final response — a string, or args that match the output tool’s schema.

Constructor (models/test.py:94):

  • call_tools: list[str] | Literal['all'], default 'all'. Which tools to call; an empty list skips tools and goes straight to the output.
  • custom_output_text: str | None, default None. Force this string as the final text output.
  • custom_output_args: Any | None, default None. Force these args for the output tool (overrides schema generation).
  • seed: int, default 0. Seed for the schema-driven argument generator.
  • model_name / profile / settings: forwarded to the base Model.
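
For example, to skip tools entirely and pin the reply (a minimal sketch; assumes the agent allows plain-text output):

from pydantic_ai.models.test import TestModel

# call_tools=[] skips all tools; custom_output_text pins the final text
tm = TestModel(call_tools=[], custom_output_text='just the answer')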

After the run, TestModel.last_model_request_parameters holds the final ModelRequestParameters — useful for asserting tools were offered in a given shape.

from pydantic_ai import Agent, RunContext
from pydantic_ai.models.test import TestModel

agent = Agent('openai:gpt-5.2')

@agent.tool
def lookup(ctx: RunContext[None], sku: str) -> dict:
    return {'sku': sku, 'price': 9.99}

def test_model_calls_lookup():
    tm = TestModel()
    with agent.override(model=tm):
        result = agent.run_sync('Price of SKU ABC?')
    # TestModel invokes every tool; verify via the messages
    tool_names = [
        p.tool_name
        for m in result.all_messages()
        for p in m.parts
        if getattr(p, 'part_kind', None) == 'tool-call'
    ]
    assert 'lookup' in tool_names

def test_forced_output_text():
    tm = TestModel(custom_output_text='mocked reply')
    with agent.override(model=tm):
        result = agent.run_sync('ignored')
    assert result.output == 'mocked reply'

For agents with output_type=MyModel, set custom_output_args to a dict matching the schema.
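
For instance, with a hypothetical Price output model (the class and values here are illustrative):

from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

class Price(BaseModel):  # hypothetical output schema
    sku: str
    amount: float

agent = Agent('openai:gpt-5.2', output_type=Price)

def test_structured_output():
    tm = TestModel(custom_output_args={'sku': 'ABC', 'amount': 9.99})
    with agent.override(model=tm):
        result = agent.run_sync('price?')
    assert result.output == Price(sku='ABC', amount=9.99)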

FunctionModel

pydantic_ai.models.function.FunctionModel (models/function.py:45) lets you implement the model as a function. You get the full message history and metadata, and return a ModelResponse.

from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart
from pydantic_ai.models.function import AgentInfo, FunctionModel

def echo(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
    last = messages[-1].parts[-1]
    return ModelResponse(parts=[TextPart(content=f'echo: {last.content}')])

agent = Agent(FunctionModel(echo))
result = agent.run_sync('hello')
assert result.output == 'echo: hello'

AgentInfo (models/function.py:219) exposes what the agent decided to send this step:

  • function_tools: list[ToolDefinition]
  • output_tools: list[ToolDefinition]
  • allow_text_output: bool
  • model_settings: ModelSettings | None
  • model_request_parameters: ModelRequestParameters
  • instructions: str | None
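
For example, a function model can report which tools the agent offered on a given step (a minimal sketch; the function name is illustrative):

from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart
from pydantic_ai.models.function import AgentInfo, FunctionModel

def list_tools(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
    # info.function_tools holds the ToolDefinitions offered this step
    names = [t.name for t in info.function_tools]
    return ModelResponse(parts=[TextPart(content=', '.join(names) or 'no tools')])

agent = Agent(FunctionModel(list_tools))
assert agent.run_sync('x').output == 'no tools'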

Use FunctionModel when you need branching behaviour (e.g. “first call returns a tool call, second returns the final answer”):

from pydantic_ai.messages import ModelResponse, TextPart, ToolCallPart

def first_tool_then_answer(messages, info):
    # No tool call in the history yet: request one; otherwise finish
    calls = [p for m in messages for p in m.parts if isinstance(p, ToolCallPart)]
    if not calls:
        return ModelResponse(parts=[ToolCallPart(tool_name='lookup', args={'sku': 'ABC'})])
    return ModelResponse(parts=[TextPart('done')])

For streaming tests, pass stream_function= (alone or alongside function=). The stream function is an async generator yielding strings or DeltaToolCalls; see models/function.py for the exact signature if you need true streaming tests.
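
A minimal text-streaming sketch (assumes an async test runner; get_output() reflects the pydantic-ai v1 API):

from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage
from pydantic_ai.models.function import AgentInfo, FunctionModel

async def stream_text(messages: list[ModelMessage], info: AgentInfo):
    # Plain strings are treated as text deltas
    yield 'hello '
    yield 'world'

agent = Agent(FunctionModel(stream_function=stream_text))

async def test_streaming():
    async with agent.run_stream('hi') as result:
        assert await result.get_output() == 'hello world'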

Agent.override(...) — swap parts per run/test

Defined at agent/__init__.py:1639. Temporarily replaces any of:

  • model (Model, a KnownModelName, or any str)
  • deps
  • toolsets, tools
  • instructions
  • model_settings, metadata, name

Returns a context manager — everything reverts on exit. Overrides are captured in contextvars, so they are safe under asyncio concurrency (each task sees its own overrides).

with agent.override(model=TestModel(), deps=FakeDB()):
    result = agent.run_sync('query')

As a reusable pytest fixture:

import pytest
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

@pytest.fixture
def test_agent(my_agent: Agent):
    with my_agent.override(model=TestModel()):
        yield my_agent
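
Because overrides are stored in contextvars, concurrent tasks can each hold a different override without interfering; a minimal sketch:

import asyncio
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

agent = Agent('openai:gpt-5.2')

async def run_with(text: str) -> str:
    # Each task installs its own override; contextvars isolate them
    with agent.override(model=TestModel(custom_output_text=text)):
        result = await agent.run('hi')
    return result.output

async def main():
    a, b = await asyncio.gather(run_with('A'), run_with('B'))
    assert (a, b) == ('A', 'B')

asyncio.run(main())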
capture_run_messages

capture_run_messages records the full message exchange even when the run raises, so you can inspect what the model produced before the failure:

import pytest
from pydantic_ai import Agent, capture_run_messages
from pydantic_ai.exceptions import UnexpectedModelBehavior

def test_bad_output_surfaces_messages():
    agent = Agent(..., output_type=StrictSchema)
    with capture_run_messages() as msgs:
        with pytest.raises(UnexpectedModelBehavior):
            agent.run_sync('...')
    # Inspect what the model actually produced
    assert any(
        getattr(p, 'part_kind', None) == 'tool-call'
        for m in msgs
        for p in m.parts
    )

Only the first run* call inside the with is captured — don’t loop runs inside one context.

Snapshot-testing message history

Serialise the full message history with ModelMessagesTypeAdapter and snapshot it:

from pydantic_ai.messages import ModelMessagesTypeAdapter

def test_tool_flow(snapshot):
    with agent.override(model=TestModel()):
        result = agent.run_sync('go')
    snapshot.assert_match(
        ModelMessagesTypeAdapter.dump_json(result.all_messages(), indent=2),
        'tool_flow.json',
    )

Pair with syrupy or pytest-snapshot.

TestModel vs FunctionModel

| Feature | TestModel | FunctionModel |
| --- | --- | --- |
| Auto-generates args | yes (from JSON schema + seed) | no; you build the ModelResponse |
| Calls every tool | yes (unless call_tools=[...]) | only if your function emits ToolCallPart |
| Deterministic | yes | yes |
| Good for | smoke tests, end-to-end schema checks | protocol tests, multi-step scenarios, error injection |
| Streaming | yes (via TestStreamedResponse) | yes (pass stream_function=) |

Choosing an approach

| Approach | When to use |
| --- | --- |
| TestModel | You want schema-correct, deterministic behaviour. Fast. |
| FunctionModel | You need to control the exact ModelResponse. |
| Real model + pytest-vcr | Regression tests against the real provider. Slow, flaky, needs API keys when recording. |
| respx / httpx_mock + real OpenAIModel | HTTP-layer testing. Flakier than FunctionModel. |

For most unit tests, prefer TestModel or FunctionModel. For contract tests, use real models in a CI nightly job.

1. Test an agent’s contract without calling an LLM

def test_pricing_tool_is_registered():
    tm = TestModel()
    with agent.override(model=tm):
        agent.run_sync('anything')
    names = [t.name for t in tm.last_model_request_parameters.function_tools]
    assert 'pricing' in names
2. Recover from a failing tool

from pydantic_ai import ModelRetry

calls = {'n': 0}

@agent.tool
def fragile(ctx: RunContext[None]) -> str:
    calls['n'] += 1
    if calls['n'] == 1:
        raise ModelRetry('boom, try again')  # first call fails
    return 'ok'

def test_retries_then_succeeds():
    with agent.override(model=TestModel()):
        result = agent.run_sync('do it')
    assert result.output  # agent recovered past the tool error

3. Drive a multi-turn protocol with FunctionModel

from pydantic_ai.messages import ModelResponse, TextPart, ToolCallPart

def script(messages, info):
    # One 'response' message per completed model turn
    step = sum(1 for m in messages if m.kind == 'response')
    if step == 0:
        return ModelResponse(parts=[ToolCallPart('search', {'q': 'x'})])
    if step == 1:
        return ModelResponse(parts=[ToolCallPart('refine', {'doc_id': 1})])
    return ModelResponse(parts=[TextPart('final')])

4. Assert ModelRetry is triggered by an output validator

import pytest
from pydantic_ai import ModelRetry
from pydantic_ai.exceptions import UnexpectedModelBehavior

@agent.output_validator
async def must_be_uppercase(ctx, out: str) -> str:
    if out != out.upper():
        raise ModelRetry('uppercase please')
    return out

def test_validator_retries():
    # 'hello' never satisfies the validator, so retries are exhausted
    with agent.override(model=TestModel(custom_output_text='hello')):
        with pytest.raises(UnexpectedModelBehavior):
            agent.run_sync('go')

After output_retries attempts, ModelRetry bubbles up as UnexpectedModelBehavior.

5. Swap deps per test without rebuilding the agent

fake_db = FakeDB([{'sku': 'ABC', 'price': 9.99}])

with agent.override(deps=fake_db, model=TestModel()):
    result = agent.run_sync('price ABC')

Gotchas

  • TestModel synthesises tool args from each tool's JSON schema, seeded by seed (default 0). Pin the seed explicitly (e.g. seed=42) if you assert on generated args.
  • override leaks if you manually __enter__ without __exit__. Always use with.
  • Async tests: prefer await agent.run(...) inside async def test_* (see the sketch after this list); don't mix run_sync with a running event loop.
  • include_return_schema: TestModel does not honour tool return schemas unless the agent is set to include them; see IncludeReturnSchemasToolset.
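
A minimal async-test sketch for the bullet above (assumes pytest-asyncio is installed and configured):

import pytest

@pytest.mark.asyncio
async def test_async_run():
    with agent.override(model=TestModel()):
        result = await agent.run('hi')
    assert result.output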

Source references

  • TestModel: models/test.py:60
  • FunctionModel, AgentInfo, DeltaToolCall: models/function.py
  • Agent.override(...): agent/__init__.py:1639
  • capture_run_messages(): _agent_graph.py:1791
  • ModelMessagesTypeAdapter: messages.py:2034