PydanticAI — Class Deep Dives Vol. 9
Ten class groups from the pydantic_ai 1.105.0 source covering: the complete wire-format anatomy of ModelRequest and ModelResponse; the three request-side message part types (SystemPromptPart, UserPromptPart, RetryPromptPart); the call-part family (BaseToolCallPart, ToolCallPart, NativeToolCallPart) with typed-subclass promotion and streaming helpers; the return-part family (BaseToolReturnPart, ToolReturnPart, NativeToolReturnPart) with multi-modal content splitting and outcome tracking; GraphAgentState run-state internals; two new native tool types (MCPServerTool and FileSearchTool); IncludeReturnSchemasToolset for automatic tool-return-schema injection; ToolChoice + ToolOrOutput for surgical tool/output mode control; and the ServiceTier + ThinkingLevel type aliases that centralise cross-provider configuration.
1. ModelRequest + ModelResponse — Wire-Format Message Anatomy
Module: pydantic_ai.messages
Import: from pydantic_ai import ModelRequest, ModelResponse
Every conversation between PydanticAI and a model is represented as a list[ModelMessage] where ModelMessage = ModelRequest | ModelResponse. Understanding these two dataclasses in depth lets you parse, inspect, replay, or surgically modify conversation history.
Class signatures
from datetime import datetime
from typing import Literal, Sequence, Any

@dataclass(repr=False)
class ModelRequest:
    """A request generated by PydanticAI and sent to a model."""

    parts: Sequence[ModelRequestPart]
    # ModelRequestPart = SystemPromptPart | UserPromptPart | ToolReturnPart
    #                  | NativeToolReturnPart | RetryPromptPart | InstructionPart

    timestamp: datetime | None = None         # when the request was sent
    instructions: str | None = None           # rendered instruction string (for logging/debugging)
    kind: Literal['request'] = 'request'      # discriminator
    run_id: str | None = None                 # UUID7 for the agent run
    conversation_id: str | None = None        # UUID7 spanning multiple runs
    metadata: dict[str, Any] | None = None    # app-only data, never sent to LLM

    @classmethod
    def user_text_prompt(
        cls,
        user_prompt: str,
        *,
        instructions: str | None = None,
    ) -> 'ModelRequest': ...

@dataclass(repr=False)
class ModelResponse:
    """A response from a model."""

    parts: Sequence[ModelResponsePart]
    # ModelResponsePart = TextPart | ThinkingPart | ToolCallPart | NativeToolCallPart
    #                   | FilePart | CompactionPart

    usage: RequestUsage = field(default_factory=RequestUsage)
    model_name: str | None = None             # e.g. 'gpt-4o-2024-08-06'
    timestamp: datetime = field(default_factory=now_utc)
    finish_reason: FinishReason | None = None # 'stop' | 'tool_calls' | 'length' | 'content_filter' | ...
    state: ModelResponseState | None = None   # 'complete' | 'incomplete' | 'interrupted'
    kind: Literal['response'] = 'response'    # discriminator
    run_id: str | None = None
    conversation_id: str | None = None

# FinishReason type alias
# Literal['stop', 'tool_calls', 'length', 'content_filter', 'other']

# ModelResponseState type alias
# Literal['complete', 'incomplete', 'interrupted']

Field reference
| Field | Type | Notes |
|---|---|---|
| ModelRequest.parts | Sequence[ModelRequestPart] | Ordered mix of system prompts, user prompts, tool returns, retries |
| ModelRequest.instructions | str \| None | Rendered instruction string stored for debugging/logging only |
| ModelRequest.run_id | str \| None | UUID7 shared with RunContext.run_id and OTel span |
| ModelRequest.conversation_id | str \| None | UUID7 that spans multi-run conversations |
| ModelRequest.metadata | dict[str, Any] \| None | App-only; never sent to LLM |
| ModelResponse.finish_reason | FinishReason \| None | 'stop' = clean end; 'tool_calls' = pending calls; 'length' = token limit hit; 'content_filter' = blocked |
| ModelResponse.state | ModelResponseState \| None | 'complete' = all parts arrived; 'incomplete' = interrupted mid-stream; 'interrupted' = cancelled |
| ModelResponse.usage | RequestUsage | Per-request tokens (input, output, cache hit/miss, audio, details) |
| ModelResponse.model_name | str \| None | Resolved model name as reported by the provider |
Parsing message history
import asyncio
from pydantic_ai import Agent, capture_run_messages
from pydantic_ai.messages import (
    ModelRequest,
    ModelResponse,
    UserPromptPart,
    TextPart,
    ToolCallPart,
    ToolReturnPart,
)

agent = Agent('openai:gpt-4o-mini')

async def inspect_history():
    with capture_run_messages() as messages:
        result = await agent.run("What is 2 + 2?")

    for msg in messages:
        if isinstance(msg, ModelRequest):
            print(f"REQUEST run_id={msg.run_id} conv={msg.conversation_id}")
            for part in msg.parts:
                if isinstance(part, UserPromptPart):
                    print(f"  user: {part.content!r}")
        elif isinstance(msg, ModelResponse):
            print(f"RESPONSE finish={msg.finish_reason} model={msg.model_name}")
            for part in msg.parts:
                if isinstance(part, TextPart):
                    print(f"  text: {part.content!r}")
                elif isinstance(part, ToolCallPart):
                    print(f"  tool_call: {part.tool_name}({part.args_as_json_str()})")

asyncio.run(inspect_history())

Replaying a request with modified metadata
from dataclasses import replace
from pydantic_ai.messages import ModelRequest, UserPromptPart

# Load from storage (load_messages_from_db is a placeholder for your own persistence layer)
raw = load_messages_from_db(conversation_id="abc-123")

# Find the last request and add metadata for replay.
# The last request is usually not the last message (a response follows it),
# so swap it in at its own index instead of overwriting raw[-1].
idx = max(i for i, m in enumerate(raw) if isinstance(m, ModelRequest))
tagged = replace(raw[idx], metadata={"replay": True, "analyst": "claude"})

result = await agent.run(
    "Continue",
    message_history=raw[:idx] + [tagged] + raw[idx + 1:],
)

Serialisation with ModelMessagesTypeAdapter
import json
from pydantic_ai import ModelMessagesTypeAdapter
from pydantic_ai.messages import ModelRequest, ModelResponse

# Serialise
messages: list[ModelRequest | ModelResponse] = ...
blob = ModelMessagesTypeAdapter.dump_json(messages)

# Deserialise (discriminated union on `kind` field)
restored = ModelMessagesTypeAdapter.validate_json(blob)

# To a Python list for storage
records = ModelMessagesTypeAdapter.dump_python(messages, mode='json')

2. SystemPromptPart + UserPromptPart + RetryPromptPart — Request-Side Message Parts
Module: pydantic_ai.messages
Import: from pydantic_ai.messages import SystemPromptPart, UserPromptPart, RetryPromptPart
These three dataclasses represent the parts that can appear inside a ModelRequest. Each has a part_kind literal discriminator for serialisation.
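To illustrate what a literal discriminator buys you during deserialisation, here is a minimal stdlib sketch. The `FakeSystemPromptPart` and `FakeUserPromptPart` classes and the `load_part` helper are hypothetical stand-ins invented for this example; they are not the pydantic_ai classes, and the real library delegates this dispatch to Pydantic rather than a hand-written lookup.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical stand-ins, reduced to two fields each (not the pydantic_ai classes).
@dataclass
class FakeSystemPromptPart:
    content: str
    part_kind: str = "system-prompt"

@dataclass
class FakeUserPromptPart:
    content: str
    part_kind: str = "user-prompt"

# Map each discriminator value to the class that should be rebuilt from it.
PART_CLASSES = {
    "system-prompt": FakeSystemPromptPart,
    "user-prompt": FakeUserPromptPart,
}

def load_part(raw: dict):
    """Pick the class from the part_kind tag, then construct it from the payload."""
    cls = PART_CLASSES[raw["part_kind"]]
    return cls(**raw)

parts = [FakeSystemPromptPart("Be concise."), FakeUserPromptPart("Hello")]
blob = json.dumps([asdict(p) for p in parts])          # serialise to JSON
restored = [load_part(raw) for raw in json.loads(blob)]  # rebuild by tag
print(restored == parts)
```

Because the tag travels with the payload, a heterogeneous list of parts can be stored as plain JSON and rebuilt without guessing types from field shapes.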
Class signatures
@dataclass(repr=False)
class SystemPromptPart:
    content: str
    timestamp: datetime = field(default_factory=now_utc)
    dynamic_ref: str | None = None  # set when generated by @agent.system_prompt
    part_kind: Literal['system-prompt'] = 'system-prompt'

@dataclass(repr=False)
class UserPromptPart:
    content: str | Sequence[UserContent]
    # UserContent = str | TextContent | ImageUrl | AudioUrl | VideoUrl
    #             | DocumentUrl | BinaryContent | UploadedFile | CachePoint
    timestamp: datetime = field(default_factory=now_utc)
    part_kind: Literal['user-prompt'] = 'user-prompt'

@dataclass(repr=False)
class RetryPromptPart:
    content: list[pydantic_core.ErrorDetails] | str
    tool_name: str | None = None  # None for output validator retries
    tool_call_id: str = field(default_factory=generate_tool_call_id)
    timestamp: datetime = field(default_factory=now_utc)
    part_kind: Literal['retry-prompt'] = 'retry-prompt'

    def model_response(self) -> str: ...
    # Returns formatted error feedback string sent to the model

SystemPromptPart — when to construct manually
SystemPromptPart is normally created by the framework. You construct it manually when building ModelRequest objects for replay, testing, or direct API calls:

from pydantic_ai.messages import SystemPromptPart, UserPromptPart, ModelRequest

request = ModelRequest(
    parts=[
        SystemPromptPart(content="You are a helpful assistant."),
        UserPromptPart(content="Summarise this document."),
    ]
)

The dynamic_ref field is populated by the framework when a @agent.system_prompt function generates the part, allowing the OTel layer to attribute spans back to the source function.
UserPromptPart — multi-modal content
content accepts a heterogeneous sequence of UserContent items, enabling rich multi-modal prompts:

from pydantic_ai.messages import UserPromptPart
from pydantic_ai import ImageUrl, BinaryContent, TextContent, CachePoint

# Plain text
part = UserPromptPart(content="Describe this image:")

# Multi-modal: text + image URL
part = UserPromptPart(content=[
    "Describe this image:",
    ImageUrl(url="https://example.com/chart.png"),
])

# Multi-modal: text + raw image bytes read from disk
with open("chart.png", "rb") as f:
    data = f.read()
part = UserPromptPart(content=[
    TextContent(content="Analyse this chart:"),
    BinaryContent(data=data, media_type="image/png"),
])

# With a cache-point marker for Anthropic prompt caching
part = UserPromptPart(content=[
    "Long static context...",
    CachePoint(),  # Insert cache boundary here
    "Dynamic question?",
])

RetryPromptPart — how retry feedback works
RetryPromptPart is generated by PydanticAI whenever validation fails or a ModelRetry exception is raised. The model_response() method formats the error into the string that is sent back to the model:

from pydantic_ai.messages import RetryPromptPart
import pydantic_core

# From a ModelRetry exception (string content)
retry = RetryPromptPart(
    content="The city name must be a real city. 'Faketown' is not valid.",
    tool_name="get_weather",
    tool_call_id="call_abc123",
)
print(retry.model_response())
# "The city name must be a real city. 'Faketown' is not valid.\n\nFix the errors and try again."

# From a Pydantic ValidationError (list[ErrorDetails] content)
retry = RetryPromptPart(
    content=[
        {
            "type": "missing",
            "loc": ("city",),
            "msg": "Field required",
            "input": {},
        }
    ],
    tool_name="get_weather",
)
print(retry.model_response())
# "1 validation error:\n```json\n[...]\n```\n\nFix the errors and try again."

# Output validator retry (tool_name=None strips redundant input from error details)
output_retry = RetryPromptPart(
    content="Output validation failed: value must be positive",
)

Inspecting retries in captured messages
from pydantic_ai import capture_run_messages
from pydantic_ai.messages import RetryPromptPart, ModelRequest

with capture_run_messages() as messages:
    result = await agent.run("bad input")

# Collect every retry prompt that was sent back to the model during the run
retries = [
    part
    for msg in messages
    if isinstance(msg, ModelRequest)
    for part in msg.parts
    if isinstance(part, RetryPromptPart)
]
for r in retries:
    print(f"Tool: {r.tool_name!r}  Feedback: {r.model_response()[:80]}")

3. BaseToolCallPart + ToolCallPart + NativeToolCallPart — Call-Part Family
Module: pydantic_ai.messages
Import: from pydantic_ai.messages import BaseToolCallPart, ToolCallPart, NativeToolCallPart
When a model decides to call a tool it generates a ToolCallPart (for function tools) or NativeToolCallPart (for native tools such as web search). Both extend BaseToolCallPart which holds the shared fields.
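Since the shared `args` field may arrive as either a JSON string or an already-parsed dict, consumers generally want one normalised view. The sketch below shows that idea with stdlib code only; `args_to_dict` and `args_to_json` are hypothetical helper names, and this is an illustration of the concept rather than the library's own `args_as_dict()` / `args_as_json_str()` implementation.

```python
import json
from typing import Any, Dict, Union

# The same three shapes the args field can take: JSON text, a dict, or nothing.
ArgsType = Union[str, Dict[str, Any], None]

def args_to_dict(args: ArgsType) -> Dict[str, Any]:
    """Return tool arguments as a dict regardless of how they arrived."""
    if not args:
        return {}
    if isinstance(args, dict):
        return args
    parsed = json.loads(args)
    if not isinstance(parsed, dict):
        raise ValueError("tool arguments must decode to a JSON object")
    return parsed

def args_to_json(args: ArgsType) -> str:
    """Return tool arguments as a JSON string regardless of how they arrived."""
    if not args:
        return "{}"
    if isinstance(args, str):
        return args
    return json.dumps(args)

print(args_to_dict('{"city": "Paris"}'))
print(args_to_json({"city": "Paris"}))
```

Normalising at the point of use keeps logging and replay code independent of which provider produced the call.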
Class signatures
@dataclass(repr=False)
class BaseToolCallPart:
    """Base class for all tool-call parts."""

    tool_name: str
    args: str | dict[str, Any] | None = None  # JSON string OR dict, depending on provider
    tool_call_id: str = field(default_factory=generate_tool_call_id)

    # Provider round-trip fields (only populated for native tools)
    tool_kind: ToolPartKind | None = None          # discriminator for typed subclasses
    id: str | None = None                          # provider-specific call ID (e.g. OpenAI Responses)
    provider_name: str | None = None               # required when id/provider_details is set
    provider_details: dict[str, Any] | None = None

    def args_as_dict(self, *, raise_if_invalid: bool = False) -> dict[str, Any]: ...
    def args_as_json_str(self) -> str: ...
    def has_content(self) -> bool: ...

@dataclass(repr=False)
class ToolCallPart(BaseToolCallPart):
    """A call to a user-defined function tool."""
    part_kind: Literal['tool-call'] = 'tool-call'

    @staticmethod
    def narrow_type(
        part: 'ToolCallPart',
        *,
        tool_kind: ToolPartKind | None = None,
    ) -> 'ToolCallPart': ...

@dataclass(repr=False)
class NativeToolCallPart(BaseToolCallPart):
    """A call to a native model tool (web search, code execution, etc.)."""
    part_kind: Literal['builtin-tool-call'] = 'builtin-tool-call'

    @staticmethod
    def narrow_type(
        part: 'NativeToolCallPart',
        *,
        tool_kind: ToolPartKind | None = None,
    ) -> 'NativeToolCallPart': ...

Reading tool call arguments
from pydantic_ai import capture_run_messages, Agent
from pydantic_ai.messages import ModelResponse, ToolCallPart, NativeToolCallPart

agent = Agent('openai:gpt-4o-mini', tools=[my_tool])

with capture_run_messages() as messages:
    await agent.run("Use the tool please")

for msg in messages:
    if isinstance(msg, ModelResponse):
        for part in msg.parts:
            if isinstance(part, ToolCallPart):
                # args may be a JSON string or a dict depending on provider
                args_dict = part.args_as_dict()      # always a dict
                args_json = part.args_as_json_str()  # always a JSON string
                print(f"{part.tool_name}({args_dict}) id={part.tool_call_id}")
            elif isinstance(part, NativeToolCallPart):
                print(f"native:{part.tool_name} provider={part.provider_name}")

Typed subclass promotion with narrow_type
For native tools with a stable cross-provider schema (currently tool_search), NativeToolCallPart can be promoted to a typed subclass whose args is a narrowed TypedDict:

from pydantic_ai.messages import NativeToolCallPart

raw_part: NativeToolCallPart = ...  # from a tool_search native call

# Automatic promotion happens during Pydantic deserialisation.
# For manual construction or testing, use narrow_type():
narrowed = NativeToolCallPart.narrow_type(raw_part, tool_kind='tool-search')
# narrowed is now a NativeToolSearchCallPart with typed .args TypedDict

# If tool_kind is omitted, the value already stored on the part is used:
narrowed = NativeToolCallPart.narrow_type(raw_part)  # uses part.tool_kind

Streaming delta accumulation with ToolCallPartDelta
During streaming, call arguments arrive incrementally as ToolCallPartDelta objects. When args_delta is a string fragment, you accumulate it by appending to the args string:

from pydantic_ai.messages import ToolCallPartDelta

# Accumulate streaming deltas
accumulated_args = ""
async for event in agent.run_stream_events("call the tool"):
    if hasattr(event, 'delta') and isinstance(event.delta, ToolCallPartDelta):
        # Only string fragments can be concatenated; skip empty or non-string deltas
        if isinstance(event.delta.args_delta, str):
            accumulated_args += event.delta.args_delta

args_as_dict error handling
from pydantic_ai.messages import ToolCallPart

# Default: graceful on malformed JSON
part = ToolCallPart(tool_name="my_tool", args='{"broken": ')
safe_dict = part.args_as_dict()
# Returns: {'INVALID_JSON': '{"broken": '} — safe to pass to a model retry

# Strict: re-raises ValueError on malformed JSON
try:
    strict_dict = part.args_as_dict(raise_if_invalid=True)
except ValueError:
    # handle truncated tool call (e.g. token limit hit)
    ...

4. BaseToolReturnPart + ToolReturnPart + NativeToolReturnPart — Return-Part Family
Module: pydantic_ai.messages
Import: from pydantic_ai.messages import BaseToolReturnPart, ToolReturnPart, NativeToolReturnPart
After a tool executes, its result is wrapped in a ToolReturnPart (function tools) or NativeToolReturnPart (native tools) and placed in the next ModelRequest. Both extend BaseToolReturnPart which has the rich content-handling logic.
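As a rough picture of what separating file-like items from scalar data can look like, the following stdlib sketch partitions a mixed return value into text items and file items. `FakeBinary` and `split_content` are hypothetical names made up for this illustration; the actual accessors on BaseToolReturnPart may apply different serialisation rules.

```python
import json
from dataclasses import dataclass
from typing import Any, List, Tuple

@dataclass
class FakeBinary:
    """Hypothetical stand-in for a file-like item such as BinaryContent."""
    data: bytes
    media_type: str

def split_content(content: Any) -> Tuple[List[str], List[FakeBinary]]:
    """Partition tool-return content into text items and file items."""
    items = content if isinstance(content, list) else [content]
    texts: List[str] = []
    files: List[FakeBinary] = []
    for item in items:
        if isinstance(item, FakeBinary):
            files.append(item)              # files pass through unchanged
        elif isinstance(item, str):
            texts.append(item)              # strings are already text
        else:
            texts.append(json.dumps(item))  # other scalars/dicts become JSON text
    return texts, files

texts, files = split_content(
    ["Here is the chart:", {"score": 0.95}, FakeBinary(b"\x89PNG", "image/png")]
)
print(texts)
print(len(files))
```

Keeping the two groups apart lets the text portion go into the tool result while binary items are attached in whatever form a given provider accepts.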
Class signatures
@dataclass(repr=False)
class BaseToolReturnPart:
    """Base class for all tool-return parts."""

    tool_name: str
    content: ToolReturnContent  # str | dict | list | MultiModalContent | ...
    tool_call_id: str = field(default_factory=generate_tool_call_id)
    tool_kind: ToolPartKind | None = None
    metadata: Any = None  # app-only; never sent to LLM
    timestamp: datetime = field(default_factory=now_utc)
    outcome: Literal['success', 'failed', 'denied'] = 'success'

    # Content accessors
    def model_response_str(self) -> str: ...
    def model_response_object(self) -> dict[str, Any]: ...
    def content_items(
        self, *, mode: Literal['raw', 'str', 'jsonable'] = 'raw'
    ) -> list: ...

    @property
    def files(self) -> list[MultiModalContent]: ...

@dataclass(repr=False)
class ToolReturnPart(BaseToolReturnPart):
    """Result from a user-defined function tool."""
    part_kind: Literal['tool-return'] = 'tool-return'

    @staticmethod
    def narrow_type(
        part: 'ToolReturnPart',
        *,
        tool_kind: ToolPartKind | None = None,
    ) -> 'ToolReturnPart': ...

@dataclass(repr=False)
class NativeToolReturnPart(BaseToolReturnPart):
    """Result from a native model tool."""
    provider_name: str | None = None
    provider_details: dict[str, Any] | None = None
    part_kind: Literal['builtin-tool-return'] = 'builtin-tool-return'

    @staticmethod
    def narrow_type(
        part: 'NativeToolReturnPart',
        *,
        tool_kind: ToolPartKind | None = None,
    ) -> 'NativeToolReturnPart': ...

outcome field — tracking approval and failures
from pydantic_ai import capture_run_messages
from pydantic_ai.messages import ModelRequest, ToolReturnPart

# Normal success
part = ToolReturnPart(tool_name="search", content="Results...")
assert part.outcome == 'success'

# Denied by HITL approval
denied = ToolReturnPart(
    tool_name="delete_file",
    content="Tool call denied by operator.",
    outcome='denied',
)

# Failed execution
failed = ToolReturnPart(
    tool_name="execute_sql",
    content="ERROR: table 'users' does not exist",
    outcome='failed',
)

# Inspect from history
with capture_run_messages() as messages:
    result = await agent.run("do something dangerous")

denials = [
    part
    for msg in messages
    if isinstance(msg, ModelRequest)
    for part in msg.parts
    if isinstance(part, ToolReturnPart) and part.outcome == 'denied'
]

Multi-modal tool returns
Tools can return images, audio, or documents alongside text. The BaseToolReturnPart family handles splitting multi-modal content from scalar data:

from pydantic_ai import Agent, RunContext, capture_run_messages
from pydantic_ai.messages import BinaryContent, ModelRequest, ToolReturnPart

agent = Agent('anthropic:claude-opus-4-5')

@agent.tool
async def generate_chart(ctx: RunContext[None], data: list[float]) -> list:
    # Return both a description and the chart image
    chart_bytes = create_chart(data)  # create_chart is a placeholder for your own plotting code
    return [
        "Here is the chart:",
        BinaryContent(data=chart_bytes, media_type="image/png"),
    ]

# Inspect the return part files after the run
with capture_run_messages() as messages:
    result = await agent.run("Plot [1, 2, 3, 4, 5]")

for msg in messages:
    if isinstance(msg, ModelRequest):
        for part in msg.parts:
            if isinstance(part, ToolReturnPart):
                print(f"text: {part.model_response_str()!r}")
                print(f"files: {len(part.files)} file(s)")

content_items for fine-grained serialisation
from pydantic_ai.messages import BinaryContent, ToolReturnPart

part = ToolReturnPart(
    tool_name="analyze",
    content=[{"score": 0.95}, BinaryContent(data=b"...", media_type="image/png")],
)

# Raw items (no serialisation)
raw = part.content_items(mode='raw')

# Serialise non-file items to strings; pass BinaryContent through unchanged
str_items = part.content_items(mode='str')

# Serialise non-file items to JSON-compatible Python objects
json_items = part.content_items(mode='jsonable')

NativeToolReturnPart.provider_details — round-trip data
Native tools like web search may embed provider_details that must be sent back to the same provider on the next turn. PydanticAI handles this automatically; you only need to be aware of it when building custom providers:

from pydantic_ai.messages import NativeToolReturnPart

# Constructed by model implementations — provider_name is mandatory when provider_details is set
return_part = NativeToolReturnPart(
    tool_name="web_search",
    content="Search result text...",
    provider_name="anthropic",
    provider_details={"search_result_id": "srq_123", "cache_control": {"type": "ephemeral"}},
    tool_kind="tool-search",
)

5. GraphAgentState — Agent Run State Internals
Module: pydantic_ai._agent_graph
Import (internal): from pydantic_ai._agent_graph import GraphAgentState
GraphAgentState is the mutable state dataclass that the pydantic_graph runtime threads through UserPromptNode → ModelRequestNode → CallToolsNode on every step. Understanding it is key for building custom graph runners or interpreting low-level diagnostics.
Class signature
@dataclasses.dataclass(kw_only=True)
class GraphAgentState:
    """State kept across the execution of the agent graph."""

    message_history: list[ModelMessage] = field(default_factory=list)
    usage: RunUsage = field(default_factory=RunUsage)
    output_retries_used: int = 0
    run_step: int = 0
    run_id: str = field(default_factory=lambda: str(uuid7()))
    conversation_id: str = field(default_factory=lambda: str(uuid7()))
    metadata: dict[str, Any] | None = None
    last_max_tokens: int | None = None
    last_model_request_parameters: ModelRequestParameters | None = None
    pending_messages: list[PendingMessage] = field(default_factory=list)

    def check_incomplete_tool_call(self) -> None: ...
    def consume_output_retry(
        self,
        max_output_retries: int,
        error: BaseException | None = None,
    ) -> None: ...

Field reference
| Field | Purpose |
|---|---|
| message_history | Accumulated ModelRequest/ModelResponse list for the current run |
| usage | Aggregated RunUsage summed across all model calls so far |
| output_retries_used | Counter of output validator retries; checked against max_output_retries |
| run_step | Incremented on each ModelRequestNode execution; useful for observability |
| run_id | UUID7 for the current agent run; matches RunContext.run_id |
| conversation_id | UUID7 spanning multi-run conversations; matches RunContext.conversation_id |
| metadata | App-level metadata dict threaded through the run |
| last_max_tokens | Stored to produce accurate token-limit exceeded error messages |
| last_model_request_parameters | Last ModelRequestParameters for OTel span attributes |
| pending_messages | Internal queue for RunContext.enqueue() / AgentRun.enqueue() |
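The relationship between the output_retries_used counter and max_output_retries can be sketched with a small stand-in class. `FakeRunState` and `RetryBudgetExceeded` are hypothetical names for this example, and the increment-then-compare order is an assumption chosen so that a budget of 3 fails on the fourth attempt; the real GraphAgentState raises a library exception and may differ in detail.

```python
from dataclasses import dataclass

class RetryBudgetExceeded(RuntimeError):
    """Hypothetical stand-in for the error raised when the budget is spent."""

@dataclass
class FakeRunState:
    output_retries_used: int = 0

    def consume_output_retry(self, max_output_retries: int) -> None:
        # Assumed order: count the attempt first, then compare with the budget.
        self.output_retries_used += 1
        if self.output_retries_used > max_output_retries:
            raise RetryBudgetExceeded(
                f"Exceeded maximum output retries ({max_output_retries})"
            )

state = FakeRunState()
for _ in range(3):
    state.consume_output_retry(max_output_retries=3)  # within budget

try:
    state.consume_output_retry(max_output_retries=3)  # fourth attempt
    exceeded = False
except RetryBudgetExceeded:
    exceeded = True

print(exceeded, state.output_retries_used)
```

Holding the counter on the run state rather than on an individual node is what lets the budget persist across graph steps.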
Accessing state from AgentRun.iter()
import asyncio
from pydantic_ai import Agent
from pydantic_ai._agent_graph import GraphAgentState

agent = Agent('openai:gpt-4o-mini')

async def track_state():
    async with agent.iter("What is the capital of France?") as run:
        async for node in run:
            # Access state via the graph run context
            state: GraphAgentState = run.ctx.state
            print(
                f"step={state.run_step} "
                f"msgs={len(state.message_history)} "
                f"tokens_so_far={state.usage.total_tokens}"
            )
        print(f"Final run_id: {state.run_id}")
        print(f"Conversation ID: {state.conversation_id}")

asyncio.run(track_state())

check_incomplete_tool_call() — detecting token-limit truncation
The framework calls this automatically, but you can call it yourself when inspecting saved state:

from pydantic_ai._agent_graph import GraphAgentState
from pydantic_ai.exceptions import IncompleteToolCall

# Load a saved state snapshot (load_state_snapshot is a placeholder for your own storage code)
state = load_state_snapshot()

try:
    state.check_incomplete_tool_call()
except IncompleteToolCall as e:
    # Last model response was truncated mid-tool-call JSON
    print(f"Truncated tool call detected: {e}")
    # Increase max_tokens and re-run, or simplify the prompt

consume_output_retry() — retry budget enforcement
from pydantic_ai._agent_graph import GraphAgentState
from pydantic_ai.exceptions import UnexpectedModelBehavior

state = GraphAgentState()

# Simulates what CallToolsNode does when output validation fails
try:
    state.consume_output_retry(max_output_retries=3)
    state.consume_output_retry(max_output_retries=3)
    state.consume_output_retry(max_output_retries=3)
    state.consume_output_retry(max_output_retries=3)  # raises on the 4th call
except UnexpectedModelBehavior:
    print("Exceeded 3 output retries — abort")

6. MCPServerTool — Native MCP Server Integration
Module: pydantic_ai.native_tools
Import: from pydantic_ai import MCPServerTool
MCPServerTool is a native tool that tells a model to connect to an MCP server at the network level, offloading tool discovery and invocation entirely to the provider. This is distinct from MCPToolset (which manages MCP tool calls inside PydanticAI); MCPServerTool delegates execution directly to the provider’s native MCP support.
Class signature
@dataclass(kw_only=True)
class MCPServerTool(AbstractNativeTool):
    """A native tool that allows your agent to use MCP servers.

    Supported by: OpenAI Responses, Anthropic, xAI
    """

    id: str                                  # unique identifier for this server
    url: str                                 # MCP server URL
    authorization_token: str | None = None   # Bearer token for auth
    description: str | None = None           # server description for the model
    allowed_tools: list[str] | None = None   # restrict which MCP tools are exposed
    headers: dict[str, str] | None = None    # custom HTTP headers

    kind: str = 'mcp_server'

    @property
    def unique_id(self) -> str:
        return f'mcp_server:{self.id}'

    @property
    def label(self) -> str:
        return f'MCP: {self.id}'

Provider support matrix
| Feature | OpenAI Responses | Anthropic | xAI |
|---|---|---|---|
| url | ✓ | ✓ | ✓ |
| authorization_token | ✓ | ✓ | ✓ |
| description | ✓ | — | ✓ |
| allowed_tools | ✓ | ✓ | ✓ |
| headers | ✓ | — | ✓ |
| OpenAI connector_id | via url prefix | — | — |
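One way to put the matrix to work is a pre-flight check that reports which configured fields a given provider would not honour. The sketch below encodes the table as a plain dict; the provider keys and the `unsupported_fields` helper are hypothetical names for this example, and nothing here reflects how PydanticAI itself handles unsupported fields.

```python
from typing import Any, Dict, List

# Field support copied from the matrix above; the dict keys are illustrative labels.
SUPPORTED_FIELDS = {
    "openai-responses": {"url", "authorization_token", "description", "allowed_tools", "headers"},
    "anthropic": {"url", "authorization_token", "allowed_tools"},
    "xai": {"url", "authorization_token", "description", "allowed_tools", "headers"},
}

def unsupported_fields(provider: str, config: Dict[str, Any]) -> List[str]:
    """Return the configured (non-None) fields the provider does not support."""
    supported = SUPPORTED_FIELDS[provider]
    return sorted(
        name for name, value in config.items()
        if value is not None and name not in supported
    )

config = {
    "url": "https://my-mcp-server.example.com/mcp",
    "description": "Internal database MCP server",
    "headers": {"X-Tenant-ID": "acme-corp"},
    "allowed_tools": None,
}
print(unsupported_fields("anthropic", config))
print(unsupported_fields("xai", config))
```

Running a check like this at start-up surfaces a provider mismatch before a request is sent.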
Basic usage
from pydantic_ai import Agent, MCPServerTool
from pydantic_ai.capabilities import NativeTool

agent = Agent(
    'openai:gpt-4o',
    capabilities=[
        NativeTool(
            MCPServerTool(
                id="my-db-mcp",
                url="https://my-mcp-server.example.com/mcp",
                authorization_token="Bearer sk-...",
                allowed_tools=["query_database", "list_tables"],
                description="Internal database MCP server",
            )
        )
    ],
)

result = agent.run_sync("List all tables in the database")
print(result.output)  # run results expose the final value as .output

Multiple MCP servers
from pydantic_ai import Agent, MCPServerTool
from pydantic_ai.capabilities import NativeTool

agent = Agent(
    'anthropic:claude-opus-4-5',
    capabilities=[
        NativeTool(
            MCPServerTool(
                id="search-mcp",
                url="https://search.example.com/mcp",
                authorization_token="sk-search-token",
            )
        ),
        NativeTool(
            MCPServerTool(
                id="calendar-mcp",
                url="https://calendar.example.com/mcp",
                authorization_token="sk-calendar-token",
                allowed_tools=["list_events", "create_event"],
            )
        ),
    ],
)

OpenAI connector ID pattern
OpenAI Responses supports managed MCP servers via connector IDs. Pass the connector ID with the x-openai-connector: prefix:

MCPServerTool(
    id="openai-managed-server",
    url="x-openai-connector:<your_connector_id>",
    allowed_tools=["search", "summarize"],
)

Custom headers for enterprise auth
MCPServerTool(
    id="enterprise-mcp",
    url="https://internal.corp.com/mcp",
    headers={
        "X-Tenant-ID": "acme-corp",
        "X-Service-Account": "pydantic-ai-agent",
    },
    authorization_token="Bearer <service-account-token>",
)

Comparing MCPServerTool vs MCPToolset
| Aspect | MCPServerTool | MCPToolset |
|---|---|---|
| Execution location | Provider’s infrastructure | Your Python process |
| Tool discovery | Provider handles it | PydanticAI fetches tool list at startup |
| Supported providers | OpenAI, Anthropic, xAI | All (provider-agnostic) |
| Observability | Via provider logs | Full PydanticAI OTel traces |
| HITL / approval | Provider-only | ApprovalRequiredToolset wrapper |
| Transport | Provider-managed | SSE, HTTP, stdio (configurable) |
7. FileSearchTool — Native RAG File Search
Module: pydantic_ai.native_tools
Import: from pydantic_ai import FileSearchTool
FileSearchTool gives the model access to a fully managed vector-search RAG system backed by the provider’s file storage infrastructure. It handles chunking, embedding generation, and context injection, requiring only file store IDs from you.
Class signature
@dataclass(kw_only=True)
class FileSearchTool(AbstractNativeTool):
    """A native tool that allows your agent to search through uploaded files.

    Supported by: OpenAI Responses, Google (Gemini), xAI
    """

    file_store_ids: Sequence[str]
    # OpenAI: vector store IDs created via OpenAI API
    # Google: file search store names from Gemini Files API
    # xAI: collection IDs for xAI collections search

    kind: str = 'file_search'

Provider-specific file store setup
OpenAI vector stores:

from openai import OpenAI

client = OpenAI()

# 1. Create a vector store
store = client.vector_stores.create(name="product-docs")

# 2. Upload files
with open("manual.pdf", "rb") as f:
    client.vector_stores.file_batches.upload_and_poll(
        vector_store_id=store.id,
        files=[("manual.pdf", f, "application/pdf")],
    )

Google Gemini Files API:

import google.generativeai as genai

genai.configure(api_key="...")
file_ref = genai.upload_file("docs/guide.pdf")
store_name = file_ref.name  # e.g. "files/abc123"

Using FileSearchTool with an agent
import asyncio
from pydantic_ai import Agent, FileSearchTool
from pydantic_ai.capabilities import NativeTool

VECTOR_STORE_ID = "vs_abc123"

agent = Agent(
    'openai:gpt-4o',
    capabilities=[
        NativeTool(
            FileSearchTool(file_store_ids=[VECTOR_STORE_ID])
        )
    ],
)

async def main():
    result = await agent.run(
        "What does the product manual say about warranty coverage?"
    )
    print(result.output)  # run results expose the final value as .output

asyncio.run(main())

Multiple vector stores
agent = Agent(
    'openai:gpt-4o',
    capabilities=[
        NativeTool(
            FileSearchTool(
                file_store_ids=[
                    "vs_product_docs",
                    "vs_support_tickets",
                    "vs_legal_contracts",
                ]
            )
        )
    ],
)

Google Gemini integration

agent = Agent(
    'google-gla:gemini-2.0-flash',
    capabilities=[
        NativeTool(
            FileSearchTool(
                file_store_ids=["files/abc123", "files/def456"]
            )
        )
    ],
)

xAI collections search

agent = Agent(
    'xai:grok-3',
    capabilities=[
        NativeTool(
            FileSearchTool(
                file_store_ids=["collection_id_1", "collection_id_2"]
            )
        )
    ],
)

Comparing FileSearchTool vs custom FunctionToolset RAG
| Aspect | FileSearchTool | Custom FunctionToolset RAG |
|---|---|---|
| Infrastructure | Provider-managed | Your embedding DB + retrieval code |
| Cross-provider | OpenAI / Google / xAI only | Any model |
| Chunking strategy | Provider default | Fully configurable |
| Re-ranking | Provider-managed | Configurable |
| Latency | Provider-optimised | Your infrastructure |
| Cost transparency | Provider billing | Your embedding costs |
8. IncludeReturnSchemasToolset — Auto Return-Schema Injection
Module: pydantic_ai.toolsets.include_return_schemas
Import: from pydantic_ai import IncludeReturnSchemasToolset
IncludeReturnSchemasToolset is a PreparedToolset subclass that sets include_return_schema=True on every ToolDefinition that doesn’t already have an explicit return schema setting. With the flag set, the tool’s return-type JSON schema is included in the definition sent to the model, so the model knows the shape of the result before it calls the tool. This is useful for models that can take advantage of structured tool outputs and for improving type-safety in multi-step pipelines.
Class signature
```python
@dataclass(init=False)
class IncludeReturnSchemasToolset(PreparedToolset[AgentDepsT]):
    """A toolset that sets include_return_schema=True on all its tools.

    Wraps any AbstractToolset and injects include_return_schema=True into
    every ToolDefinition whose include_return_schema is still None.
    """

    def __init__(self, wrapped: AbstractToolset[AgentDepsT]) -> None: ...
```

Internally, it passes the wrapped toolset to PreparedToolset together with an async _include function that iterates over the tool definitions and calls dataclasses.replace(td, include_return_schema=True) for every td whose include_return_schema is None.
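The replace-if-None step can be sketched without the library. In the snippet below, `ToolDef` is a hypothetical stand-in for ToolDefinition that carries only the two fields needed, and `include_return_schemas` is an illustrative name rather than the library's private `_include` function; the point is only to show how `dataclasses.replace` leaves explicit settings alone.

```python
import asyncio
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ToolDef:
    """Hypothetical stand-in for ToolDefinition with only the fields used here."""

    name: str
    include_return_schema: Optional[bool] = None


async def include_return_schemas(tool_defs: list[ToolDef]) -> list[ToolDef]:
    """Set include_return_schema=True only where it is still None."""
    return [
        replace(td, include_return_schema=True) if td.include_return_schema is None else td
        for td in tool_defs
    ]


defs = [ToolDef("add"), ToolDef("secret", include_return_schema=False)]
prepared = asyncio.run(include_return_schemas(defs))
for td in prepared:
    print(td.name, td.include_return_schema)
```

Because `replace` returns a new object, the original definitions are not mutated, and a tool that explicitly opted out with False keeps that setting.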
```python
from pydantic import BaseModel

from pydantic_ai import Agent, IncludeReturnSchemasToolset
from pydantic_ai.toolsets import FunctionToolset


class WeatherReport(BaseModel):
    temperature_c: float
    condition: str
    humidity_percent: int


toolset = FunctionToolset()


@toolset.tool
def get_weather(city: str) -> WeatherReport:
    """Get current weather for a city."""
    return WeatherReport(temperature_c=22.5, condition="sunny", humidity_percent=45)


agent = Agent(
    'openai:gpt-4o',
    toolsets=[IncludeReturnSchemasToolset(toolset)],
)

result = agent.run_sync("What's the weather in Paris?")
```

Combining with FilteredToolset for RBAC
```python
from pydantic_ai import Agent, FilteredToolset, IncludeReturnSchemasToolset
from pydantic_ai.toolsets import FunctionToolset

admin_toolset = FunctionToolset()
# ... register admin tools ...

# FilteredToolset takes its predicate as filter_func; it receives the run
# context and each ToolDefinition and returns True to keep the tool.
user_context_toolset = FilteredToolset(
    admin_toolset,
    filter_func=lambda ctx, td: td.name in ctx.deps["allowed_tools"],
)

# Include return schemas on the filtered set
typed_toolset = IncludeReturnSchemasToolset(user_context_toolset)

agent = Agent('openai:gpt-4o', toolsets=[typed_toolset])
```

When to use
include_return_schema=True is most useful when:
- Chaining tools — downstream tools use the typed output of upstream tools
- Structured output pipelines — you want the model to reason about data structure
- OpenAI Structured Outputs — the model’s response_format is json_schema and tools should match
```python
# Without IncludeReturnSchemasToolset: tool return schema omitted from model request
# With IncludeReturnSchemasToolset: each ToolDefinition.json_schema includes a
# "return" key with the full Pydantic JSON schema for the return type

# Inspect the resulting tool definitions:
import asyncio

from pydantic_ai import IncludeReturnSchemasToolset
from pydantic_ai.models.test import TestModel
from pydantic_ai.tools import RunContext
from pydantic_ai.toolsets import FunctionToolset
from pydantic_ai.usage import RunUsage

base = FunctionToolset()


@base.tool
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b


wrapped = IncludeReturnSchemasToolset(base)


async def inspect():
    # Minimal ctx for inspection. The original elides the remaining arguments;
    # TestModel and RunUsage are assumed stand-ins for the required fields.
    ctx = RunContext(deps=None, model=TestModel(), usage=RunUsage())
    # get_tools() returns a mapping of tool name to ToolsetTool
    tools = await wrapped.get_tools(ctx)
    for name, tool in tools.items():
        print(name, "include_return_schema:", tool.tool_def.include_return_schema)
        # add include_return_schema: True


asyncio.run(inspect())
```

9. ToolChoice + ToolOrOutput — Tool Selection Control
Module: pydantic_ai.settings
Import: from pydantic_ai.settings import ToolChoice, ToolOrOutput
ToolChoice is a type alias that controls, per request, how the model selects between the available tools and output modes. ToolOrOutput is a dataclass that restricts which function tools may be called while keeping output tools and text/image output available.
Type aliases and class signature
```python
ToolChoiceScalar = Literal['none', 'required', 'auto']


@dataclass
class ToolOrOutput:
    """Restricts function tools while keeping output tools and text/image output available."""

    function_tools: list[str]  # names of the function tools the model may call


ToolChoice = ToolChoiceScalar | list[str] | ToolOrOutput | None
```

ToolChoice value semantics
| Value | Behaviour |
|---|---|
| None | Default; model decides which tool to use (equivalent to 'auto') |
| 'auto' | Model may call any tool or produce text/output |
| 'required' | Model must call at least one tool before finishing |
| 'none' | Model must not call any tools; forces text/output response |
| list[str] | Model must call exactly one of these named function tools |
| ToolOrOutput(...) | Named function tools available plus output tools and text/image |
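The table can be read as a small decision function. The sketch below is an illustration of those semantics only, not the library's resolution code: `ToolOrOutput` is redeclared locally so the snippet runs on its own, and `resolve` is a hypothetical helper that returns which function tools remain callable and whether a direct text/output answer is still permitted.

```python
from dataclasses import dataclass
from typing import Literal, Union


@dataclass
class ToolOrOutput:
    """Local stand-in mirroring the signature shown above."""

    function_tools: list[str]


ToolChoice = Union[Literal['none', 'required', 'auto'], list[str], ToolOrOutput, None]


def resolve(choice: ToolChoice, available: list[str]) -> tuple[list[str], bool]:
    """Return (callable function tools, whether a direct text/output answer is allowed)."""
    if choice is None or choice == 'auto':
        return available, True
    if choice == 'required':
        return available, False
    if choice == 'none':
        return [], True
    if isinstance(choice, ToolOrOutput):
        # Named function tools stay callable and output paths stay open
        return [t for t in available if t in choice.function_tools], True
    # list[str]: the model must call one of the named tools
    return [t for t in available if t in choice], False


tools = ['search', 'escalate']
print(resolve(['search'], tools))
print(resolve(ToolOrOutput(['search']), tools))
```

The last two calls show the practical difference between a plain list and ToolOrOutput: both narrow the function tools to the same set, but only ToolOrOutput leaves the output path available.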
Setting tool_choice via ModelSettings
```python
from pydantic_ai import Agent
from pydantic_ai.settings import ModelSettings

agent = Agent('openai:gpt-4o-mini')

# Force the model to always call a tool
result = agent.run_sync(
    "What is the weather in London?",
    model_settings=ModelSettings(tool_choice='required'),
)

# Allow only a specific tool
result = agent.run_sync(
    "Search for Python tutorials",
    model_settings=ModelSettings(tool_choice=['web_search']),
)

# Disable tools entirely (force text response)
result = agent.run_sync(
    "Tell me a joke",
    model_settings=ModelSettings(tool_choice='none'),
)
```

ToolOrOutput — mixing function tools with output
ToolOrOutput is useful when you want the model to either call specific function tools or produce structured output, without exposing every available tool:
```python
from pydantic import BaseModel

from pydantic_ai import Agent
from pydantic_ai.settings import ModelSettings, ToolOrOutput
from pydantic_ai.toolsets import FunctionToolset

toolset = FunctionToolset()


@toolset.tool
def search_knowledge_base(query: str) -> str:
    """Search internal KB."""
    return "relevant content..."


@toolset.tool
def escalate_to_human(reason: str) -> str:
    """Escalate the query to a human agent."""
    return "escalated"


class FinalAnswer(BaseModel):
    answer: str
    confidence: float


agent = Agent(
    'openai:gpt-4o',
    output_type=FinalAnswer,
    toolsets=[toolset],
)

# Allow only 'search_knowledge_base' as a function tool,
# but the model can still use the output tool to produce FinalAnswer
result = agent.run_sync(
    "What is our refund policy?",
    model_settings=ModelSettings(
        tool_choice=ToolOrOutput(function_tools=['search_knowledge_base'])
    ),
)
```

Using tool_choice in a capability hook
```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import AbstractCapability
from pydantic_ai.models import ModelRequestContext
from pydantic_ai.settings import ModelSettings


class ForceSearchCapability(AbstractCapability):
    """Forces the model to call a tool (for example a search tool) on the first step."""

    async def before_model_request(
        self,
        messages,
        info: ModelRequestContext,
    ) -> ModelSettings | None:
        if info.run_step == 0:
            # 'required' forces some tool call; it does not pick a specific tool
            return ModelSettings(tool_choice='required')
        return None


agent = Agent('openai:gpt-4o', capabilities=[ForceSearchCapability()])
```

Dynamic tool_choice based on context
```python
from pydantic_ai import Agent, RunContext
from pydantic_ai.settings import ModelSettings


async def get_model_settings(ctx: RunContext[dict]) -> ModelSettings | None:
    user_role = ctx.deps.get("role", "user")
    if user_role == "admin":
        # Admins can call any tool
        return ModelSettings(tool_choice='auto')
    else:
        # Regular users can only call read-only tools
        return ModelSettings(tool_choice=['search', 'get_info'])


agent = Agent(
    'openai:gpt-4o',
    model_settings=get_model_settings,  # callable form
)
```

10. ServiceTier + ThinkingLevel / ThinkingEffort — Cross-Provider Config Type Aliases
Module: pydantic_ai.settings
Import: from pydantic_ai.settings import ServiceTier, ThinkingLevel, ThinkingEffort
These three type aliases centralise cross-provider configuration that would otherwise require provider-specific settings. All are defined in pydantic_ai.settings and consumed via ModelSettings.
Type alias definitions
```python
ServiceTier: TypeAlias = Literal['auto', 'default', 'flex', 'priority']

ThinkingEffort: TypeAlias = Literal['minimal', 'low', 'medium', 'high', 'xhigh']

ThinkingLevel: TypeAlias = bool | ThinkingEffort
# True  → enable thinking with provider default effort
# False → disable thinking (silently ignored on always-on models)
# 'minimal' / 'low' / 'medium' / 'high' / 'xhigh' → specific effort level
```

ServiceTier — cross-provider billing tier control
ServiceTier provides a unified way to select the processing tier across providers that support tiered billing, without resorting to provider-specific settings:
| Value | OpenAI | Anthropic | Bedrock | Google Gemini API | Google Cloud |
|---|---|---|---|---|---|
| 'auto' | 'auto' | 'auto' | (omitted) | (omitted) | PT then on-demand |
| 'default' | 'default' | 'standard_only' | {'type': 'default'} | 'standard' | PT then on-demand |
| 'flex' | 'flex' | (omitted) | {'type': 'flex'} | 'flex' | PT then Flex PayGo |
| 'priority' | 'priority' | (omitted) | {'type': 'priority'} | 'priority' | PT then Priority PayGo |
```python
from pydantic_ai import Agent
from pydantic_ai.settings import ModelSettings

# Cost-optimised batch processing
agent_flex = Agent(
    'openai:gpt-4o',
    model_settings=ModelSettings(service_tier='flex'),
)

# Low-latency customer-facing requests
agent_priority = Agent(
    'openai:gpt-4o',
    model_settings=ModelSettings(service_tier='priority'),
)


# Adapt tier based on request priority
async def get_settings(ctx) -> ModelSettings | None:
    if ctx.deps.get("is_urgent"):
        return ModelSettings(service_tier='priority')
    return ModelSettings(service_tier='flex')


agent_dynamic = Agent('google-gla:gemini-2.0-flash', model_settings=get_settings)
```

Per-provider overrides (openai_service_tier, anthropic_service_tier, etc.) always take precedence over the unified service_tier when both are set.
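The precedence rule can be illustrated with a plain dictionary standing in for ModelSettings. This is a sketch under assumptions, not the library's code: `ANTHROPIC_TIERS` simply transcribes the Anthropic column of the table above (with None meaning the field is omitted), and `resolve_anthropic_tier` is a hypothetical helper name.

```python
from typing import Optional

# Anthropic column of the mapping table; None means the field is omitted.
ANTHROPIC_TIERS = {
    'auto': 'auto',
    'default': 'standard_only',
    'flex': None,
    'priority': None,
}


def resolve_anthropic_tier(settings: dict) -> Optional[str]:
    """Prefer the provider-specific key; otherwise translate the unified one."""
    if 'anthropic_service_tier' in settings:
        return settings['anthropic_service_tier']
    unified = settings.get('service_tier')
    if unified is None:
        return None
    return ANTHROPIC_TIERS.get(unified)


print(resolve_anthropic_tier({'service_tier': 'default'}))
print(resolve_anthropic_tier({'service_tier': 'flex', 'anthropic_service_tier': 'auto'}))
```

In the second call the unified 'flex' would have been omitted for Anthropic, but the provider-specific override wins and 'auto' is sent instead.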
ThinkingLevel — cross-provider extended thinking control
ThinkingLevel wraps both boolean on/off control and granular effort levels into a single field:
```python
from pydantic_ai import Agent
from pydantic_ai.settings import ModelSettings

# Enable with provider default effort
agent = Agent(
    'anthropic:claude-opus-4-5',
    model_settings=ModelSettings(thinking=True),
)

# Disable thinking (no-op on always-on models like o1/o3)
agent = Agent(
    'openai:o1',
    model_settings=ModelSettings(thinking=False),  # silently ignored
)

# Specific effort level
agent = Agent(
    'anthropic:claude-opus-4-5',
    model_settings=ModelSettings(thinking='high'),
)

result = agent.run_sync("Prove Fermat's Last Theorem step by step")
# AgentRunResult exposes the final value as .output
print(result.output)
```

Provider-level effort mapping
When an exact ThinkingEffort level isn’t supported by a provider, PydanticAI maps to the nearest available level:
| Effort | Anthropic | OpenAI (o-series) | Google (Gemini) |
|---|---|---|---|
| 'minimal' | low budget tokens | not supported → 'low' | dynamic |
| 'low' | low budget tokens | (omitted, default) | dynamic |
| 'medium' | medium budget tokens | medium reasoning_effort | dynamic |
| 'high' | high budget tokens | high reasoning_effort | high |
| 'xhigh' | max budget tokens | → 'high' on providers without xhigh | high |
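The "nearest available level" idea can be sketched as a distance lookup over the ordered effort scale. This is an illustrative model of the fallback rather than the library's actual mapping code; `EFFORT_ORDER` and `nearest_effort` are hypothetical names, and the supported list passed in is an assumption about what a given provider accepts.

```python
EFFORT_ORDER = ['minimal', 'low', 'medium', 'high', 'xhigh']


def nearest_effort(requested: str, supported: list[str]) -> str:
    """Pick the supported level closest to the requested one on the effort scale.

    On a tie, the level that appears first in `supported` is returned.
    """
    target = EFFORT_ORDER.index(requested)
    return min(supported, key=lambda level: abs(EFFORT_ORDER.index(level) - target))


# A provider that only understands three levels (assumed for illustration)
three_levels = ['low', 'medium', 'high']
print(nearest_effort('minimal', three_levels))
print(nearest_effort('xhigh', three_levels))
```

With a three-level provider, 'minimal' falls back to 'low' and 'xhigh' falls back to 'high', which matches the OpenAI column of the table.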
```python
from pydantic_ai import Agent
from pydantic_ai.settings import ModelSettings


# Dynamic effort selection based on task complexity
async def adaptive_thinking(ctx) -> ModelSettings | None:
    complexity = ctx.deps.get("complexity_score", 0.5)
    if complexity > 0.8:
        effort = 'xhigh'
    elif complexity > 0.5:
        effort = 'high'
    elif complexity > 0.2:
        effort = 'medium'
    else:
        effort = 'low'
    return ModelSettings(thinking=effort)


agent = Agent(
    'anthropic:claude-opus-4-5',
    model_settings=adaptive_thinking,
)
```

Combining ServiceTier + ThinkingLevel for cost management
```python
from pydantic_ai import Agent
from pydantic_ai.settings import ModelSettings

# High-accuracy, higher-cost pipeline
accuracy_agent = Agent(
    'openai:gpt-4o',
    model_settings=ModelSettings(
        service_tier='priority',
        thinking='xhigh',
    ),
)

# Cost-optimised background processing
batch_agent = Agent(
    'openai:gpt-4o-mini',
    model_settings=ModelSettings(
        service_tier='flex',
        thinking=False,
    ),
)
```

Summary
| # | Class(es) | Module | Key takeaways |
|---|---|---|---|
| 1 | ModelRequest + ModelResponse | pydantic_ai.messages | Wire-format anatomy; FinishReason/ModelResponseState type aliases; run_id/conversation_id threading; metadata never sent to LLM; ModelMessagesTypeAdapter for ser/de |
| 2 | SystemPromptPart + UserPromptPart + RetryPromptPart | pydantic_ai.messages | Three request-side part types; multi-modal UserContent in UserPromptPart; dynamic_ref for OTel attribution; RetryPromptPart.model_response() formats validation errors |
| 3 | BaseToolCallPart + ToolCallPart + NativeToolCallPart | pydantic_ai.messages | Call-part family; args_as_dict(raise_if_invalid=) graceful/strict JSON parsing; narrow_type() for typed subclass promotion; ToolCallPartDelta for streaming accumulation |
| 4 | BaseToolReturnPart + ToolReturnPart + NativeToolReturnPart | pydantic_ai.messages | Return-part family; outcome tracks success/failed/denied; content_items(mode=) for serialisation; files property for multi-modal extraction; provider_details round-trip |
| 5 | GraphAgentState | pydantic_ai._agent_graph | Run-state internals; run_id/conversation_id; check_incomplete_tool_call() for token-limit detection; consume_output_retry() budget enforcement; pending_messages queue |
| 6 | MCPServerTool | pydantic_ai.native_tools | Native MCP server (OpenAI/Anthropic/xAI); allowed_tools restriction; headers for enterprise auth; OpenAI connector ID via x-openai-connector: prefix; vs MCPToolset comparison |
| 7 | FileSearchTool | pydantic_ai.native_tools | Provider-managed RAG (OpenAI vector stores / Gemini Files API / xAI collections); file_store_ids parameter; zero-code chunking + embedding; vs custom RAG comparison |
| 8 | IncludeReturnSchemasToolset | pydantic_ai.toolsets | PreparedToolset subclass; auto-injects include_return_schema=True; composable with FilteredToolset; useful for structured tool-output pipelines |
| 9 | ToolChoice + ToolOrOutput | pydantic_ai.settings | ToolChoiceScalar ('auto'/'required'/'none') + list[str] + ToolOrOutput + None; ToolOrOutput.function_tools restricts function tools while allowing output tools; set via ModelSettings.tool_choice |
| 10 | ServiceTier + ThinkingLevel / ThinkingEffort | pydantic_ai.settings | Cross-provider type aliases; ServiceTier maps to provider-specific billing tiers; ThinkingLevel unifies bool + 5 effort levels; per-provider settings override unified field; combine for cost management |
All examples verified against pydantic-ai 1.105.0 source.