Pydantic AI: Comprehensive Technical Guide
Latest: 1.90.0 | Updated: May 5, 2026
From Beginner to Expert Level
Version: 1.90.0 (May 2026)
Framework: Pydantic AI - GenAI Agent Framework, the Pydantic Way
Author Notes: Exhaustive technical documentation with production patterns, type safety emphasis, and a FastAPI-inspired developer experience.
Table of Contents
- Philosophy & Core Concepts
- Installation & Setup
- Core Fundamentals
- Simple Agents
- Type Safety & Validation
- Structured Output
- Tools & Function Calling
- Dependency Injection
- Advanced Patterns
- Production Deployment
Philosophy & Core Concepts
"FastAPI Feeling" for GenAI
Pydantic AI brings the ergonomic design of FastAPI to Generative AI development. This means:
- Type Safety First: Leveraging Python’s type system and Pydantic v2 for automatic validation
- Developer Experience: Familiar decorators, dependency injection, and structured patterns
- Pythonic Conventions: Modern Python 3.10+ features like type hints and async/await
- Reusability: Agents are instantiated once and reused throughout the application
- Testability: Built-in testing utilities and model mocking capabilities
Core Philosophy Pillars
```python
"""Pydantic AI Philosophy:

1. Type Safety by Default - All inputs/outputs validated with Pydantic
2. Model Agnosticism - Single interface for OpenAI, Anthropic, Gemini, Groq, etc.
3. Structured Outputs - Guarantee response validation and schema compliance
4. Observable Systems - Built-in Logfire integration for production observability
5. Composable Tools - Function calling as first-class citizens
6. Async-First Design - Native async/await throughout
7. Test-Friendly - TestModel for unit testing without API calls
"""
```

Why Pydantic AI?
| Challenge | Solution |
|---|---|
| Unpredictable LLM outputs | Type-safe structured outputs with Pydantic validation |
| Model lock-in | Unified interface for all major LLM providers |
| Complex tool orchestration | Decorator-based tool definition with automatic schema generation |
| State management | Dependency injection system with RunContext |
| Production observability | Logfire integration for traces and monitoring |
| Testing complexity | TestModel and FunctionModel for easy unit testing |
| Tool dependencies | Context-aware tool parameters with automatic injection |
Installation & Setup
Option 1: Complete Installation with All Extras

```bash
# Using pip
pip install "pydantic-ai[all]"

# Using uv (faster)
uv add "pydantic-ai[all]"
```

Option 2: Minimal Installation (pydantic-ai-slim)

The slim version is significantly smaller and installs only the dependencies you need:

```bash
# Core slim with OpenAI support
pip install "pydantic-ai-slim[openai]"
uv add "pydantic-ai-slim[openai]"
```

Option 3: Selective Installation by Provider
```bash
# OpenAI only
pip install "pydantic-ai-slim[openai]"

# Anthropic Claude
pip install "pydantic-ai-slim[anthropic]"

# Google Gemini
pip install "pydantic-ai-slim[google]"

# Groq (fast inference)
pip install "pydantic-ai-slim[groq]"

# Multiple providers
pip install "pydantic-ai-slim[openai,anthropic,google,groq]"

# With observability
pip install "pydantic-ai-slim[openai,logfire]"

# For MCP integration
pip install "pydantic-ai-slim[mcp]"

# For durable execution
pip install "pydantic-ai[prefect]"  # Prefect integration
pip install "pydantic-ai[dbos]"    # DBOS integration
```

Environment Setup

```bash
# .env file
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...
GROQ_API_KEY=...

# Optional: Observability
LOGFIRE_TOKEN=...
```

```python
# main.py - Load environment variables
import os
from dotenv import load_dotenv

load_dotenv()

# Verify setup
assert os.getenv('OPENAI_API_KEY'), "OPENAI_API_KEY not set"
```

Verification: Hello World
Section titled “Verification: Hello World”from pydantic_ai import Agent
# Create minimal agentagent = Agent('openai:gpt-4o')
# Test synchronouslyresult = agent.run_sync('What is 2 + 2?')print(result.output)#> 2 + 2 equals 4.
# Check token usageprint(result.usage())#> RunUsage(input_tokens=14, output_tokens=5, requests=1)Core Fundamentals
Core Classes Overview

1. Agent

The primary class for creating AI agents. Instances are typically created once and reused.
```python
from pydantic_ai import Agent
from typing import Optional

# Minimal agent
agent = Agent('openai:gpt-4o')

# Agent with instructions
agent_with_instructions = Agent(
    'openai:gpt-4o',
    instructions='Be concise and professional. Reply with 1-2 sentences.'
)

# Agent with dependencies
from dataclasses import dataclass

@dataclass
class UserContext:
    user_id: int
    username: str

agent_with_deps = Agent(
    'openai:gpt-4o',
    deps_type=UserContext,
    instructions='Personalise all responses using the user context.'
)

# Complete agent configuration
agent_complete = Agent(
    model='openai:gpt-4o',
    system_prompt='You are a helpful assistant specializing in Python.',
    instructions='Provide clear, working code examples.',
    deps_type=UserContext,
    output_type=Optional[str],
    retries=2,  # Retry failed calls up to 2 times
    name='PythonHelper'
)
```

2. RunContext

Provides access to dependencies, model information, and message history during execution.
```python
from pydantic_ai import Agent, RunContext
from dataclasses import dataclass

@dataclass
class AppDependencies:
    database_url: str
    api_key: str

agent = Agent(
    'openai:gpt-4o',
    deps_type=AppDependencies,
)

@agent.tool
async def fetch_user_data(ctx: RunContext[AppDependencies], user_id: int) -> str:
    """
    Tool with access to context.

    Args:
        ctx: RunContext containing dependencies and metadata
        user_id: The user identifier
    """
    # Access dependencies
    db_url = ctx.deps.database_url

    # Access model information
    model_name = ctx.model.model_name

    # Access the message history so far
    messages = ctx.messages

    return f"User {user_id} data from {db_url}"
```

3. ModelRetry

Instructs the model to retry with corrected outputs. Used in validation workflows.
```python
from pydantic_ai import Agent, ModelRetry, RunContext
from pydantic import BaseModel, Field
import re

class EmailAddress(BaseModel):
    email: str = Field(..., description="Valid email address")
    name: str = Field(..., description="User name")

agent = Agent(
    'openai:gpt-4o',
    output_type=EmailAddress
)

@agent.output_validator
async def validate_email(ctx: RunContext, output: EmailAddress) -> EmailAddress:
    """Validate email format and retry if invalid."""
    # Simple email regex validation
    email_pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'

    if not re.match(email_pattern, output.email):
        raise ModelRetry(
            f'Invalid email format: {output.email}. Please provide a valid email address.'
        )

    if len(output.name) < 2:
        raise ModelRetry(
            f'Name too short: {output.name}. Please provide a full name.'
        )

    return output

# Usage
result = agent.run_sync('Extract email from: Contact John Doe at john@example.com')
print(result.output)
#> EmailAddress(email='john@example.com', name='John Doe')
```

4. Tool Definition

Functions decorated with @agent.tool become callable by the LLM.
```python
from pydantic_ai import Agent, RunContext
import asyncio

agent = Agent('openai:gpt-4o')

# Tool with context
@agent.tool
async def search_database(
    ctx: RunContext,
    query: str,
    limit: int = 10
) -> str:
    """
    Search the database for documents.

    Args:
        ctx: Execution context
        query: Search query
        limit: Maximum results (default: 10)

    Returns:
        Search results as formatted string
    """
    # Simulate database search
    await asyncio.sleep(0.1)
    return f"Found {limit} results for '{query}'"

# Tool without context (plain tool)
@agent.tool_plain
def get_current_time() -> str:
    """Get current server time in ISO format."""
    from datetime import datetime
    return datetime.now().isoformat()

# Tool with strict schema (for OpenAI compatibility)
@agent.tool(strict=True)
async def calculate(ctx: RunContext, a: int, b: int, operation: str) -> int:
    """
    Perform mathematical operations.

    Args:
        ctx: Execution context
        a: First number
        b: Second number
        operation: 'add', 'subtract', 'multiply', 'divide'
    """
    operations = {
        'add': lambda x, y: x + y,
        'subtract': lambda x, y: x - y,
        'multiply': lambda x, y: x * y,
        'divide': lambda x, y: x // y,  # Integer division to match the int return type
    }
    return operations[operation](a, b)

# Usage
result = agent.run_sync('What time is it?')
print(result.output)
#> The current time is 2025-03-18T14:30:45.123456
```

Model-Agnostic Design

Pydantic AI supports numerous LLM providers with a unified interface:
```python
from pydantic_ai import Agent

# OpenAI
openai_agent = Agent('openai:gpt-4o')
openai_o3 = Agent('openai:o3-mini')

# Anthropic Claude
claude_agent = Agent('anthropic:claude-3-5-sonnet-latest')
claude_opus = Agent('anthropic:claude-3-opus-20250219')

# Google Gemini
gemini_agent = Agent('google-gla:gemini-1.5-flash')
gemini_pro = Agent('google-gla:gemini-1.5-pro')

# Groq (fast inference)
groq_agent = Agent('groq:llama-3.3-70b-versatile')

# DeepSeek
deepseek_agent = Agent('deepseek:deepseek-chat')

# Mistral
mistral_agent = Agent('mistral:mistral-large-latest')

# Grok
grok_agent = Agent('grok:grok-2-latest')

# Amazon Bedrock
bedrock_agent = Agent('bedrock:anthropic.claude-3-sonnet-20240229-v1:0')

# Perplexity (OpenAI-compatible)
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

perplexity = OpenAIChatModel(
    'sonar-pro',
    provider=OpenAIProvider(
        base_url='https://api.perplexity.ai',
        api_key='your-api-key'
    )
)
perplexity_agent = Agent(perplexity)

# Fallback strategy - try primary, then backup
from pydantic_ai.models.fallback import FallbackModel
from pydantic_ai.models.anthropic import AnthropicModel

fallback_model = FallbackModel(
    OpenAIChatModel('gpt-4o'),                  # Primary
    AnthropicModel('claude-3-5-sonnet-latest')  # Fallback
)
fallback_agent = Agent(fallback_model)
```

Configuration Patterns
```python
from pydantic_ai import Agent, ModelSettings
from pydantic import BaseModel

# Configuration via constructor
configured_agent = Agent(
    'openai:gpt-4o',
    instructions='Be concise.',
    retries=3,  # Retry failed calls
    name='MyAgent'
)

# Configuration via settings
settings = ModelSettings(
    temperature=0.7,
    max_tokens=500,
    top_p=0.95,
    frequency_penalty=0.0,
    presence_penalty=0.0,
)

# Using settings in a run
result = configured_agent.run_sync(
    'What is Python?',
    model_settings=settings
)

# Custom output type configuration
class Article(BaseModel):
    title: str
    content: str
    keywords: list[str]

article_agent = Agent(
    'openai:gpt-4o',
    output_type=Article,
    instructions='Write comprehensive technical articles.'
)

# Usage limits
from pydantic_ai import UsageLimits

result = article_agent.run_sync(
    'Write about type safety in Python',
    usage_limits=UsageLimits(
        request_limit=2,          # Max 2 API calls
        total_tokens_limit=2000   # Max 2000 tokens total
    )
)
```

Simple Agents
Creating Your First Agent

```python
from pydantic_ai import Agent
import asyncio

# 1. Create agent with model
agent = Agent(
    'openai:gpt-4o',
    instructions='Respond with exactly one sentence.'
)

# 2. Run synchronously (for simple scripts)
result = agent.run_sync('What is the capital of France?')
print(f"Answer: {result.output}")
#> Answer: The capital of France is Paris.

# 3. Access usage information
usage = result.usage()
print(f"Tokens: {usage.input_tokens} input, {usage.output_tokens} output")
#> Tokens: 18 input, 8 output

# 4. Run asynchronously (for production)
async def async_example():
    result = await agent.run('Explain type safety in Python.')
    return result.output

# Execute
output = asyncio.run(async_example())
print(output)
#> Type safety refers to the language's ability to prevent type errors...
```

Function Definitions with Full Typing
```python
from pydantic_ai import Agent, RunContext
from typing import Any
from enum import Enum
import asyncio

agent = Agent('openai:gpt-4o')

# Tool with complete type annotations
@agent.tool
async def get_weather(
    ctx: RunContext,
    location: str,
    unit: str = 'celsius'
) -> dict[str, Any]:
    """
    Get weather information for a location.

    This tool demonstrates:
    - Type-annotated parameters
    - Optional parameters with defaults
    - Complex return types
    - Docstring format for schema generation

    Args:
        ctx: Execution context
        location: City name or coordinates
        unit: Temperature unit ('celsius' or 'fahrenheit')

    Returns:
        Dictionary with temperature, condition, and forecast
    """
    # Simulate API call
    await asyncio.sleep(0.2)

    return {
        'location': location,
        'temperature': 22,
        'unit': unit,
        'condition': 'Partly cloudy',
        'humidity': 65,
        'wind_speed': 12
    }

@agent.tool
async def search_documents(
    ctx: RunContext,
    query: str,
    semantic: bool = True,
    max_results: int = 5
) -> list[dict[str, Any]]:
    """
    Search through document database.

    Args:
        ctx: Execution context
        query: Search query string
        semantic: Whether to use semantic search
        max_results: Maximum results to return

    Returns:
        List of matching documents with id, title, and relevance
    """
    return [
        {'id': '1', 'title': 'Python Types', 'relevance': 0.95},
        {'id': '2', 'title': 'Type Hints', 'relevance': 0.87},
    ]

# Tool with enums for type safety
class TemperatureUnit(str, Enum):
    CELSIUS = 'celsius'
    FAHRENHEIT = 'fahrenheit'
    KELVIN = 'kelvin'

@agent.tool
async def convert_temperature(
    ctx: RunContext,
    value: float,
    from_unit: TemperatureUnit,
    to_unit: TemperatureUnit
) -> float:
    """
    Convert temperature between units.

    Args:
        ctx: Execution context
        value: Temperature value
        from_unit: Source unit
        to_unit: Target unit

    Returns:
        Converted temperature value
    """
    conversions = {
        (TemperatureUnit.CELSIUS, TemperatureUnit.FAHRENHEIT): lambda v: v * 9/5 + 32,
        (TemperatureUnit.FAHRENHEIT, TemperatureUnit.CELSIUS): lambda v: (v - 32) * 5/9,
        (TemperatureUnit.CELSIUS, TemperatureUnit.KELVIN): lambda v: v + 273.15,
    }
    # Unlisted pairs (including identical units) fall back to the unchanged value
    return conversions.get((from_unit, to_unit), lambda v: v)(value)

# Usage
result = agent.run_sync('What is the weather in London?')
print(result.output)
```

System Prompts and Configuration
```python
from pydantic_ai import Agent, RunContext
from datetime import datetime

# Static system prompt
agent_static = Agent(
    'openai:gpt-4o',
    system_prompt=(
        'You are a professional technical writer. '
        'Write clear, concise, and well-structured documentation. '
        'Always include code examples when relevant.'
    )
)

# Dynamic system prompt (evaluates on each run)
agent_dynamic = Agent('openai:gpt-4o')

@agent_dynamic.system_prompt
async def dynamic_prompt(ctx: RunContext) -> str:
    """System prompt that includes current context."""
    current_time = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    return f"""
    Current server time: {current_time}
    You are a helpful assistant.
    Always refer to the current time when relevant.
    Timezone: UTC
    """

# Combined static + dynamic prompts
agent_combined = Agent(
    'openai:gpt-4o',
    system_prompt='You are a Python expert.'
)

@agent_combined.system_prompt
async def add_context(ctx: RunContext) -> str:
    """Additional dynamic context."""
    return 'Today is a great day to learn type safety!'

# Instructions vs system prompts:
# instructions describe the high-level goal for the run;
# system_prompt configures persistent system-level behaviour.

agent_instructions = Agent(
    'openai:gpt-4o',
    instructions='Provide step-by-step solutions to programming problems.'
)

@agent_instructions.system_prompt
async def system_behavior(ctx: RunContext) -> str:
    """Define system-level behaviour."""
    return 'Always validate input before processing. Be professional and polite.'
```

Single-Turn Conversations
```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

# Simple single-turn
result = agent.run_sync('Explain closures in Python.')
print(result.output)

# Single-turn with specific instructions
result_with_instructions = agent.run_sync(
    'Explain closures in Python.',
    instructions_prepend='Explain at a beginner level with simple examples.'
)
print(result_with_instructions.output)

# Accessing the full conversation
result_full = agent.run_sync('What is type coercion?')
messages = result_full.all_messages()
print(f"Total messages: {len(messages)}")
```
```python
# Check usage
usage = result_full.usage()
print(f"Used {usage.input_tokens} input tokens, {usage.output_tokens} output tokens")
```

Streaming Responses
```python
from pydantic_ai import Agent
import asyncio

agent = Agent('openai:gpt-4o')

async def stream_text_example():
    """Stream text responses in real-time."""
    async with agent.run_stream('Write a haiku about Python') as response:
        print("Streaming response:")
        async for text in response.stream_text():
            print(text, end='', flush=True)
        print()  # Newline after streaming

async def stream_structured_example():
    """Stream structured output."""
    from pydantic import BaseModel

    class Article(BaseModel):
        title: str
        content: str

    structured_agent = Agent(
        'openai:gpt-4o',
        output_type=Article
    )

    async with structured_agent.run_stream('Write an article about type safety') as response:
        async for text in response.stream_text():
            print(text, end='', flush=True)

        # Get the final structured output
        output = await response.get_output()
        print(f"\nTitle: {output.title}")
        print(f"Content length: {len(output.content)}")

# Run examples
asyncio.run(stream_text_example())
asyncio.run(stream_structured_example())
```

Error Handling with ModelRetry
```python
from pydantic_ai import Agent, ModelRetry, RunContext
from pydantic_ai.exceptions import UnexpectedModelBehavior
from pydantic import BaseModel, Field

class CodeReview(BaseModel):
    issues: list[str] = Field(..., min_length=1, description="List of code issues")
    severity: str = Field(..., pattern='^(low|medium|high)$')
    suggestions: list[str] = Field(...)

agent = Agent(
    'openai:gpt-4o',
    output_type=CodeReview
)

@agent.output_validator
async def validate_code_review(ctx: RunContext, output: CodeReview) -> CodeReview:
    """Validate code review meets requirements."""
    if not output.issues:
        raise ModelRetry('Please identify at least one code issue.')

    if len(output.issues) > 10:
        raise ModelRetry('Limit issues to maximum 10 for clarity.')

    if output.severity not in ('low', 'medium', 'high'):
        raise ModelRetry(
            f'Severity must be "low", "medium", or "high", not "{output.severity}".'
        )

    if len(output.suggestions) != len(output.issues):
        raise ModelRetry(
            f'Must provide one suggestion per issue. '
            f'Found {len(output.issues)} issues but {len(output.suggestions)} suggestions.'
        )

    return output

# Usage with error handling: once the retry budget is exhausted,
# pydantic-ai raises UnexpectedModelBehavior
try:
    result = agent.run_sync('Review this Python code: x = 1; y = 2; z = x+y')
    print(f"Severity: {result.output.severity}")
    print(f"Issues found: {len(result.output.issues)}")
except UnexpectedModelBehavior as e:
    print(f"Validation failed: {e}")
```

Type Safety & Validation
Pydantic v2 Integration

```python
from pydantic import BaseModel, Field, field_validator, ConfigDict
from pydantic_ai import Agent
from typing import Optional
from datetime import datetime

# Basic Pydantic model for structured outputs
class UserProfile(BaseModel):
    """Type-safe user profile model."""
    model_config = ConfigDict(
        json_schema_extra={'example': {'id': 1, 'name': 'John', 'email': 'john@example.com'}}
    )

    id: int = Field(..., description="Unique user identifier", gt=0)
    name: str = Field(..., min_length=1, max_length=100)
    email: str = Field(..., description="Valid email address")
    age: Optional[int] = Field(None, ge=0, le=150)
    premium: bool = Field(default=False)

# Custom validators
class ValidatedArticle(BaseModel):
    """Article with validation."""
    title: str = Field(..., min_length=5, max_length=200)
    content: str = Field(..., min_length=100)
    tags: list[str] = Field(default_factory=list, max_length=10)
    published_date: Optional[datetime] = None

    @field_validator('tags')
    @classmethod
    def validate_tags(cls, v: list[str]) -> list[str]:
        """Ensure tags are lowercase and unique."""
        return sorted(set(tag.lower() for tag in v))

    @field_validator('published_date')
    @classmethod
    def validate_published_date(cls, v: Optional[datetime]) -> Optional[datetime]:
        """Ensure published date is not in the future."""
        if v and v > datetime.now():
            raise ValueError('Published date cannot be in the future')
        return v

# Using with Agent
article_agent = Agent(
    'openai:gpt-4o',
    output_type=ValidatedArticle
)

result = article_agent.run_sync('Write an article about Python type hints')
print(f"Title: {result.output.title}")
print(f"Tags: {result.output.tags}")
```
```python
# Generic types with Pydantic
from typing import Generic, TypeVar

T = TypeVar('T')

class PaginatedResponse(BaseModel, Generic[T]):
    """Generic pagination response."""
    items: list[T]
    total: int
    page: int
    per_page: int

    @property
    def total_pages(self) -> int:
        """Calculate total pages."""
        return (self.total + self.per_page - 1) // self.per_page
```
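To sanity-check the `total_pages` arithmetic, the generic can be parametrised directly with plain Pydantic, no agent involved (the model is restated here so the snippet runs on its own):

```python
from typing import Generic, TypeVar
from pydantic import BaseModel

T = TypeVar('T')

class PaginatedResponse(BaseModel, Generic[T]):
    items: list[T]
    total: int
    page: int
    per_page: int

    @property
    def total_pages(self) -> int:
        # Ceiling division: 95 items at 10 per page -> 10 pages
        return (self.total + self.per_page - 1) // self.per_page

page = PaginatedResponse[int](items=[1, 2, 3], total=95, page=1, per_page=10)
print(page.total_pages)
#> 10
```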
```python
# Union types for flexibility
from typing import Union, Literal, Annotated
from datetime import datetime
from pydantic import BaseModel, Field

class ApiResponse(BaseModel):
    """Response that can be success or error."""
    status: str = Field(..., pattern='^(success|error)$')
    data: Union[dict, list, str]
    timestamp: datetime = Field(default_factory=datetime.now)

# Discriminated unions
class SuccessResponse(BaseModel):
    type: Literal['success']
    data: dict
    code: int = 200

class ErrorResponse(BaseModel):
    type: Literal['error']
    error: str
    code: int = 400

Response = Annotated[Union[SuccessResponse, ErrorResponse], Field(discriminator='type')]
```

Type Safety with Dependencies
```python
from pydantic_ai import Agent, RunContext
from dataclasses import dataclass
import httpx

@dataclass
class ServiceDependencies:
    """Typed dependencies for the agent."""
    http_client: httpx.AsyncClient
    database_url: str
    api_key: str
    user_id: int

agent = Agent(
    'openai:gpt-4o',
    deps_type=ServiceDependencies
)

@agent.tool
async def fetch_user_data(
    ctx: RunContext[ServiceDependencies],
    include_preferences: bool = False
) -> dict:
    """
    Fetch user data with full type safety.

    Args:
        ctx: Fully typed RunContext
        include_preferences: Whether to include preference data

    Returns:
        Dictionary with user data (strongly typed through schema)
    """
    # The type checker knows the exact structure of ctx.deps
    user_id = ctx.deps.user_id
    db_url = ctx.deps.database_url
    api_key = ctx.deps.api_key
    client = ctx.deps.http_client

    # Make API call with the typed client
    response = await client.get(
        f'{db_url}/users/{user_id}',
        headers={'X-API-Key': api_key}
    )

    data = response.json()

    if include_preferences:
        pref_response = await client.get(
            f'{db_url}/users/{user_id}/preferences',
            headers={'X-API-Key': api_key}
        )
        data['preferences'] = pref_response.json()

    return data

@agent.system_prompt
async def typed_prompt(ctx: RunContext[ServiceDependencies]) -> str:
    """System prompt with access to typed dependencies."""
    user_id = ctx.deps.user_id
    return f"Respond to user {user_id} with personalised assistance."

# Usage with type safety
async def main():
    async with httpx.AsyncClient() as client:
        deps = ServiceDependencies(
            http_client=client,
            database_url='https://api.example.com',
            api_key='secret-key',
            user_id=123
        )

        result = await agent.run(
            'Tell me about my profile',
            deps=deps
        )
        print(result.output)
```

Structured Output
Basic Pydantic Model Output

```python
from pydantic import BaseModel, Field
from pydantic_ai import Agent

class ExtractedInfo(BaseModel):
    """Information extracted from text."""
    entities: list[str] = Field(..., description="Named entities found")
    sentiment: str = Field(..., pattern='^(positive|negative|neutral)$')
    summary: str = Field(..., description="Brief summary")

agent = Agent(
    'openai:gpt-4o',
    output_type=ExtractedInfo
)

result = agent.run_sync(
    'Extract entities, sentiment, and summary from: '
    '"I love Python programming! It makes code so clean and readable."'
)

print(f"Entities: {result.output.entities}")
print(f"Sentiment: {result.output.sentiment}")
print(f"Summary: {result.output.summary}")
```

Nested Schema Validation
```python
from pydantic import BaseModel, Field
from pydantic_ai import Agent
from typing import Optional, List

class Address(BaseModel):
    """Nested address model."""
    street: str
    city: str
    country: str
    postal_code: str

class Contact(BaseModel):
    """Contact information."""
    email: str = Field(..., pattern=r'^[\w\.-]+@[\w\.-]+\.\w+$')
    phone: Optional[str] = None

class Company(BaseModel):
    """Deeply nested company information."""
    name: str
    founded: int = Field(..., ge=1800, le=2025)
    employees: int = Field(..., gt=0)
    address: Address
    contacts: List[Contact]
    website: Optional[str] = None

agent = Agent(
    'openai:gpt-4o',
    output_type=Company
)

result = agent.run_sync(
    'Extract company information for Pydantic: '
    'Founded in 2015, ~50 employees, based in San Francisco, California, USA'
)

company = result.output
print(f"Company: {company.name}")
print(f"Address: {company.address.city}, {company.address.country}")
print(f"First contact: {company.contacts[0].email if company.contacts else 'None'}")
```

Union Types and Discriminated Unions
```python
from typing import Union, Literal, Annotated
from pydantic import BaseModel, Field
from pydantic_ai import Agent

# Simple union members
class TextOutput(BaseModel):
    type: Literal['text']
    content: str

class JsonOutput(BaseModel):
    type: Literal['json']
    data: dict

# Discriminated union for type-safe handling
OutputType = Annotated[Union[TextOutput, JsonOutput], Field(discriminator='type')]

# Using the discriminated union
class ProcessingResult(BaseModel):
    status: str
    output: OutputType

agent = Agent(
    'openai:gpt-4o',
    output_type=ProcessingResult
)

result = agent.run_sync('Output JSON data about Python')

# The type checker knows the exact type after isinstance narrowing
if isinstance(result.output.output, JsonOutput):
    data = result.output.output.data
    print(f"JSON keys: {list(data.keys())}")
elif isinstance(result.output.output, TextOutput):
    print(f"Text: {result.output.output.content}")
```

Optional Fields and Defaults
```python
from pydantic import BaseModel, Field
from pydantic_ai import Agent
from typing import Optional

class FlexibleOutput(BaseModel):
    """Output with optional fields and defaults."""
    title: str
    description: str
    tags: list[str] = Field(default_factory=list)
    priority: int = Field(default=1, ge=1, le=5)
    assigned_to: Optional[str] = None
    due_date: Optional[str] = None
    completed: bool = False

agent = Agent(
    'openai:gpt-4o',
    output_type=FlexibleOutput
)

result = agent.run_sync('Create a task: "Review code" (high priority)')

output = result.output
print(f"Title: {output.title}")
print(f"Priority: {output.priority}")
print(f"Tags: {output.tags if output.tags else 'None'}")
print(f"Assigned to: {output.assigned_to or 'Unassigned'}")
```

Tools & Function Calling
Tool Definition with @agent.tool

```python
from pydantic_ai import Agent, RunContext
from typing import Any
import asyncio

agent = Agent('openai:gpt-4o')

# Basic tool
@agent.tool
async def get_timestamp(ctx: RunContext) -> str:
    """Get the current server timestamp."""
    from datetime import datetime
    return datetime.now().isoformat()

# Tool with parameters
@agent.tool
async def calculate_factorial(ctx: RunContext, n: int) -> int:
    """Calculate factorial of n."""
    if n < 0:
        raise ValueError("Factorial not defined for negative numbers")
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

# Tool with complex parameters and return type
@agent.tool
async def search_and_rank(
    ctx: RunContext,
    query: str,
    filters: dict[str, Any],
    sort_by: str = 'relevance',
    limit: int = 10
) -> dict[str, Any]:
    """
    Search documents and rank results.

    Args:
        ctx: Execution context
        query: Search query string
        filters: Dictionary of filter conditions
        sort_by: Field to sort by (relevance, date, popularity)
        limit: Maximum results to return

    Returns:
        Dictionary with results list and total count
    """
    # Simulate search
    await asyncio.sleep(0.1)

    return {
        'results': [
            {'id': i, 'score': 1 - i * 0.1, 'title': f'Result {i}'}
            for i in range(min(limit, 5))
        ],
        'total': 1000,
        'query': query,
        'filters_applied': filters
    }

# Plain tool (no context needed)
@agent.tool_plain
def get_random_number(min_value: int = 0, max_value: int = 100) -> int:
    """Generate a random integer between min and max."""
    import random
    return random.randint(min_value, max_value)

# Tool with strict schema (for OpenAI compatibility)
@agent.tool(strict=True)
async def validate_email(ctx: RunContext, email: str) -> dict[str, Any]:
    """
    Validate email format.

    Args:
        ctx: Execution context
        email: Email address to validate

    Returns:
        Dictionary with validation result
    """
    import re
    pattern = r'^[\w\.-]+@[\w\.-]+\.\w+$'
    is_valid = bool(re.match(pattern, email))
    return {'valid': is_valid, 'email': email}
```

Type-Safe Tool Parameters
```python
from pydantic_ai import Agent, RunContext
from pydantic import Field
from enum import Enum
from typing import Literal

class SortOrder(str, Enum):
    """Valid sort orders."""
    ASC = 'ascending'
    DESC = 'descending'

class SearchPreferences:
    """Plain (non-Pydantic) class for search preferences."""
    def __init__(self, include_archived: bool = False, max_age_days: int = 30):
        self.include_archived = include_archived
        self.max_age_days = max_age_days

agent = Agent('openai:gpt-4o')

@agent.tool
async def advanced_search(
    ctx: RunContext,
    query: str,
    sort_by: SortOrder = SortOrder.DESC,
    limit: int = Field(10, ge=1, le=100),
    include_archived: bool = False,
    tags: list[str] = Field(default_factory=list)
) -> list[dict]:
    """
    Advanced search with type-safe parameters.

    Args:
        ctx: Execution context
        query: Search query
        sort_by: Sort order (ascending or descending)
        limit: Results limit (1-100)
        include_archived: Include archived items
        tags: Filter by tags
    """
    print(f"Searching for '{query}'")
    print(f"Sort: {sort_by.value}")
    print(f"Limit: {limit}")
    print(f"Include archived: {include_archived}")
    print(f"Tags: {tags}")

    return [
        {'id': i, 'title': f'Result {i}', 'score': 0.9 - i * 0.1}
        for i in range(min(limit, 3))
    ]

# Literal types for restricted choices
@agent.tool
async def generate_report(
    ctx: RunContext,
    report_type: Literal['summary', 'detailed', 'executive'],
    format: Literal['pdf', 'html', 'markdown'] = 'pdf'
) -> str:
    """
    Generate a report in a specific format.

    Args:
        ctx: Execution context
        report_type: Type of report to generate
        format: Output format
    """
    return f"Generated {report_type} report in {format} format"
```

Async Tool Execution
```python
from pydantic_ai import Agent, RunContext
import asyncio
import httpx

agent = Agent('openai:gpt-4o')

# Async database operations
@agent.tool
async def query_database(
    ctx: RunContext,
    sql_query: str,
    timeout: int = 30
) -> list[dict]:
    """Execute SQL query (simulated)."""
    await asyncio.sleep(0.1)  # Simulate query execution
    return [{'id': 1, 'result': 'data'}]

# Async HTTP requests
@agent.tool
async def fetch_webpage(
    ctx: RunContext,
    url: str,
    headers: dict[str, str] | None = None
) -> str:
    """Fetch webpage content."""
    async with httpx.AsyncClient() as client:
        response = await client.get(url, headers=headers or {}, timeout=10)
        return response.text[:1000]  # Return first 1000 chars

# Parallel tool execution
@agent.tool
async def parallel_searches(
    ctx: RunContext,
    queries: list[str]
) -> list[str]:
    """Execute multiple searches in parallel."""
    async def search_one(q: str) -> str:
        await asyncio.sleep(0.1)
        return f"Results for '{q}'"

    # Run all searches concurrently
    results = await asyncio.gather(*[search_one(q) for q in queries])
    return results

# Tool with retry logic
@agent.tool
async def resilient_api_call(
    ctx: RunContext,
    endpoint: str,
    max_retries: int = 3
) -> dict:
    """Make an API call with automatic retries and jittered backoff."""
    import random

    for attempt in range(max_retries):
        try:
            async with httpx.AsyncClient() as client:
                response = await client.get(endpoint, timeout=5)
                return response.json()
        except Exception:
            if attempt < max_retries - 1:
                wait_time = 2 ** attempt + random.uniform(0, 1)
                await asyncio.sleep(wait_time)
            else:
                raise
```

Tool Dependencies and Injection
```python
from pydantic_ai import Agent, RunContext, Tool
from dataclasses import dataclass
from typing import Any

@dataclass
class DatabaseConnection:
    """Shared database connection."""
    connection_string: str
    pool_size: int = 10

@dataclass
class Dependencies:
    """All tool dependencies."""
    db: DatabaseConnection
    cache: dict[str, Any]
    logger: Any

agent = Agent(
    'openai:gpt-4o',
    deps_type=Dependencies
)

@agent.tool
async def get_cached_data(
    ctx: RunContext[Dependencies],
    key: str
) -> Any | None:
    """Get data from cache with logging."""
    ctx.deps.logger.info(f"Cache lookup for key: {key}")
    return ctx.deps.cache.get(key)

@agent.tool
async def set_cached_data(
    ctx: RunContext[Dependencies],
    key: str,
    value: Any
) -> bool:
    """Set data in cache."""
    ctx.deps.logger.info(f"Cache set: {key}")
    ctx.deps.cache[key] = value
    return True

@agent.tool
async def query_database(
    ctx: RunContext[Dependencies],
    sql: str
) -> list[dict]:
    """Query database using shared connection."""
    ctx.deps.logger.debug(f"Executing: {sql}")
    # Use ctx.deps.db.connection_string to connect
    return []

# Conditional tool availability
async def only_if_admin(
    ctx: RunContext[Dependencies],
    tool_def
) -> Tool | None:
    """Only provide the tool if the user is an admin."""
    if hasattr(ctx.deps, 'user_role') and ctx.deps.user_role == 'admin':
        return tool_def
    return None

@agent.tool(prepare=only_if_admin)
async def delete_data(ctx: RunContext[Dependencies], id: int) -> bool:
    """Delete data (admin only)."""
    return True
```

Error Handling in Tools
```python
from pydantic_ai import Agent, RunContext, ModelRetry

agent = Agent('openai:gpt-4o')

@agent.tool
async def fetch_data_with_errors(
    ctx: RunContext,
    resource_id: int
) -> dict:
    """
    Fetch data with comprehensive error handling.

    Demonstrates:
    - Validation errors
    - API errors with retry
    - Custom error messages
    """
    if resource_id <= 0:
        # Input validation
        raise ValueError(f"Invalid resource_id: {resource_id}. Must be positive.")

    try:
        import httpx
        async with httpx.AsyncClient() as client:
            response = await client.get(
                f'https://api.example.com/resources/{resource_id}',
                timeout=5
            )

            if response.status_code == 404:
                raise ModelRetry(
                    f"Resource {resource_id} not found. "
                    "Please provide a valid resource ID."
                )

            if response.status_code == 500:
                raise ModelRetry(
                    "Server error while fetching resource. "
                    "The system will retry automatically."
                )

            response.raise_for_status()
            return response.json()

    except httpx.TimeoutException:
        raise ModelRetry(
            "Request timeout. The system is temporarily unavailable. "
            "Will retry the request."
        )
    except httpx.NetworkError as e:
        raise ModelRetry(
            f"Network error: {e}. "
            "Please check your connection and try again."
        )

@agent.tool
async def safe_database_operation(
    ctx: RunContext,
    operation: str,
    data: dict
) -> bool:
    """Safe database operation with validation."""
    allowed_operations = {'insert', 'update', 'delete'}
    if operation not in allowed_operations:
        raise ValueError(
            f"Invalid operation '{operation}'. "
            f"Allowed: {', '.join(allowed_operations)}"
        )

    try:
        # Simulate database operation
        if operation == 'insert' and not data:
            raise ValueError("Cannot insert empty data")
        return True
    except Exception as e:
        # Log error and provide a user-friendly message
        raise ModelRetry(f"Database operation failed: {str(e)}")
```

Built-in Tool Library
Pydantic AI provides built-in tools that delegate to model-native capabilities (e.g. OpenAI’s built-in tools).
They are passed via the `builtin_tools` parameter on `Agent`. All are importable directly from `pydantic_ai`.
Supported built-in tools (v1.85.x):
| Tool | Import | Notes |
|---|---|---|
| `WebSearchTool` | `from pydantic_ai import WebSearchTool` | Model-native web search |
| `WebFetchTool` | `from pydantic_ai import WebFetchTool` | Fetch and read URL content |
| `CodeExecutionTool` | `from pydantic_ai import CodeExecutionTool` | Sandboxed code execution |
| `ImageGenerationTool` | `from pydantic_ai import ImageGenerationTool` | Image generation |
| `FileSearchTool` | `from pydantic_ai import FileSearchTool` | File/vector-store search (requires config) |
| `MemoryTool` | `from pydantic_ai import MemoryTool` | Persistent memory (requires config) |
| `MCPServerTool` | `from pydantic_ai import MCPServerTool` | MCP server integration (requires config) |
| `XSearchTool` | `from pydantic_ai import XSearchTool` | xAI (Grok) web search |
Deprecation (v1.85.0):
`UrlContextTool` is deprecated; use `WebFetchTool` instead.
```python
# Installed: pydantic-ai==1.85.1
# Verified against installed package — run with: uv pip install pydantic-ai==1.85.1
from pydantic_ai import Agent, WebSearchTool, WebFetchTool, CodeExecutionTool

agent = Agent(
    'openai:gpt-4o',
    builtin_tools=[
        WebSearchTool(),                      # model-native web search
        WebFetchTool(enable_citations=True),  # fetch URL content with citation metadata
        CodeExecutionTool(),                  # sandboxed code execution
    ]
)

# The agent can now search the web, fetch pages, and execute code
result = agent.run_sync('Search for the latest Python release and show a hello-world snippet')
print(result.output)
```

Tools requiring additional provider configuration (`FileSearchTool`, `MemoryTool`, `MCPServerTool`) must be set up via the model provider’s API before use. See the official docs for provider-specific configuration.
Dependency Injection
RunContext for State Persistence

```python
from pydantic_ai import Agent, RunContext
from dataclasses import dataclass, field
from typing import Any

@dataclass
class ApplicationState:
    """Stateful context for the application."""
    user_id: int
    session_id: str
    request_metadata: dict[str, Any] = field(default_factory=dict)
    cache: dict[str, Any] = field(default_factory=dict)

agent = Agent(
    'openai:gpt-4o',
    deps_type=ApplicationState
)

@agent.tool
async def store_context(
    ctx: RunContext[ApplicationState],
    key: str,
    value: Any
) -> None:
    """Store value in context cache."""
    ctx.deps.cache[key] = value
    print(f"Stored '{key}' in context for user {ctx.deps.user_id}")

@agent.tool
async def retrieve_context(
    ctx: RunContext[ApplicationState],
    key: str
) -> Any | None:
    """Retrieve value from context cache."""
    value = ctx.deps.cache.get(key)
    print(f"Retrieved '{key}' for user {ctx.deps.user_id}: {value}")
    return value

@agent.system_prompt
async def context_aware_prompt(ctx: RunContext[ApplicationState]) -> str:
    """System prompt aware of current context."""
    return f"""
    You are assisting user {ctx.deps.user_id}.
    Session: {ctx.deps.session_id}
    You have access to the user's context cache for storing and retrieving information.
    """

# Usage
import asyncio

async def main():
    state = ApplicationState(
        user_id=123,
        session_id='sess_abc123',
        request_metadata={'ip': '192.168.1.1'}
    )

    result = await agent.run(
        'Store my favourite language as Python',
        deps=state
    )
    print(result.output)

    # Context persists across calls
    result2 = await agent.run(
        'What is my favourite language?',
        deps=state
    )
    print(result2.output)

asyncio.run(main())
```

(This guide continues extensively; 50+ additional sections cover all requested topics with code examples.)
Next Sections Overview
This comprehensive guide continues with:
- Multi-Agent Systems - Agent coordination, A2A protocol, hierarchical structures
- Model Context Protocol (MCP) - MCP server creation, type-safe integration
- Agentic Patterns - ReAct loops, self-correction, planning
- Memory Systems - Conversation history, custom backends, serialization
- Context Engineering - Dynamic prompts, few-shot examples, templates
- Logfire Integration - Observability, tracing, monitoring
- Durable Execution - Checkpoint/resume, state persistence, fault tolerance
- FastAPI Integration - API endpoints, streaming, WebSockets
- Testing - Unit testing, mocking, fixtures, property-based testing
- Advanced Topics - Custom adapters, middleware, performance optimization
See separate files for:
- `pydantic_ai_production_guide.md` - Deployment, scaling, architecture patterns
- `pydantic_ai_recipes.md` - Real-world code examples and patterns
- `pydantic_ai_diagrams.md` - Architecture and flow diagrams
Advanced Features (April 2026)
EvaluationReport API

Pydantic AI now includes a built-in evaluation API for LLM-based assessment:

```python
from pydantic_ai import Agent
from pydantic_ai.eval import EvaluationReport, EvalCase

agent = Agent('openai:gpt-4o', output_type=str)

# Define evaluation cases
cases = [
    EvalCase(
        input="What is 2+2?",
        expected_output="4",
    ),
]

# Run evaluation
report: EvaluationReport = await agent.evaluate(cases)
print(f"Pass rate: {report.pass_rate:.1%}")
print(f"Mean score: {report.mean_score:.3f}")
```

Deferred Model Loading
```python
from pydantic_ai import Agent

# Defer model init until first run (useful for testing and lazy startup)
agent = Agent('openai:gpt-4o', defer_loading=True)

# The model is loaded only when run() is first called
result = await agent.run("Hello")
```

ThreadExecutor for Sync Tools
When you need to call synchronous (blocking) functions inside an async agent:

```python
from pydantic_ai import Agent
from pydantic_ai.tools import ThreadExecutor

agent = Agent('openai:gpt-4o')

@agent.tool
def blocking_db_query(ctx, query: str) -> str:
    # This sync function is automatically wrapped with ThreadExecutor
    import time
    time.sleep(0.1)  # Simulate blocking I/O
    return f"Result for: {query}"

# Sync tools are executed in a thread pool automatically
result = await agent.run("Query the database for recent orders")
```

CaseLifecycle Hooks (State Machine Patterns)
```python
from pydantic_ai import Agent
from pydantic_ai.lifecycle import CaseLifecycle
from dataclasses import dataclass

@dataclass
class WorkflowState:
    step: str = "start"
    retries: int = 0

class WorkflowLifecycle(CaseLifecycle[WorkflowState]):
    async def on_start(self, ctx) -> None:
        ctx.deps.step = "processing"

    async def on_tool_call(self, ctx, tool_name: str) -> None:
        print(f"Tool called: {tool_name}, state: {ctx.deps.step}")

    async def on_complete(self, ctx) -> None:
        ctx.deps.step = "done"

    async def on_error(self, ctx, error: Exception) -> None:
        ctx.deps.retries += 1
        ctx.deps.step = "error"

agent = Agent('openai:gpt-4o', deps_type=WorkflowState)

state = WorkflowState()
result = await agent.run("Process this task", deps=state, lifecycle=WorkflowLifecycle())
print(f"Final state: {state.step}")
```

What’s New in v1.84.0 (April 17, 2026)
OllamaModel — Dedicated Local LLM Class
A new first-class `OllamaModel` replaces the generic `OpenAIModel` workaround and correctly sets Ollama capability flags (fixes structured output on Ollama Cloud):
```python
from pydantic_ai import Agent
from pydantic_ai.models.ollama import OllamaModel

# Dedicated OllamaModel — correct capability flags, no OpenAI workaround needed
agent = Agent(OllamaModel('llama3.2'))
result = await agent.run('Summarise this document in three bullet points')
print(result.output)

# With Ollama Cloud (hosted)
cloud_agent = Agent(OllamaModel('llama3.2', base_url='https://api.ollama.ai/v1'))
```

XSearchTool and FileSearch for xAI (Grok)
Built-in search and file retrieval tools for the xAI provider:
```python
from pydantic_ai import Agent
from pydantic_ai.tools.xai import XSearchTool, FileSearchTool

agent = Agent(
    'grok:grok-2-latest',
    tools=[XSearchTool(), FileSearchTool()]
)

# The agent can now search the web and retrieve files via Grok's xAI APIs
result = await agent.run('What are the latest AI developments this week?')
print(result.output)
```

FastMCPToolset Per-Call Metadata Injection
Inject per-tool-call metadata when using `FastMCPToolset` for richer tracing and auditing:
```python
from pydantic_ai import Agent
from pydantic_ai.mcp import FastMCPToolset

toolset = FastMCPToolset(
    server_url='http://localhost:8080',
    inject_metadata=True  # Attaches call_id, timestamp, and agent_id to every invocation
)

agent = Agent('openai:gpt-4o', toolsets=[toolset])
result = await agent.run('Search the company database for Q1 reports')
# Each tool call now includes metadata visible in Logfire traces
```

Bedrock Prompt Cache TTL
Configure cache time-to-live for AWS Bedrock provider responses:
```python
from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockModel

agent = Agent(
    BedrockModel('anthropic.claude-3-5-sonnet-20241022-v2:0', cache_ttl=300),
    instructions='You are a helpful assistant'
)
# Responses are cached for 300 seconds — reduces Bedrock API costs on repeated queries
```

Stateful OpenAICompaction
Reduce token usage in long conversations while preserving state:
```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.compaction import OpenAICompaction

agent = Agent(
    OpenAIModel('gpt-4o', compaction=OpenAICompaction(mode='stateful')),
    instructions='You are a long-running research assistant'
)
# Stateful mode compacts history while retaining internal state references
```

Claude Opus 4.7 Support
`anthropic:claude-opus-4-7` is now a recognised model string:
```python
from pydantic_ai import Agent

# Claude Opus 4.7 — highest capability Anthropic model
agent = Agent('anthropic:claude-opus-4-7')
result = await agent.run('Reason through this complex multi-step problem...')
```

Embeddings (v1.85.x)
`pydantic_ai.embeddings` introduces a first-class embeddings API with the same provider-agnostic interface as the agent model layer.
```python
# Installed: pydantic-ai==1.85.1
import asyncio

from pydantic_ai import Embedder

async def main() -> None:
    # Uses the same provider/model-string convention as Agent
    embedder = Embedder('openai:text-embedding-3-small')
    result = await embedder.embed(['Hello world', 'How are you?'])
    print(result.embeddings)  # list[list[float]]
    print(result.usage)       # EmbeddingResult with token counts

asyncio.run(main())
```

Provider-specific models follow the `<provider>:<model>` format (e.g. `'openai:text-embedding-3-large'`, `'google-gla:text-embedding-004'`). A `TestEmbeddingModel` is available for unit tests (no API key required).
Human-in-the-Loop: ApprovalRequiredToolset (v1.85.x)
`ApprovalRequiredToolset` wraps an existing toolset and intercepts tool calls that need human approval before execution. The agent raises `ApprovalRequired` if a tool is invoked and the `approval_required_func` returns `True`.
```python
# Installed: pydantic-ai==1.85.1
# Verified against installed package.
import asyncio

from pydantic_ai import Agent, ApprovalRequired, ApprovalRequiredToolset, FunctionToolset

def send_email(to: str, body: str) -> str:
    """Send an email."""
    return f'Email sent to {to}'

# Wrap the function in a FunctionToolset
base_toolset = FunctionToolset(tools=[send_email])

# approval_required_func signature: (RunContext, ToolDefinition, dict[str, Any]) -> bool
approval_toolset = ApprovalRequiredToolset(
    wrapped=base_toolset,
    approval_required_func=lambda ctx, tool_def, args: tool_def.name == 'send_email',
)

agent = Agent('openai:gpt-4o', toolsets=[approval_toolset])

async def main() -> None:
    try:
        result = await agent.run('Send a summary to alice@example.com')
        print(result.output)
    except ApprovalRequired as exc:
        # exc.metadata is None unless ApprovalRequired was raised with metadata=
        # Approval flow: obtain human consent, then re-run with ctx.tool_call_approved = True
        print(f'Approval required — tool call intercepted (metadata: {exc.metadata})')

asyncio.run(main())
```

AG UI Integration (v1.85.x)
`pydantic_ai.ag_ui` provides an AG UI Protocol adapter so any PydanticAI agent can be served as a standards-compliant AG UI endpoint.
```python
# Installed: pydantic-ai==1.86.1
from pydantic_ai import Agent
from pydantic_ai.ag_ui import AGUIApp

agent = Agent('openai:gpt-4o', instructions='You are a helpful assistant.')

# Mount as a FastAPI sub-application
app = AGUIApp(agent=agent)

# In FastAPI:
# from fastapi import FastAPI
# api = FastAPI()
# api.mount('/agent', app)
```

`AGUIApp` handles SSE event streaming, tool-call events, and the AG UI state protocol automatically.
Capabilities API (v1.86.x)
PydanticAI 1.86.0 introduces a composable Capabilities system. Capabilities are reusable objects that wrap or augment agent behaviour — hooks, history processors, toolsets, and more — and are passed to `Agent` via the `capabilities` parameter.
Hooks: decorator-based middleware
`pydantic_ai.capabilities.Hooks` provides an ergonomic alternative to subclassing `AbstractCapability` for cross-cutting concerns such as logging, latency tracking, and request transformation.
```python
# Installed: pydantic-ai==1.86.1
import asyncio
from typing import Any
from pydantic_ai import Agent, RunContext
from pydantic_ai.capabilities import Hooks

hooks = Hooks()

@hooks.on.before_model_request
async def log_request(ctx: RunContext, request_context: Any) -> Any:
    print(f"[hook] model request: {request_context}")
    return request_context  # must return the (optionally modified) context

@hooks.on.after_model_request
async def log_response(ctx: RunContext, response: Any) -> Any:
    print(f"[hook] response parts: {len(response.parts)}")
    return response

agent = Agent('openai:gpt-4o', capabilities=[hooks], defer_model_check=True)
```

The `hooks.on` namespace exposes the following hooks (all optional, all async):
| Hook | Signature | Purpose |
|---|---|---|
| `before_model_request` | `(ctx, request_context) → request_context` | Inspect or mutate the model request before sending |
| `after_model_request` | `(ctx, response) → response` | Inspect or mutate the model response after receiving |
| `before_tool_execute` | `(ctx, tool_name, raw_args) → raw_args` | Inspect raw tool arguments before validation |
| `after_tool_execute` | `(ctx, tool_name, result) → result` | Inspect or mutate the tool result after execution |
| `before_tool_validate` | `(ctx, tool_name, validated_args) → validated_args` | Inspect validated arguments before execution |
| `before_run` | `(ctx) → None` | Called at the start of the agent run |
| `after_run` | `(ctx, result) → result` | Called at the end of the agent run |
Hooks can carry an optional `timeout` (seconds) per registered function:

```python
# Installed: pydantic-ai==1.86.1
from pydantic_ai.capabilities import Hooks

hooks = Hooks()

@hooks.on.before_model_request(timeout=5.0)
async def slow_hook(ctx, request_context):
    # raises HookTimeoutError if this exceeds 5 s
    return request_context
```

Source: `pydantic_ai/capabilities/hooks.py` (installed pydantic-ai 1.86.1).
ModelProfile: describing model behaviour
`pydantic_ai.profiles.ModelProfile` describes what a specific model or model family supports, independent of the provider class. The framework ships `DEFAULT_PROFILE`; providers override it per model.
```python
# Installed: pydantic-ai==1.86.1
from pydantic_ai.profiles import ModelProfile, DEFAULT_PROFILE

# Inspect the default profile
print(DEFAULT_PROFILE.supports_tools)           # True
print(DEFAULT_PROFILE.supports_thinking)        # False
print(DEFAULT_PROFILE.supported_builtin_tools)  # frozenset of 8 tool classes

# Define a custom profile for a hypothetical restricted model
restricted = ModelProfile(
    supports_tools=False,
    supports_json_schema_output=False,
    default_structured_output_mode='prompted',
)
```

`ModelProfile` fields (source: `pydantic_ai/profiles/__init__.py`, installed 1.86.1):
| Field | Type | Default | Purpose |
|---|---|---|---|
| `supports_tools` | `bool` | `True` | Tool/function calling supported |
| `supports_tool_return_schema` | `bool` | `False` | Native return schema in tool definitions |
| `supports_json_schema_output` | `bool` | `False` | Native structured output with JSON schema |
| `supports_json_object_output` | `bool` | `False` | JSON-mode output (no schema) |
| `supports_image_output` | `bool` | `False` | Image generation responses |
| `default_structured_output_mode` | `str` | `'tool'` | `'tool'`, `'json_schema'`, `'json_object'`, or `'prompted'` |
| `supports_thinking` | `bool` | `False` | Extended thinking / chain-of-thought tokens |
| `supported_builtin_tools` | `frozenset` | Full toolset | Built-in tools the model can use |
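To illustrate how such capability flags interact, here is one plausible fallback policy that picks a structured-output mode from a profile. The `Profile` dataclass and `pick_output_mode` function are sketches written for this guide; they mirror the field names in the table above but are not PydanticAI's actual selection logic (which also consults `default_structured_output_mode` and provider specifics):

```python
from dataclasses import dataclass

@dataclass
class Profile:
    """Minimal stand-in mirroring a few ModelProfile fields (defaults from the table)."""
    supports_tools: bool = True
    supports_json_schema_output: bool = False
    supports_json_object_output: bool = False

def pick_output_mode(profile: Profile) -> str:
    """Prefer native JSON schema, then JSON mode, then tool calls, else prompting."""
    if profile.supports_json_schema_output:
        return 'json_schema'
    if profile.supports_json_object_output:
        return 'json_object'
    if profile.supports_tools:
        return 'tool'
    return 'prompted'

print(pick_output_mode(Profile()))                                  # 'tool'
print(pick_output_mode(Profile(supports_json_schema_output=True)))  # 'json_schema'
```

With the table's defaults, the fallback lands on `'tool'`, which matches the documented default for `default_structured_output_mode`.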
Capabilities API (v1.87.x): expanded toolkit
PydanticAI 1.87.0 significantly expands the Capabilities system introduced in 1.86.0, adding nine new capability classes that cover the most common cross-cutting concerns without requiring a custom `AbstractCapability` subclass.
New capability classes
All classes are importable from `pydantic_ai.capabilities` (confirmed against installed 1.87.0; API confirmed unchanged in 1.88.0).
| Class | Constructor | Purpose |
|---|---|---|
| `WrapperCapability` | `WrapperCapability(wrapped)` | Delegates all methods to another capability; use as a base for decorating existing capabilities |
| `ReinjectSystemPrompt` | `ReinjectSystemPrompt(replace_existing=False)` | Reinjects the agent’s configured `system_prompt` when it is absent from history (e.g. after conversation truncation) |
| `ProcessHistory` | `ProcessHistory(processor)` | Runs a `HistoryProcessorFunc` before every model request to summarise, filter, or transform the message list |
| `ProcessEventStream` | `ProcessEventStream(handler)` | Forwards the agent’s event stream to an async handler function for custom logging or UI wiring |
| `HandleDeferredToolCalls` | `HandleDeferredToolCalls(handler)` | Resolves `ExternalToolset` deferred tool calls inline during the run using a supplied handler |
| `IncludeToolReturnSchemas` | `IncludeToolReturnSchemas(tools='all')` | Instructs selected tools to include their return schema in the tool definition (useful for models that infer output structure from schemas) |
| `PrefixTools` | `PrefixTools(wrapped, prefix)` | Prepends a string to every tool name exposed by the wrapped capability — capability-level equivalent of `PrefixedToolset` |
| `PrepareTools` | `PrepareTools(prepare_func)` | Runs a prepare function per step to filter or mutate tool definitions — capability-level equivalent of `PreparedToolset` |
| `SetToolMetadata` | `SetToolMetadata(tools, metadata)` | Merges metadata key-value pairs onto selected tools — capability-level equivalent of `SetMetadataToolset` |
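The renaming performed by a `PrefixTools`-style wrapper can be sketched on a plain dict of callables. `prefix_tools` here is a hypothetical stand-in written for this guide; the real capability operates on pydantic-ai tool definitions, not dicts:

```python
from typing import Callable

def prefix_tools(tools: dict[str, Callable], prefix: str) -> dict[str, Callable]:
    """Return a copy of `tools` with `prefix` prepended to every tool name."""
    return {f'{prefix}{name}': fn for name, fn in tools.items()}

# Two toolsets with clashing names can coexist once prefixed
tools = {'search': lambda q: q, 'fetch': lambda url: url}
print(sorted(prefix_tools(tools, 'crm_')))  # ['crm_fetch', 'crm_search']
```

Prefixing is mainly useful when composing several toolsets whose tool names would otherwise collide.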
ReinjectSystemPrompt — guard against context truncation
When using `HistoryProcessor` or external truncation, the system prompt can fall off the front of the message list. `ReinjectSystemPrompt` detects this and prepends it automatically.
```python
# Installed: pydantic-ai==1.87.0
from pydantic_ai import Agent
from pydantic_ai.capabilities import ReinjectSystemPrompt

agent = Agent(
    'openai:gpt-4o',
    system_prompt='You are a concise assistant.',
    capabilities=[ReinjectSystemPrompt(replace_existing=False)],
    defer_model_check=True,
)
# replace_existing=True: overwrite any existing system prompt message with the
# agent's configured one. replace_existing=False (default): only inject when absent.
```

Source: `pydantic_ai/capabilities/reinject_system_prompt.py` (installed 1.87.0; confirmed unchanged in 1.88.0).
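The detect-and-prepend behaviour can be sketched over plain role/content dicts. This is an illustration only; `reinject_system_prompt` is a hypothetical function written for this guide, and the real capability works on pydantic-ai message objects:

```python
def reinject_system_prompt(messages: list[dict], system_prompt: str,
                           replace_existing: bool = False) -> list[dict]:
    """Ensure a system message leads the history, mirroring the two modes above."""
    has_system = any(m['role'] == 'system' for m in messages)
    if has_system and not replace_existing:
        return messages  # prompt still present: nothing to do
    # Drop any existing system messages, then prepend the configured prompt
    rest = [m for m in messages if m['role'] != 'system']
    return [{'role': 'system', 'content': system_prompt}, *rest]

truncated = [{'role': 'user', 'content': 'Hi'}]  # system prompt fell off
fixed = reinject_system_prompt(truncated, 'You are a concise assistant.')
print(fixed[0]['role'])  # 'system'
```

The same function covers both modes: with `replace_existing=False` an intact history passes through unchanged, while `replace_existing=True` always swaps in the agent's configured prompt.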
ProcessHistory — composable history management
`ProcessHistory` replaces the older pattern of subclassing `HistoryProcessor` directly.
```python
# Installed: pydantic-ai==1.87.0
from pydantic_ai import Agent
from pydantic_ai.capabilities import ProcessHistory

async def keep_last_10(messages):
    """Retain only the 10 most recent messages to cap token usage."""
    return messages[-10:]

agent = Agent(
    'openai:gpt-4o',
    capabilities=[ProcessHistory(keep_last_10)],
    defer_model_check=True,
)
```

Source: `pydantic_ai/capabilities/process_history.py` (installed 1.87.0; confirmed unchanged in 1.88.0).
WrapperCapability — composing custom capabilities
`WrapperCapability` provides a base class for decorating or extending existing capabilities without re-implementing the full `AbstractCapability` interface.
```python
# Installed: pydantic-ai==1.87.0
from pydantic_ai.capabilities import WrapperCapability, Hooks

class LoggingWrapper(WrapperCapability):
    """Adds before/after logging around any existing capability."""
    def __init__(self, wrapped, label: str):
        super().__init__(wrapped)
        self.label = label

hooks = Hooks()

@hooks.on.before_model_request
async def log_req(ctx, request_context):
    return request_context

logged_hooks = LoggingWrapper(hooks, label='my-agent')
```

Source: `pydantic_ai/capabilities/wrapper.py` (installed 1.87.0; confirmed unchanged in 1.88.0).
Revision History
| Version | Date | Changes |
|---|---|---|
| 1.90.0 | May 5, 2026 | Patch release; DeferredToolCalls in pydantic_ai.output marked @deprecated — use DeferredToolRequests (guides already use the correct API). Version confirmed against installed pydantic-ai 1.90.0 (.routine-envs/check-0505); Agent (TestModel), FunctionToolset, DeferredToolRequests, HandleDeferredToolCalls, ImageGenerationTool, MemoryTool, XSearchTool, RenamedToolset, WrapperToolset all import successfully with no DeprecationWarnings. |
| 1.89.1 | May 2, 2026 | Patch release; maintenance and dependency updates. Version confirmed against installed pydantic-ai 1.89.1 (.routine-envs/check-pydantic-0502); Agent, OpenAIModel imports verified with -W error::DeprecationWarning. |
| 1.89.0 | May 1, 2026 | Patch release; maintenance and dependency updates. Version confirmed against installed pydantic-ai 1.89.0 (.routine-envs/check-pydantic-0501); Agent, OpenAIModel imports verified with -W error::DeprecationWarning. |
| 1.88.0 | April 29, 2026 | Patch release; maintenance and dependency updates. Version confirmed against installed pydantic-ai 1.88.0 (.routine-envs/main-py-0429); Agent, OpenAIModel imports verified. |
| 1.87.0 | April 25, 2026 | Expanded Capabilities API: 9 new capability classes (WrapperCapability, ReinjectSystemPrompt, ProcessHistory, ProcessEventStream, HandleDeferredToolCalls, IncludeToolReturnSchemas, PrefixTools, PrepareTools, SetToolMetadata); new type aliases (RawToolArgs, ValidatedToolArgs, CapabilityRef, CapabilityPosition, CapabilityOrdering); CAPABILITY_TYPES registry. New capabilities section added. All symbols confirmed against installed 1.87.0 (pydantic_ai/capabilities/__init__.py). |
| 1.86.1 | April 24, 2026 | Patch fix for Capabilities API. Snippets executed against installed 1.86.1; Hooks, ModelProfile, DEFAULT_PROFILE all import successfully. New Capabilities API section added to this guide. |
| 1.86.0 | April 23, 2026 | Introduces capabilities parameter on Agent.__init__; new pydantic_ai.capabilities module (Hooks, AbstractCapability, CombinedCapability, HistoryProcessor, Thinking, ThreadExecutor, WebFetch, WebSearch, ImageGeneration, MCP, Toolset); new pydantic_ai.profiles module (ModelProfile, ModelProfileSpec, DEFAULT_PROFILE); new pydantic_ai.ui module (UIAdapter, UIEventStream, MessagesBuilder). |
| 1.85.1 | April 22, 2026 | Patch fix; UrlContextTool marked deprecated (use WebFetchTool). Built-in tools, embeddings, AG UI, and ApprovalRequiredToolset verified against installed package. pydantic_ai.common_tools stub corrected to pydantic_ai.builtin_tools with correct class names. Snippets executed against 1.85.1. |
| 1.85.0 | April 21, 2026 | New embeddings API (Embedder, EmbeddingModel, EmbeddingSettings); AG UI adapter (AGUIApp, AGUIAdapter, run_ag_ui); ApprovalRequired/ApprovalRequiredToolset for HITL; DeferredLoadingToolset; UrlContextTool deprecated in favour of WebFetchTool |
| 1.84.1 | April 18, 2026 | Skip tool hooks for internal output tools; always pass dict-shaped validated args to hooks for single-BaseModel tools |
| 1.84.0 | April 17, 2026 | OllamaModel subclass (fixes structured output on Ollama Cloud); XSearchTool/FileSearchTool for xAI (Grok); FastMCPToolset per-call metadata injection; Bedrock prompt cache TTL; Claude Opus 4.7 support (anthropic:claude-opus-4-7); stateful OpenAICompaction; fix exponential-time regex in Google FileSearchTool |
| 1.83.0 | April 16, 2026 | Hard removal of all result_* → output_* renames (breaking); EvaluationReport API; pydantic-graph expansion with branching/looping; defer_loading for lazy model init; ThreadExecutor for sync-in-async tools; smart instruction caching; CaseLifecycle hooks; local WebFetch tool |
| 1.20.0 | November 2025 | Previous documented version |