azure-ai-agents Integration Add-on (Python) — Class Deep Dives Vol. 6
Note: `azure-ai-agents` is an optional integration add-on for the Azure AI Agents service, not a replacement for `agent-framework`. See the integration overview for when to use it alongside the framework.
Package: azure-ai-agents (integration add-on)
Version covered: 1.1.0
Verified against: installed package at /usr/local/lib/python3.11/dist-packages/azure/ai/agents/
This is the sixth volume of source-verified class deep dives for the azure-ai-agents integration add-on. Earlier volumes covered the primary client, tool classes, orchestration patterns, data models, and streaming plumbing. This volume covers advanced configuration — the classes you reach for when you need fine-grained control: OpenAPI authentication strategies, Bing search tuning, run cost accounting, enterprise data sources, vector store lifecycle, forced tool selection, file-search result inspection, streaming event semantics, manual tool-call dispatch, and multimodal message construction.
Earlier volumes:
- Vol. 1: `AgentsClient`, `FunctionTool`, `ToolSet`, `CodeInterpreterTool`, `FileSearchTool`, `BingGroundingTool`, `ConnectedAgentTool`, `AgentEventHandler`, `ThreadMessage`, `OpenApiTool`
- Vol. 3: `AsyncFunctionTool`, `AzureFunctionTool`, `AzureAISearchTool`, `VectorStore`, `ThreadRun`, `RunStep`, `ResponseFormatJsonSchema`, `TruncationObject`, `MessageAttachment`, `AsyncAgentEventHandler`
- Vol. 4: `AgentsClient` auto function calls, `FunctionTool` dynamic registration, `CodeInterpreterTool` file upload, `FileSearchTool` + `VectorStore` lifecycle, `AzureAISearchTool` query modes, `BingGroundingTool` params, `ConnectedAgentTool` multi-agent, `AsyncToolSet`
- Vol. 5: `Agent` model, `AgentThread`, `ToolOutput`, `VectorStoreFileBatch`, `VectorStoreFile`, `FileInfo`, `MessageDeltaChunk`, `RunStepDeltaChunk`, `SubmitToolOutputsAction`, `AgentRunStream`
Table of Contents
1. OpenAPI auth hierarchy — anonymous, managed-identity, connection
2. `BingGroundingSearchConfiguration` + `BingGroundingSearchToolParameters` — fine-grained web search
3. `RunCompletionUsage` + `RunStepCompletionUsage` — cost accounting per run and per step
4. `VectorStoreDataSource` + `VectorStoreConfigurations` — enterprise Azure asset sources
5. Vector store lifecycle — expiry policy and chunking strategy
6. `AgentsNamedToolChoice` + `AgentsToolChoiceOptionMode` — forcing specific tools
7. File-search run step results — `RunStepFileSearchToolCall` through `FileSearchToolCallContent`
8. Streaming event taxonomy — `AgentStreamEvent` and the four typed sub-enums
9. `RequiredFunctionToolCall` + `SubmitToolOutputsDetails` — manual tool-call dispatch
10. Multimodal message input — `ThreadMessageOptions`, `MessageInputImageFileBlock`, `MessageInputImageUrlBlock`
1. OpenAPI auth hierarchy — anonymous, managed-identity, connection
Source: azure/ai/agents/models/_models.py
OpenApiTool (covered in Vol. 1) lets an agent call any REST API described by an OpenAPI spec. Authentication is configured via an OpenApiAuthDetails subclass selected using a discriminator field type. Three concrete subclasses exist.
Class signatures

```python
class OpenApiAuthType(str, Enum):
    ANONYMOUS = "anonymous"                # no auth
    CONNECTION = "connection"              # named AI Foundry connection
    MANAGED_IDENTITY = "managed_identity"  # Azure managed identity

class OpenApiAnonymousAuthDetails(OpenApiAuthDetails):
    # No extra fields; the discriminator sets type = "anonymous" automatically
    type: Literal[OpenApiAuthType.ANONYMOUS]

class OpenApiManagedSecurityScheme(_Model):
    audience: str  # AAD scope, e.g. "https://management.azure.com/"

class OpenApiManagedAuthDetails(OpenApiAuthDetails):
    type: Literal[OpenApiAuthType.MANAGED_IDENTITY]
    security_scheme: OpenApiManagedSecurityScheme

class OpenApiConnectionSecurityScheme(_Model):
    connection_id: str  # AI Foundry connection name / ID

class OpenApiConnectionAuthDetails(OpenApiAuthDetails):
    type: Literal[OpenApiAuthType.CONNECTION]
    security_scheme: OpenApiConnectionSecurityScheme
```

Key points
- `OpenApiAnonymousAuthDetails()` takes no arguments. Instantiate it and pass it as the `auth` argument (the examples below pass it to `OpenApiTool`).
- `OpenApiManagedAuthDetails` requires a `security_scheme` whose `audience` string is the AAD resource URI your function app or API is registered under. The managed identity of the Azure AI Agents service must be granted the appropriate role on that resource.
- `OpenApiConnectionAuthDetails` references a connection you created in Azure AI Foundry (Project → Connections). The `connection_id` is the connection name shown in the portal.
- `default_params` on `OpenApiFunctionDefinition` lets you specify parameter names that will be filled from a user-provided defaults dict at call time rather than generated by the model. This is useful for tenant-specific values you do not want in the prompt.
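To make the discriminator idea concrete, here is a minimal sketch that builds plain-dict stand-ins for the three auth shapes. It does not import the SDK: `build_auth_payload` is a hypothetical helper, and the dict layout simply mirrors the field names in the signatures above (`type`, `security_scheme`, `audience`, `connection_id`). Treat the exact wire format as an assumption.

```python
from typing import Any, Dict, Optional


def build_auth_payload(
    auth_type: str,
    *,
    audience: Optional[str] = None,
    connection_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: return a dict stand-in for one auth shape."""
    if auth_type == "anonymous":
        # Anonymous auth carries the discriminator and nothing else.
        return {"type": "anonymous"}
    if auth_type == "managed_identity":
        if not audience:
            raise ValueError("managed_identity auth requires an audience")
        return {"type": "managed_identity", "security_scheme": {"audience": audience}}
    if auth_type == "connection":
        if not connection_id:
            raise ValueError("connection auth requires a connection_id")
        return {"type": "connection", "security_scheme": {"connection_id": connection_id}}
    raise ValueError(f"unknown auth type: {auth_type!r}")


print(build_auth_payload("anonymous"))
print(build_auth_payload("managed_identity", audience="api://my-function-app-client-id"))
print(build_auth_payload("connection", connection_id="salesforce-prod"))
```

The point of the sketch is that the `type` value alone decides which extra fields are legal, which is why a missing `audience` or `connection_id` is caught before anything is sent.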
Example 1: public REST API with no authentication

```python
import os

from azure.ai.agents import AgentsClient
from azure.ai.agents.models import OpenApiTool, OpenApiAnonymousAuthDetails
from azure.identity import DefaultAzureCredential

client = AgentsClient(
    endpoint=os.environ["AZURE_AI_AGENTS_ENDPOINT"],
    credential=DefaultAzureCredential(),
)

# Minimal OpenAPI 3.0 spec for a public weather API
weather_spec = {
    "openapi": "3.0.0",
    "info": {"title": "Weather API", "version": "1.0"},
    "paths": {
        "/current": {
            "get": {
                "operationId": "get_current_weather",
                "summary": "Get current weather for a city",
                "parameters": [
                    {
                        "name": "city",
                        "in": "query",
                        "required": True,
                        "schema": {"type": "string"},
                        "description": "City name, e.g. 'London'",
                    }
                ],
                "responses": {"200": {"description": "Weather data"}},
            }
        }
    },
    "servers": [{"url": "https://wttr.in"}],
}

weather_tool = OpenApiTool(
    name="weather",
    spec=weather_spec,
    description="Get current weather for any city.",
    auth=OpenApiAnonymousAuthDetails(),  # no auth needed
)

agent = client.create_agent(
    model="gpt-4o",
    name="WeatherBot",
    instructions="Answer weather questions using the weather tool.",
    tools=weather_tool.definitions,
)

thread = client.threads.create()
client.messages.create(
    thread_id=thread.id,
    role="user",
    content="What's the weather in Tokyo right now?",
)
run = client.runs.create_and_process(thread_id=thread.id, agent_id=agent.id)

for msg in client.messages.list(thread_id=thread.id):
    if msg.role == "assistant":
        print(msg.content[0].text.value)

client.delete_agent(agent.id)
```

Example 2: internal Azure API with managed-identity auth

```python
from azure.ai.agents.models import (
    OpenApiTool,
    OpenApiManagedAuthDetails,
    OpenApiManagedSecurityScheme,
)

# Internal inventory API deployed as an Azure Function App
inventory_spec = {
    "openapi": "3.0.0",
    "info": {"title": "Inventory API", "version": "1.0"},
    "paths": {
        "/stock/{sku}": {
            "get": {
                "operationId": "get_stock_level",
                "summary": "Return current stock for a SKU",
                "parameters": [
                    {
                        "name": "sku",
                        "in": "path",
                        "required": True,
                        "schema": {"type": "string"},
                    }
                ],
                "responses": {"200": {"description": "Stock level"}},
            }
        }
    },
    "servers": [{"url": "https://myapp.azurewebsites.net/api"}],
}

# The audience is the App Registration client ID (or resource URI) for the Function App
auth = OpenApiManagedAuthDetails(
    security_scheme=OpenApiManagedSecurityScheme(
        audience="api://my-function-app-client-id"
    )
)

inventory_tool = OpenApiTool(
    name="inventory",
    spec=inventory_spec,
    description="Check stock levels for product SKUs.",
    auth=auth,
)
```

Example 3: third-party API accessed via an AI Foundry connection

```python
from azure.ai.agents.models import (
    OpenApiTool,
    OpenApiConnectionAuthDetails,
    OpenApiConnectionSecurityScheme,
)

# Connection "salesforce-prod" was created in AI Foundry Project → Connections
# and stores the OAuth credentials securely.
# salesforce_spec is your OpenAPI spec as a dict, defined elsewhere.
salesforce_tool = OpenApiTool(
    name="salesforce",
    spec=salesforce_spec,
    description="Query Salesforce CRM records.",
    auth=OpenApiConnectionAuthDetails(
        security_scheme=OpenApiConnectionSecurityScheme(
            connection_id="salesforce-prod"
        )
    ),
)
```

Auth strategy comparison
| Class | When to use | Azure requirement |
|---|---|---|
| `OpenApiAnonymousAuthDetails` | Public APIs, internal APIs on private VNet | None |
| `OpenApiManagedAuthDetails` | Azure-hosted APIs protected by AAD | Assign role on target resource to the agent's managed identity |
| `OpenApiConnectionAuthDetails` | Third-party APIs (OAuth, API key) via Foundry | Create a connection in Azure AI Foundry |
2. BingGroundingSearchConfiguration + BingGroundingSearchToolParameters — fine-grained web search
Source: azure/ai/agents/models/_models.py
BingGroundingTool (Vol. 1) connects an agent to live web search. The parameters controlling how that search runs — locale, result count, time filter — live in BingGroundingSearchConfiguration. The container that holds one or more configurations is BingGroundingSearchToolParameters.
Signatures

```python
class BingGroundingSearchConfiguration(_Model):
    connection_id: str        # Bing connection in AI Foundry (required)
    market: Optional[str]     # BCP-47 locale, e.g. "en-GB", "de-DE"
    set_lang: Optional[str]   # UI string language for Bing API, e.g. "en"
    count: Optional[int]      # number of search results to return
    freshness: Optional[str]  # "Day" | "Week" | "Month" | "YYYY-MM-DD..YYYY-MM-DD"

class BingGroundingSearchToolParameters(_Model):
    search_configurations: List[BingGroundingSearchConfiguration]
    # Maximum 1 configuration per tool instance
```

Key points
- `market` controls which regional index Bing uses: `"en-US"` for the US index, `"en-GB"` for UK, `"ja-JP"` for Japan, and so on. This affects which news sources and pages rank highly.
- `freshness` accepts either a named window (`"Day"`, `"Week"`, `"Month"`) or an explicit date range in the form `YYYY-MM-DD..YYYY-MM-DD` (for example `"2026-01-01..2026-05-31"`). Use this to restrict the agent to recent news or to a specific historical period.
- `count` caps how many search result chunks Bing returns. Fewer results reduce context length and cost; more results improve coverage. Default is unset (Bing's own default).
- `BingGroundingSearchToolParameters.search_configurations` accepts a list, but the service currently enforces a maximum of one configuration per tool.
- `BingGroundingTool` exposes these parameters via its `bing_grounding` property. You do not construct `BingGroundingSearchToolParameters` directly in normal use; it is populated from `BingGroundingTool`'s constructor arguments.
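Because `freshness` is a free-form string, a typo only surfaces when the search runs. The sketch below is one way to check a value locally before building the tool. `is_valid_freshness` is a hypothetical helper, not part of the SDK, and it only encodes the two forms described above (it is deliberately strict about case and does not try to mirror everything Bing itself might accept).

```python
import re
from datetime import date

NAMED_WINDOWS = {"Day", "Week", "Month"}
# Two calendar dates separated by "..", e.g. "2026-01-01..2026-03-31"
_RANGE = re.compile(r"^(\d{4}-\d{2}-\d{2})\.\.(\d{4}-\d{2}-\d{2})$")


def is_valid_freshness(value: str) -> bool:
    """Return True if value is a named window or a well-formed date range."""
    if value in NAMED_WINDOWS:
        return True
    match = _RANGE.match(value)
    if match is None:
        return False
    try:
        start = date.fromisoformat(match.group(1))
        end = date.fromisoformat(match.group(2))
    except ValueError:
        # Matches the pattern but is not a real calendar date (e.g. month 13).
        return False
    return start <= end


for candidate in ["Day", "2026-01-01..2026-03-31", "2026-03-31..2026-01-01", "yesterday"]:
    print(candidate, is_valid_freshness(candidate))
```

Rejecting a reversed range locally is a design choice of this sketch; the source does not say how the service treats one.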
Example 1: restrict search to the last 24 hours for breaking news

```python
import os

from azure.ai.agents import AgentsClient
from azure.ai.agents.models import BingGroundingTool
from azure.identity import DefaultAzureCredential

client = AgentsClient(
    endpoint=os.environ["AZURE_AI_AGENTS_ENDPOINT"],
    credential=DefaultAzureCredential(),
)

bing_connection_id = os.environ["BING_CONNECTION_ID"]

# BingGroundingTool's constructor maps directly onto BingGroundingSearchConfiguration fields
news_tool = BingGroundingTool(
    connection_id=bing_connection_id,
    market="en-GB",    # UK Bing index
    count=5,           # limit to 5 results to keep context compact
    freshness="Day",   # only results from the last 24 hours
)

agent = client.create_agent(
    model="gpt-4o",
    name="NewsBot",
    instructions=(
        "You are a news assistant. Summarise only events from the last 24 hours. "
        "Always cite your sources."
    ),
    tools=news_tool.definitions,
    tool_resources=news_tool.resources,
)

thread = client.threads.create()
client.messages.create(
    thread_id=thread.id,
    role="user",
    content="What are the most important tech stories in the UK today?",
)
run = client.runs.create_and_process(thread_id=thread.id, agent_id=agent.id)

for msg in client.messages.list(thread_id=thread.id):
    if msg.role == "assistant":
        for block in msg.content:
            if hasattr(block, "text"):
                print(block.text.value)

client.delete_agent(agent.id)
```

Example 2: historical date range search

```python
# Research agent: UK AI policy between Jan and Mar 2026
# (reuses BingGroundingTool and bing_connection_id from Example 1)
policy_tool = BingGroundingTool(
    connection_id=bing_connection_id,
    market="en-GB",
    freshness="2026-01-01..2026-03-31",  # YYYY-MM-DD..YYYY-MM-DD date range
    count=10,
)
```

Example 3: inspect the generated parameters object

```python
from azure.ai.agents.models import (
    BingGroundingTool,
    BingGroundingSearchToolParameters,
    BingGroundingSearchConfiguration,
)

tool = BingGroundingTool(
    connection_id="my-bing-connection",
    market="en-US",
    count=8,
    freshness="Week",
)

# The underlying parameters object
params: BingGroundingSearchToolParameters = tool.bing_grounding
for cfg in params.search_configurations:
    print(f"connection: {cfg.connection_id}")
    print(f"market: {cfg.market}")
    print(f"count: {cfg.count}")
    print(f"freshness: {cfg.freshness}")
```

3. RunCompletionUsage + RunStepCompletionUsage — cost accounting per run and per step
Source: azure/ai/agents/models/_models.py
Every ThreadRun carries a usage field of type RunCompletionUsage. Every RunStep carries a usage field of type RunStepCompletionUsage. Both expose prompt_tokens, completion_tokens, and total_tokens.
Signatures

```python
class RunCompletionUsage(_Model):
    completion_tokens: int  # output tokens for the whole run
    prompt_tokens: int      # input tokens for the whole run
    total_tokens: int       # prompt + completion

class RunStepCompletionUsage(_Model):
    completion_tokens: int  # output tokens for this step
    prompt_tokens: int      # input tokens for this step
    total_tokens: int       # prompt + completion
```

Key points
- `usage` is `None` while the run is in a non-terminal state (`queued`, `in_progress`, `cancelling`). Always check before reading.
- `RunCompletionUsage` aggregates across all steps in the run. It is the sum you care about for billing.
- `RunStepCompletionUsage` lets you attribute cost to individual steps, which is useful for identifying expensive tool-call rounds vs. final message generation.
- Token counts follow the model's tokenizer. To estimate cost (for GPT-4o, say), multiply the prompt and completion counts by their respective per-token prices in USD. For Azure AI Foundry deployments the price per 1K tokens appears on your invoice under the model deployment name.
- There is no built-in budget guard in the SDK. Implement your own budget check by accumulating `total_tokens` across runs.
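As a rough illustration of per-step attribution, the sketch below groups token totals by step type. It uses plain `(step_type, total_tokens)` tuples as stand-ins for `RunStep` objects so it runs without the SDK; `tokens_by_step_type` and the sample numbers are hypothetical, and the step type strings are assumed to be `"tool_calls"` and `"message_creation"`.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple


def tokens_by_step_type(
    steps: Iterable[Tuple[str, Optional[int]]],
) -> Dict[str, int]:
    """Sum total_tokens per step type, skipping steps with no usage yet."""
    totals: Dict[str, int] = defaultdict(int)
    for step_type, total_tokens in steps:
        if total_tokens is None:
            # Mirrors usage being None for a step that has not finished.
            continue
        totals[step_type] += total_tokens
    return dict(totals)


# Made-up figures: two tool-call rounds, one finished message, one pending step.
sample_steps = [
    ("tool_calls", 900),
    ("tool_calls", 600),
    ("message_creation", 400),
    ("message_creation", None),
]

breakdown = tokens_by_step_type(sample_steps)
run_total = sum(breakdown.values())
for step_type, tokens in breakdown.items():
    print(f"{step_type}: {tokens} tokens ({tokens / run_total:.0%} of run)")
```

With real objects you would feed in `(step.type, step.usage.total_tokens if step.usage else None)` for each step returned by the run-step listing.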
Example 1: log usage after every run

```python
import os

from azure.ai.agents import AgentsClient
from azure.ai.agents.models import RunCompletionUsage
from azure.identity import DefaultAzureCredential

client = AgentsClient(
    endpoint=os.environ["AZURE_AI_AGENTS_ENDPOINT"],
    credential=DefaultAzureCredential(),
)

COST_PER_1K_INPUT = 0.0025  # USD; adjust to your model's pricing
COST_PER_1K_OUTPUT = 0.010


def log_run_cost(run) -> float:
    usage: RunCompletionUsage | None = run.usage
    if usage is None:
        print(f"Run {run.id} has no usage data (status: {run.status})")
        return 0.0
    cost = (
        usage.prompt_tokens / 1000 * COST_PER_1K_INPUT
        + usage.completion_tokens / 1000 * COST_PER_1K_OUTPUT
    )
    print(
        f"Run {run.id} | "
        f"prompt={usage.prompt_tokens} "
        f"completion={usage.completion_tokens} "
        f"total={usage.total_tokens} "
        f"cost=${cost:.4f}"
    )
    return cost


agent = client.create_agent(
    model="gpt-4o",
    name="CostTrackingBot",
    instructions="Answer questions concisely.",
)

thread = client.threads.create()
client.messages.create(
    thread_id=thread.id,
    role="user",
    content="Explain LLM tokenisation in one paragraph.",
)
run = client.runs.create_and_process(thread_id=thread.id, agent_id=agent.id)
log_run_cost(run)
client.delete_agent(agent.id)
```

Example 2: per-step cost breakdown

```python
from azure.ai.agents.models import RunStepCompletionUsage

# Reuses client, thread, and run from Example 1
steps = client.run_steps.list(thread_id=thread.id, run_id=run.id)
for step in steps:
    usage: RunStepCompletionUsage | None = step.usage
    if usage:
        print(
            f"  Step {step.id} [{step.type}]: "
            f"prompt={usage.prompt_tokens} "
            f"completion={usage.completion_tokens}"
        )
```

Example 3: budget guard

```python
MAX_TOKENS_PER_SESSION = 50_000
session_tokens = 0


def run_with_budget(client, thread_id: str, agent_id: str, user_message: str) -> str:
    global session_tokens
    if session_tokens >= MAX_TOKENS_PER_SESSION:
        raise RuntimeError("Session token budget exhausted.")

    client.messages.create(thread_id=thread_id, role="user", content=user_message)
    run = client.runs.create_and_process(thread_id=thread_id, agent_id=agent_id)

    if run.usage:
        session_tokens += run.usage.total_tokens
        print(f"Session tokens used: {session_tokens}/{MAX_TOKENS_PER_SESSION}")

    messages = client.messages.list(thread_id=thread_id)
    return next(
        (m.content[0].text.value for m in messages if m.role == "assistant"), ""
    )
```

4. VectorStoreDataSource + VectorStoreConfigurations — enterprise Azure asset sources
Source: azure/ai/agents/models/_models.py
Standard file upload (via client.files.upload) ingests files directly into the agents service. For enterprise workloads where your documents already live in Azure Blob Storage or Azure Data Lake Gen 2, use VectorStoreDataSource to reference them by URI or asset ID — no re-upload required.
Signatures

```python
class VectorStoreDataSourceAssetType(str, Enum):
    URI_ASSET = "uri_asset"  # Azure Storage URI (abfss://, https://)
    ID_ASSET = "id_asset"    # Azure ML data asset ID

class VectorStoreDataSource(_Model):
    asset_identifier: str                       # URI or data asset ID
    asset_type: VectorStoreDataSourceAssetType  # "uri_asset" or "id_asset"

class VectorStoreConfiguration(_Model):
    data_sources: List[VectorStoreDataSource]   # one or more sources

class VectorStoreConfigurations(_Model):
    store_name: str  # logical name of the store
    store_configuration: VectorStoreConfiguration
```

Key points
- `URI_ASSET` accepts `abfss://` paths (Azure Data Lake Gen 2) and `https://` blob storage URLs. The agents service's managed identity (or the connection credential) must have at least Storage Blob Data Reader on the container.
- `ID_ASSET` references an Azure Machine Learning data asset by its ID. Useful when your data pipeline publishes versioned datasets to AML.
- `VectorStoreConfigurations` is the top-level wrapper you pass to `tool_resources` on `FileSearchTool`. It names the store and contains the configuration.
- The `store_name` in `VectorStoreConfigurations` is a logical label. It does not need to match any storage container name.
- You can include multiple `VectorStoreDataSource` entries in a single `VectorStoreConfiguration` to aggregate documents from several containers or paths.
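When identifiers arrive from configuration files, it can help to decide the asset type from the string itself. The sketch below is one possible heuristic: `guess_asset_type` is a hypothetical helper that returns the enum's string values, and the `/subscriptions/` prefix test for AML asset IDs is an assumption based on the example ID used in this section, not a rule taken from the SDK.

```python
from urllib.parse import urlparse

URI_SCHEMES = {"abfss", "https"}


def guess_asset_type(identifier: str) -> str:
    """Return "uri_asset" or "id_asset" for a data source identifier."""
    scheme = urlparse(identifier).scheme.lower()
    if scheme in URI_SCHEMES:
        return "uri_asset"
    if identifier.startswith("/subscriptions/"):
        # Assumed shape of an Azure ML data asset resource ID.
        return "id_asset"
    raise ValueError(f"cannot classify identifier: {identifier!r}")


print(guess_asset_type("abfss://documents@mystorageaccount.dfs.core.windows.net/policies/"))
print(guess_asset_type(
    "/subscriptions/sub-id/resourceGroups/rg/providers/"
    "Microsoft.MachineLearningServices/workspaces/my-ws/data/my-dataset/versions/3"
))
```

Raising on anything unrecognised, rather than defaulting to one type, keeps a mistyped identifier from being indexed under the wrong asset type.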
Example 1: index a Data Lake path directly

```python
import os

from azure.ai.agents import AgentsClient
from azure.ai.agents.models import (
    FileSearchTool,
    VectorStoreDataSource,
    VectorStoreDataSourceAssetType,
    VectorStoreConfiguration,
    VectorStoreConfigurations,
)
from azure.identity import DefaultAzureCredential

client = AgentsClient(
    endpoint=os.environ["AZURE_AI_AGENTS_ENDPOINT"],
    credential=DefaultAzureCredential(),
)

data_source = VectorStoreDataSource(
    asset_identifier="abfss://documents@mystorageaccount.dfs.core.windows.net/policies/",
    asset_type=VectorStoreDataSourceAssetType.URI_ASSET,
)

vs_config = VectorStoreConfigurations(
    store_name="policy-docs",
    store_configuration=VectorStoreConfiguration(data_sources=[data_source]),
)

file_search = FileSearchTool(vector_store_configurations=[vs_config])

agent = client.create_agent(
    model="gpt-4o",
    name="PolicyBot",
    instructions="Answer questions about company policies. Only use the provided documents.",
    tools=file_search.definitions,
    tool_resources=file_search.resources,
)
```

Example 2: aggregate documents from multiple containers

```python
from azure.ai.agents.models import (
    VectorStoreDataSource,
    VectorStoreDataSourceAssetType,
    VectorStoreConfiguration,
    VectorStoreConfigurations,
    FileSearchTool,
)

hr_source = VectorStoreDataSource(
    asset_identifier="abfss://hr@mystorageaccount.dfs.core.windows.net/handbooks/",
    asset_type=VectorStoreDataSourceAssetType.URI_ASSET,
)
legal_source = VectorStoreDataSource(
    asset_identifier="abfss://legal@mystorageaccount.dfs.core.windows.net/contracts/",
    asset_type=VectorStoreDataSourceAssetType.URI_ASSET,
)

vs_config = VectorStoreConfigurations(
    store_name="combined-knowledge",
    store_configuration=VectorStoreConfiguration(
        data_sources=[hr_source, legal_source]
    ),
)

tool = FileSearchTool(vector_store_configurations=[vs_config])
```

Example 3: Azure ML data asset by ID

```python
from azure.ai.agents.models import (
    VectorStoreDataSource,
    VectorStoreDataSourceAssetType,
    VectorStoreConfiguration,
    VectorStoreConfigurations,
    FileSearchTool,
)

# AML data asset ID, found in your AML workspace under Data → Assets
aml_source = VectorStoreDataSource(
    asset_identifier=(
        "/subscriptions/sub-id/resourceGroups/rg/providers/"
        "Microsoft.MachineLearningServices/workspaces/my-ws/"
        "data/my-dataset/versions/3"
    ),
    asset_type=VectorStoreDataSourceAssetType.ID_ASSET,
)

tool = FileSearchTool(
    vector_store_configurations=[
        VectorStoreConfigurations(
            store_name="aml-dataset-v3",
            store_configuration=VectorStoreConfiguration(data_sources=[aml_source]),
        )
    ]
)
```

5. Vector store lifecycle — expiry policy and chunking strategy
Source: azure/ai/agents/models/_models.py
Two configuration concerns matter for long-lived vector stores: when they expire and how their documents are chunked. The first is handled by `VectorStoreExpirationPolicy`; the second by `VectorStoreChunkingStrategyRequest` and its two subclasses.
Signatures

```python
class VectorStoreExpirationPolicyAnchor(str, Enum):
    LAST_ACTIVE_AT = "last_active_at"  # expiry counts from last use

class VectorStoreExpirationPolicy(_Model):
    anchor: VectorStoreExpirationPolicyAnchor  # always "last_active_at"
    days: int                                  # days until expiry after anchor event

class VectorStoreAutoChunkingStrategyRequest(VectorStoreChunkingStrategyRequest):
    # Default strategy: max_chunk_size_tokens=800, chunk_overlap_tokens=400
    # No configurable fields; just instantiate it
    ...

class VectorStoreStaticChunkingStrategyOptions(_Model):
    max_chunk_size_tokens: int  # 100–4096, default 800
    chunk_overlap_tokens: int   # must not exceed max_chunk_size_tokens / 2, default 400

class VectorStoreStaticChunkingStrategyRequest(VectorStoreChunkingStrategyRequest):
    static: VectorStoreStaticChunkingStrategyOptions
```

Key points
- The only supported expiry anchor is `"last_active_at"`: the TTL resets every time the vector store is queried. A store expires only if it goes unused for `days` consecutive days.
- `days` has no enforced minimum or maximum in the SDK, but the service enforces a maximum of 365 days. Use `days=7` for ephemeral session data, `days=365` for long-lived knowledge bases.
- `VectorStoreAutoChunkingStrategyRequest` (default) is equivalent to `VectorStoreStaticChunkingStrategyRequest(static=VectorStoreStaticChunkingStrategyOptions(max_chunk_size_tokens=800, chunk_overlap_tokens=400))`. Use auto unless you have a specific reason to tune chunking.
- For legal or technical documents with long paragraphs, increase `max_chunk_size_tokens` to `1600` or `2048` so that fewer passages are split mid-thought.
- `chunk_overlap_tokens` must not exceed `max_chunk_size_tokens / 2`. The 800 / 400 default sits exactly at that limit; the service rejects a configuration where the overlap is greater than half the chunk size.
- Chunking strategy is set at vector store creation time and cannot be changed afterwards. Recreate the store if you need different chunking.
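The chunking limits are easy to check before a store is created. The sketch below is a hypothetical pre-flight validator (`validate_static_chunking` is not an SDK function). It uses the 100 to 4096 range given above and treats the overlap bound as "at most half the chunk size", which is the reading under which the documented 800 / 400 default is itself valid.

```python
def validate_static_chunking(
    max_chunk_size_tokens: int = 800,
    chunk_overlap_tokens: int = 400,
) -> None:
    """Raise ValueError if the pair would be rejected under the stated limits."""
    if not 100 <= max_chunk_size_tokens <= 4096:
        raise ValueError("max_chunk_size_tokens must be between 100 and 4096")
    if chunk_overlap_tokens < 0:
        raise ValueError("chunk_overlap_tokens must not be negative")
    # Integer comparison avoids float division: overlap <= max / 2.
    if chunk_overlap_tokens * 2 > max_chunk_size_tokens:
        raise ValueError("chunk_overlap_tokens must not exceed half the chunk size")


validate_static_chunking()           # the 800 / 400 defaults pass
validate_static_chunking(1600, 200)  # the technical-manual settings pass

try:
    validate_static_chunking(1600, 900)
except ValueError as exc:
    print(f"rejected: {exc}")
```

Since chunking cannot be changed after creation, failing fast here is cheaper than recreating a store that was built with the wrong settings.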
Example 1: ephemeral session store (expires after 1 day of inactivity)

```python
import os

from azure.ai.agents import AgentsClient
from azure.ai.agents.models import (
    VectorStoreExpirationPolicy,
    VectorStoreExpirationPolicyAnchor,
    VectorStoreAutoChunkingStrategyRequest,
)
from azure.identity import DefaultAzureCredential

client = AgentsClient(
    endpoint=os.environ["AZURE_AI_AGENTS_ENDPOINT"],
    credential=DefaultAzureCredential(),
)

# Create a short-lived store for a support session
store = client.vector_stores.create_and_poll(
    name="session-docs",
    expires_after=VectorStoreExpirationPolicy(
        anchor=VectorStoreExpirationPolicyAnchor.LAST_ACTIVE_AT,
        days=1,
    ),
    chunking_strategy=VectorStoreAutoChunkingStrategyRequest(),
)
print(f"Store '{store.name}' expires after 1 day of inactivity (status: {store.status})")
```

Example 2: static chunking for long-form technical documentation

```python
from azure.ai.agents.models import (
    VectorStoreStaticChunkingStrategyRequest,
    VectorStoreStaticChunkingStrategyOptions,
    VectorStoreExpirationPolicy,
    VectorStoreExpirationPolicyAnchor,
)

# Technical manuals: large chunks, moderate overlap
store = client.vector_stores.create_and_poll(
    name="technical-manuals",
    expires_after=VectorStoreExpirationPolicy(
        anchor=VectorStoreExpirationPolicyAnchor.LAST_ACTIVE_AT,
        days=90,
    ),
    chunking_strategy=VectorStoreStaticChunkingStrategyRequest(
        static=VectorStoreStaticChunkingStrategyOptions(
            max_chunk_size_tokens=1600,
            chunk_overlap_tokens=200,  # must not exceed 1600 / 2 = 800
        )
    ),
)
```

Example 3: reading back strategy from a created store

```python
from azure.ai.agents.models import (
    VectorStoreStaticChunkingStrategyResponse,
    VectorStoreAutoChunkingStrategyResponse,
)

store = client.vector_stores.get(vector_store_id=store.id)
strategy = store.chunking_strategy  # returns a VectorStoreChunkingStrategyResponse

if isinstance(strategy, VectorStoreStaticChunkingStrategyResponse):
    print(f"max_chunk_size: {strategy.static.max_chunk_size_tokens}")
    print(f"overlap: {strategy.static.chunk_overlap_tokens}")
elif isinstance(strategy, VectorStoreAutoChunkingStrategyResponse):
    print("Auto chunking (800 / 400 defaults)")
```

6. AgentsNamedToolChoice + AgentsToolChoiceOptionMode — forcing specific tools
Source: azure/ai/agents/models/_models.py
By default the model decides which tool (if any) to call. The `tool_choice` parameter on `create_run` / `create_thread_and_run` overrides this, and it accepts one of two value shapes: a mode string or a named tool choice.
Signatures

```python
class AgentsToolChoiceOptionMode(str, Enum):
    NONE = "none"  # model must not call any tool; text only
    AUTO = "auto"  # model chooses freely (the default)

class AgentsNamedToolChoiceType(str, Enum):
    FUNCTION = "function"
    CODE_INTERPRETER = "code_interpreter"
    FILE_SEARCH = "file_search"
    BING_GROUNDING = "bing_grounding"
    AZURE_AI_SEARCH = "azure_ai_search"
    CONNECTED_AGENT = "connected_agent"

class FunctionName(_Model):
    name: str  # exact function name to force

class AgentsNamedToolChoice(_Model):
    type: AgentsNamedToolChoiceType
    function: Optional[FunctionName]  # required when type == "function"
```

Key points
- `tool_choice` accepts either a string (an `AgentsToolChoiceOptionMode` value) or an `AgentsNamedToolChoice` instance. The API accepts both via a union type.
- Setting `tool_choice=AgentsToolChoiceOptionMode.NONE` forces the model to produce a text response regardless of whether tools are attached. Useful for a "summarise the conversation" final step where you do not want side-effects.
- `AgentsNamedToolChoice(type="file_search")` forces the model to always invoke file search first, before deciding whether to generate text. This is useful for RAG workflows where you always want retrieval.
- When `type=AgentsNamedToolChoiceType.FUNCTION` you must also set `function=FunctionName(name="my_function")`. For built-in tools (`code_interpreter`, `file_search`, etc.) the `function` field is omitted.
- Forcing a tool does not prevent the model from calling other tools in subsequent steps. It only forces the first tool call.
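The two value shapes and the "function needs a name" rule can be summarised in a few lines. This sketch works on plain strings and dicts rather than SDK objects; `build_tool_choice` is a hypothetical helper, and the dict layout is an assumption that mirrors the `type` and `function.name` fields in the signatures above.

```python
from typing import Any, Dict, Optional, Union

MODES = {"none", "auto"}
NAMED_TYPES = {
    "function", "code_interpreter", "file_search",
    "bing_grounding", "azure_ai_search", "connected_agent",
}


def build_tool_choice(
    kind: str, function_name: Optional[str] = None
) -> Union[str, Dict[str, Any]]:
    """Return a mode string, or a dict stand-in for a named tool choice."""
    if kind in MODES:
        if function_name is not None:
            raise ValueError("mode values do not take a function name")
        return kind
    if kind not in NAMED_TYPES:
        raise ValueError(f"unknown tool choice: {kind!r}")
    if kind == "function":
        if not function_name:
            raise ValueError("type 'function' requires a function name")
        return {"type": "function", "function": {"name": function_name}}
    if function_name is not None:
        raise ValueError("built-in tools do not take a function name")
    return {"type": kind}


print(build_tool_choice("none"))
print(build_tool_choice("file_search"))
print(build_tool_choice("function", "get_account_balance"))
```

Keeping the mode values as bare strings and the named choices as structured values reflects the union described above: the first shape says how much freedom the model has, the second says exactly which tool it must call.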
Example 1: force file search on every run

```python
import os

from azure.ai.agents import AgentsClient
from azure.ai.agents.models import (
    FileSearchTool,
    AgentsNamedToolChoice,
    AgentsNamedToolChoiceType,
)
from azure.identity import DefaultAzureCredential

client = AgentsClient(
    endpoint=os.environ["AZURE_AI_AGENTS_ENDPOINT"],
    credential=DefaultAzureCredential(),
)

vector_store_id = "vs_your_store_id"
file_search = FileSearchTool(vector_store_ids=[vector_store_id])

agent = client.create_agent(
    model="gpt-4o",
    name="RAGBot",
    instructions="Always ground your answers in the document store.",
    tools=file_search.definitions,
    tool_resources=file_search.resources,
)

thread = client.threads.create()
client.messages.create(
    thread_id=thread.id,
    role="user",
    content="What does the policy say about remote work?",
)

# Force file_search: the model must retrieve before answering
run = client.runs.create_and_process(
    thread_id=thread.id,
    agent_id=agent.id,
    tool_choice=AgentsNamedToolChoice(type=AgentsNamedToolChoiceType.FILE_SEARCH),
)

client.delete_agent(agent.id)
```

Example 2: force a specific function call

```python
from azure.ai.agents.models import (
    FunctionTool,
    AgentsNamedToolChoice,
    AgentsNamedToolChoiceType,
    FunctionName,
)


def get_account_balance(account_id: str) -> str:
    """Return current account balance."""
    return f"£1,234.56 for account {account_id}"


tool = FunctionTool({get_account_balance})

agent = client.create_agent(
    model="gpt-4o",
    name="BalanceBot",
    instructions="Help customers check their balance.",
    tools=tool.definitions,
)

thread = client.threads.create()
client.messages.create(thread_id=thread.id, role="user", content="What's my balance?")

# Force the model to call get_account_balance (it still needs to pick the argument)
run = client.runs.create_and_process(
    thread_id=thread.id,
    agent_id=agent.id,
    tool_choice=AgentsNamedToolChoice(
        type=AgentsNamedToolChoiceType.FUNCTION,
        function=FunctionName(name="get_account_balance"),
    ),
)
```

Example 3: text-only final step

```python
from azure.ai.agents.models import AgentsToolChoiceOptionMode

# After several tool-use rounds, force a clean text summary
client.messages.create(
    thread_id=thread.id,
    role="user",
    content="Summarise everything we've discussed and any actions taken.",
)
run = client.runs.create_and_process(
    thread_id=thread.id,
    agent_id=agent.id,
    tool_choice=AgentsToolChoiceOptionMode.NONE,  # no tools allowed
)
```

7. File-search run step results — RunStepFileSearchToolCall through FileSearchToolCallContent
Source: azure/ai/agents/models/_models.py
When an agent uses FileSearchTool, each retrieval is recorded as a RunStepFileSearchToolCall run step. Inspecting these gives you the raw search results with relevance scores — useful for debugging retrieval quality and for building “show your sources” UI.
Signatures
    class FileSearchToolCallContent(_Model):
        type: Literal["text"]   # always "text"
        text: str               # the retrieved passage

    class FileSearchRankingOptions(_Model):
        ranker: str             # ranker identifier used
        score_threshold: float  # minimum score for inclusion

    class RunStepFileSearchToolCallResult(_Model):
        file_id: str            # ID of the source file
        file_name: str          # human-readable filename
        score: float            # relevance score 0.0–1.0
        content: Optional[List[FileSearchToolCallContent]]  # passage text (opt-in)

    class RunStepFileSearchToolCallResults(_Model):
        ranking_options: Optional[FileSearchRankingOptions]
        results: List[RunStepFileSearchToolCallResult]

    class RunStepFileSearchToolCall(RunStepToolCall):
        type: Literal["file_search"]
        id: str
        file_search: RunStepFileSearchToolCallResults

    class RunAdditionalFieldList(str, Enum):
        FILE_SEARCH_CONTENTS = "step_details.tool_calls[*].file_search.results[*].content"

Key points
- `RunStepFileSearchToolCallResult.content` is `None` by default. To receive the actual retrieved text you must pass `include=[RunAdditionalFieldList.FILE_SEARCH_CONTENTS]` when listing run steps.
- `score` is a float between 0.0 and 1.0. Higher scores indicate stronger semantic relevance to the query.
- `RunStepFileSearchToolCallResults.ranking_options` is set when the agent’s `FileSearchToolDefinitionDetails` had `ranking_options` configured at agent creation time.
- A single run step of type `tool_calls` may contain multiple `RunStepFileSearchToolCall` entries if the agent called file search more than once in that step.
- File search results are read-only snapshots. The `file_id` values link back to files in your vector store.
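Because one step can hold several file-search calls, the same file may show up more than once with different scores. A minimal sketch of collapsing such results to the best hit per file follows. `SearchHit` is a hypothetical stand-in that only mirrors the `file_id`, `file_name`, and `score` fields described above; it is not an SDK class, and the sample values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class SearchHit:
    """Hypothetical stand-in for RunStepFileSearchToolCallResult (not the SDK class)."""
    file_id: str
    file_name: str
    score: float


def best_hit_per_file(hits):
    """Keep only the highest-scoring hit for each file_id, sorted by score descending."""
    best = {}
    for hit in hits:
        current = best.get(hit.file_id)
        if current is None or hit.score > current.score:
            best[hit.file_id] = hit
    return sorted(best.values(), key=lambda h: h.score, reverse=True)


# Illustrative data only: two hits from the same file, one from another.
hits = [
    SearchHit("file_a", "refunds.pdf", 0.82),
    SearchHit("file_b", "shipping.pdf", 0.41),
    SearchHit("file_a", "refunds.pdf", 0.67),
]
top = best_hit_per_file(hits)
for hit in top:
    print(f"[{hit.score:.3f}] {hit.file_name} ({hit.file_id})")
```

With real run steps you would feed the function the entries of `tool_call.file_search.results` gathered across all calls in the step, since they expose the same three attributes.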
Example 1: list run steps and print retrieval scores
    import os

    from azure.ai.agents import AgentsClient
    from azure.ai.agents.models import (
        FileSearchTool,
        RunStepType,
        RunStepFileSearchToolCall,
        RunAdditionalFieldList,
    )
    from azure.identity import DefaultAzureCredential

    client = AgentsClient(
        endpoint=os.environ["AZURE_AI_AGENTS_ENDPOINT"],
        credential=DefaultAzureCredential(),
    )

    vector_store_id = "vs_your_store_id"
    tool = FileSearchTool(vector_store_ids=[vector_store_id])

    agent = client.create_agent(
        model="gpt-4o",
        name="SearchInspector",
        instructions="Answer questions using the document store.",
        tools=tool.definitions,
        tool_resources=tool.resources,
    )

    thread = client.threads.create()
    client.messages.create(
        thread_id=thread.id,
        role="user",
        content="What are the refund conditions?",
    )
    run = client.runs.create_and_process(thread_id=thread.id, agent_id=agent.id)

    # Request file search result content via the include parameter
    steps = client.run_steps.list(
        thread_id=thread.id,
        run_id=run.id,
        include=[RunAdditionalFieldList.FILE_SEARCH_CONTENTS],
    )

    for step in steps:
        if step.type == RunStepType.TOOL_CALLS:
            for tool_call in step.step_details.tool_calls:
                if isinstance(tool_call, RunStepFileSearchToolCall):
                    print(f"\nFile search step {step.id}:")
                    for result in tool_call.file_search.results:
                        print(f" [{result.score:.3f}] {result.file_name} ({result.file_id})")
                        if result.content:
                            for chunk in result.content:
                                print(f" → {chunk.text[:120]}…")

    client.delete_agent(agent.id)

Example 2: filter low-quality results in post-processing
    MIN_SCORE = 0.6

    def good_results(tool_call: RunStepFileSearchToolCall):
        return [r for r in tool_call.file_search.results if r.score >= MIN_SCORE]

Example 3: build a “Sources” UI section from results

    def format_sources(tool_calls) -> str:
        sources = []
        for tc in tool_calls:
            if isinstance(tc, RunStepFileSearchToolCall):
                for r in tc.file_search.results:
                    sources.append(f"- **{r.file_name}** (relevance {r.score:.0%})")
        return "\n".join(sources) if sources else "_No sources retrieved._"

8. Streaming event taxonomy: AgentStreamEvent and the four typed sub-enums

Source: azure/ai/agents/models/_models.py
The streaming layer dispatches server-sent events to AgentEventHandler callbacks. Each SSE carries an event string matching one of the enum values documented here. Knowing the full taxonomy lets you write type-safe event handlers and handle unknown future events gracefully.
Enumerations
    class ThreadStreamEvent(str, Enum):
        THREAD_CREATED = "thread.created"  # data: AgentThread

    class RunStreamEvent(str, Enum):
        THREAD_RUN_CREATED = "thread.run.created"
        THREAD_RUN_QUEUED = "thread.run.queued"
        THREAD_RUN_IN_PROGRESS = "thread.run.in_progress"
        THREAD_RUN_REQUIRES_ACTION = "thread.run.requires_action"
        THREAD_RUN_COMPLETED = "thread.run.completed"
        THREAD_RUN_INCOMPLETE = "thread.run.incomplete"
        THREAD_RUN_FAILED = "thread.run.failed"
        THREAD_RUN_CANCELLING = "thread.run.cancelling"
        THREAD_RUN_CANCELLED = "thread.run.cancelled"
        THREAD_RUN_EXPIRED = "thread.run.expired"

    class MessageStreamEvent(str, Enum):
        THREAD_MESSAGE_CREATED = "thread.message.created"
        THREAD_MESSAGE_IN_PROGRESS = "thread.message.in_progress"
        THREAD_MESSAGE_DELTA = "thread.message.delta"  # data: MessageDeltaChunk
        THREAD_MESSAGE_COMPLETED = "thread.message.completed"
        THREAD_MESSAGE_INCOMPLETE = "thread.message.incomplete"

    class RunStepStreamEvent(str, Enum):
        THREAD_RUN_STEP_CREATED = "thread.run.step.created"
        THREAD_RUN_STEP_IN_PROGRESS = "thread.run.step.in_progress"
        THREAD_RUN_STEP_DELTA = "thread.run.step.delta"  # data: RunStepDeltaChunk
        THREAD_RUN_STEP_COMPLETED = "thread.run.step.completed"
        THREAD_RUN_STEP_FAILED = "thread.run.step.failed"
        THREAD_RUN_STEP_CANCELLED = "thread.run.step.cancelled"
        THREAD_RUN_STEP_EXPIRED = "thread.run.step.expired"

    # AgentStreamEvent is a flat union that includes every value above plus:
    class AgentStreamEvent(str, Enum):
        # All thread/run/message/step events are repeated here, plus:
        DONE = "done"    # stream has ended; data is "[DONE]"
        ERROR = "error"  # stream error; data is an error object

Key points
- `AgentStreamEvent` is the master union: every event string from the four typed sub-enums also appears in `AgentStreamEvent`. Use the typed sub-enums in match/isinstance guards where you care about a specific category; use `AgentStreamEvent` for the full list.
- `AgentStreamEvent.DONE` (“done”) marks the end of the SSE stream. The `AgentEventHandler.on_done()` hook is called at this point.
- `AgentStreamEvent.ERROR` (“error”) is dispatched when the service sends an error SSE. The `AgentEventHandler.on_error()` hook receives the raw error data.
- The `data` payload type for each event is documented in the enum member docstring, e.g. `THREAD_MESSAGE_DELTA` carries `MessageDeltaChunk`, while `THREAD_RUN_COMPLETED` carries `ThreadRun`.
- Microsoft recommends handling unknown event names gracefully (fall through without raising) since new events may be added.
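The fall-through advice above can be sketched without the SDK by working on the raw event strings. The `classify_event` helper and its category names are hypothetical, chosen for illustration; only the event strings themselves come from the enums listed earlier.

```python
# Event-name prefixes, most specific first so "thread.run.step." wins over "thread.run.".
_PREFIXES = (
    ("thread.run.step.", "run_step"),
    ("thread.run.", "run"),
    ("thread.message.", "message"),
)


def classify_event(event: str) -> str:
    """Map an SSE event string to a coarse category; unknown names fall through."""
    if event in ("done", "error"):
        return event
    if event == "thread.created":
        return "thread"
    for prefix, category in _PREFIXES:
        if event.startswith(prefix):
            return category
    return "unknown"  # new or unrecognised events are reported, never raised


for name in ("thread.run.step.delta", "thread.run.completed", "thread.some.future.event"):
    print(name, "->", classify_event(name))
```

Checking the step prefix before the run prefix matters because every step event name also starts with `thread.run.`.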
Example 1: type-safe streaming handler using all four sub-enums
    import os

    from azure.ai.agents import AgentsClient
    from azure.ai.agents.models import (
        AgentEventHandler,
        ThreadRun,
        RunStep,
        ThreadMessage,
        MessageDeltaChunk,
        RunStepDeltaChunk,
        RunStreamEvent,
        MessageStreamEvent,
        RunStepStreamEvent,
    )
    from azure.identity import DefaultAzureCredential

    class VerboseEventHandler(AgentEventHandler):
        def on_thread_run(self, run: ThreadRun) -> None:
            # Matches RunStreamEvent.THREAD_RUN_* events
            print(f"[RUN ] {run.id} → {run.status}")

        def on_message_delta(self, delta: MessageDeltaChunk) -> None:
            # Matches MessageStreamEvent.THREAD_MESSAGE_DELTA
            for block in delta.delta.content or []:
                if hasattr(block, "text") and block.text:
                    print(block.text.value, end="", flush=True)

        def on_run_step(self, step: RunStep) -> None:
            # Matches RunStepStreamEvent.THREAD_RUN_STEP_CREATED / COMPLETED / FAILED …
            print(f"\n[STEP] {step.id} [{step.type}] → {step.status}")

        def on_run_step_delta(self, delta: RunStepDeltaChunk) -> None:
            # Matches RunStepStreamEvent.THREAD_RUN_STEP_DELTA
            pass  # handle code-interpreter streaming here if needed

        def on_message(self, message: ThreadMessage) -> None:
            # Matches MessageStreamEvent.THREAD_MESSAGE_CREATED / COMPLETED …
            print(f"\n[MSG ] {message.id} status={message.status}")

        def on_done(self) -> None:
            print("\n[DONE] Stream closed.")

        def on_error(self, data: str) -> None:
            print(f"[ERR ] {data}")

    client = AgentsClient(
        endpoint=os.environ["AZURE_AI_AGENTS_ENDPOINT"],
        credential=DefaultAzureCredential(),
    )

    agent = client.create_agent(
        model="gpt-4o",
        name="StreamBot",
        instructions="Be concise.",
    )
    thread = client.threads.create()
    client.messages.create(thread_id=thread.id, role="user", content="Count to 5.")

    # Pass a handler instance via event_handler; the stream dispatches events to it
    with client.runs.stream(
        thread_id=thread.id,
        agent_id=agent.id,
        event_handler=VerboseEventHandler(),
    ) as stream:
        stream.until_done()

    client.delete_agent(agent.id)

Example 2: detect requires-action inside a streaming run
    import json

    from azure.ai.agents.models import (
        AgentEventHandler,
        ThreadRun,
        RunStatus,
        ToolOutput,
    )

    class ToolCallHandler(AgentEventHandler):
        def __init__(self, client, tool_fn):
            super().__init__()
            self.client = client
            self.tool_fn = tool_fn

        def on_thread_run(self, run: ThreadRun) -> None:
            if run.status == RunStatus.REQUIRES_ACTION:
                action = run.required_action
                outputs = []
                for tc in action.submit_tool_outputs.tool_calls:
                    # Arguments arrive as a JSON string; parse before calling the tool
                    args = json.loads(tc.function.arguments)
                    result = self.tool_fn(tc.function.name, **args)
                    outputs.append(ToolOutput(tool_call_id=tc.id, output=str(result)))
                # Resume the same stream, reusing this handler for the follow-up events
                self.client.runs.submit_tool_outputs_stream(
                    thread_id=run.thread_id,
                    run_id=run.id,
                    tool_outputs=outputs,
                    event_handler=self,
                )

9. RequiredFunctionToolCall + SubmitToolOutputsDetails: manual tool-call dispatch

Source: azure/ai/agents/models/_models.py
When enable_auto_function_calls is not used (see Vol. 4), the run enters REQUIRES_ACTION status and you must call client.runs.submit_tool_outputs. The data model for that action flows through three classes.
Signatures
    class RequiredFunctionToolCallDetails(_Model):
        name: str       # function name as registered in FunctionTool
        arguments: str  # JSON string of arguments generated by the model

    class RequiredFunctionToolCall(RequiredToolCall):
        type: Literal["function"]
        id: str         # tool_call_id to echo back
        function: RequiredFunctionToolCallDetails

    class SubmitToolOutputsDetails(_Model):
        tool_calls: List[RequiredToolCall]  # list of calls that need outputs

    # On ThreadRun, when status == "requires_action":
    #   run.required_action.type == "submit_tool_outputs"
    #   run.required_action.submit_tool_outputs is a SubmitToolOutputsDetails

Key points
- `RequiredFunctionToolCallDetails.arguments` is a JSON string, not a dict. Parse it with `json.loads()` before calling your function.
- The `id` field on `RequiredFunctionToolCall` must be included verbatim as `ToolOutput.tool_call_id` when you submit results. If the IDs don’t match, the run will fail.
- `SubmitToolOutputsDetails.tool_calls` is a `List[RequiredToolCall]`, the base type. In practice every entry will be a `RequiredFunctionToolCall` for function tools, but other tool types may appear in future.
- You must submit outputs for all pending tool calls in a single `submit_tool_outputs` call. Partial submission is not supported.
- After submitting, the run re-enters `IN_PROGRESS` and may produce more tool calls before finally completing. Loop until `run.status` is a terminal state.
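Since every pending call needs an output in the same submission, a small pre-flight check can catch gaps before you call the service. The sketch below uses plain tuples and dicts as stand-ins for `RequiredFunctionToolCall` and `ToolOutput`; the helper name `missing_tool_call_ids` and the sample call IDs are hypothetical.

```python
import json


def missing_tool_call_ids(pending_ids, outputs):
    """Return pending tool_call ids that have no matching output yet."""
    answered = {o["tool_call_id"] for o in outputs}
    return [call_id for call_id in pending_ids if call_id not in answered]


# Hypothetical pending calls: (id, function name, JSON-encoded arguments string).
pending = [
    ("call_1", "get_weather", '{"city": "Paris"}'),
    ("call_2", "book_restaurant", '{"city": "Paris", "cuisine": "Italian", "time": "13:00"}'),
]

outputs = []
for call_id, name, raw_args in pending[:1]:  # deliberately answer only the first call
    args = json.loads(raw_args)              # arguments arrive as a JSON string
    outputs.append({"tool_call_id": call_id, "output": f"{name} ok for {args['city']}"})

missing = missing_tool_call_ids([c[0] for c in pending], outputs)
print("Still unanswered:", missing)
```

If the list is non-empty, fill in the remaining outputs (or an error string for each) before submitting, rather than sending a partial batch.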
Example 1: full manual dispatch loop
    import json
    import os
    import time

    from azure.ai.agents import AgentsClient
    from azure.ai.agents.models import (
        FunctionTool,
        RunStatus,
        ToolOutput,
        RequiredFunctionToolCall,
        SubmitToolOutputsDetails,
    )
    from azure.identity import DefaultAzureCredential

    client = AgentsClient(
        endpoint=os.environ["AZURE_AI_AGENTS_ENDPOINT"],
        credential=DefaultAzureCredential(),
    )

    def get_weather(city: str) -> str:
        """Return the weather for a city."""
        return f"The weather in {city} is sunny and 22°C."

    def book_restaurant(city: str, cuisine: str, time: str) -> str:
        """Book a restaurant."""
        return f"Booked a {cuisine} restaurant in {city} at {time}. Ref: RES-0042."

    tool = FunctionTool({get_weather, book_restaurant})

    agent = client.create_agent(
        model="gpt-4o",
        name="PlannerBot",
        instructions="Help users plan outings. Check the weather and book restaurants.",
        tools=tool.definitions,
    )

    thread = client.threads.create()
    client.messages.create(
        thread_id=thread.id,
        role="user",
        content="Is it a good day for lunch outside in Paris? Book somewhere Italian at 1 PM.",
    )

    # Start the run; do NOT use create_and_process, because we drive the loop ourselves
    run = client.runs.create(thread_id=thread.id, agent_id=agent.id)

    TERMINAL = {RunStatus.COMPLETED, RunStatus.FAILED, RunStatus.CANCELLED, RunStatus.EXPIRED}

    while run.status not in TERMINAL:
        time.sleep(1)  # pause between polls instead of busy-looping
        run = client.runs.get(thread_id=thread.id, run_id=run.id)

        if run.status == RunStatus.REQUIRES_ACTION:
            details: SubmitToolOutputsDetails = run.required_action.submit_tool_outputs
            outputs = []

            for tc in details.tool_calls:
                if isinstance(tc, RequiredFunctionToolCall):
                    args = json.loads(tc.function.arguments)
                    print(f" → calling {tc.function.name}({args})")

                    if tc.function.name == "get_weather":
                        result = get_weather(**args)
                    elif tc.function.name == "book_restaurant":
                        result = book_restaurant(**args)
                    else:
                        result = "Function not found."

                    outputs.append(ToolOutput(tool_call_id=tc.id, output=result))

            # Submit every output in one call; the run then resumes
            run = client.runs.submit_tool_outputs(
                thread_id=thread.id,
                run_id=run.id,
                tool_outputs=outputs,
            )

    print(f"\nRun finished: {run.status}")
    for msg in client.messages.list(thread_id=thread.id):
        if msg.role == "assistant":
            print(msg.content[0].text.value)

    client.delete_agent(agent.id)

Example 2: generic dispatcher using a function registry
    from typing import Callable
    import json

    from azure.ai.agents.models import RequiredFunctionToolCall, ToolOutput

    def dispatch_tool_calls(
        tool_calls,
        registry: dict[str, Callable],
    ) -> list[ToolOutput]:
        outputs = []
        for tc in tool_calls:
            if not isinstance(tc, RequiredFunctionToolCall):
                continue
            fn = registry.get(tc.function.name)
            if fn is None:
                result = f"Unknown function: {tc.function.name}"
            else:
                try:
                    args = json.loads(tc.function.arguments)
                    result = str(fn(**args))
                except Exception as exc:
                    result = f"Error: {exc}"
            outputs.append(ToolOutput(tool_call_id=tc.id, output=result))
        return outputs

10. Multimodal message input: ThreadMessageOptions, MessageInputImageFileBlock, MessageInputImageUrlBlock

Source: azure/ai/agents/models/_models.py
Thread messages can contain more than plain text. Vision-capable models (GPT-4o, GPT-4 Turbo) accept image content alongside text. The SDK exposes this via a discriminated union of MessageInputContentBlock subclasses.
Signatures
    class ImageDetailLevel(str, Enum):
        AUTO = "auto"  # server picks based on image size
        LOW = "low"    # fast, lower resolution; 85 tokens
        HIGH = "high"  # slow, full resolution; up to 1105 tokens

    class MessageInputTextBlock(MessageInputContentBlock):
        type: Literal["text"]
        text: str

    class MessageImageFileParam(_Model):
        file_id: str                        # previously uploaded file ID
        detail: Optional[ImageDetailLevel]  # defaults to "auto"

    class MessageInputImageFileBlock(MessageInputContentBlock):
        type: Literal["image_file"]
        image_file: MessageImageFileParam

    class MessageImageUrlParam(_Model):
        url: str                            # publicly accessible URL
        detail: Optional[ImageDetailLevel]  # defaults to "auto"

    class MessageInputImageUrlBlock(MessageInputContentBlock):
        type: Literal["image_url"]
        image_url: MessageImageUrlParam

    class ThreadMessageOptions(_Model):
        role: MessageRole  # "user" or "assistant"
        content: str | List[MessageInputContentBlock]
        # content is a plain string for text-only messages;
        # a list of blocks for multimodal messages.

Key points
- `ImageDetailLevel.LOW` processes images at a fixed 512×512 resolution crop. It costs 85 tokens regardless of image size and is fastest.
- `ImageDetailLevel.HIGH` processes the image in 512×512 tiles. A 1080×1080 image costs approximately 765 tokens in total: four tiles at 170 tokens each plus the base 85.
- `ImageDetailLevel.AUTO` (default) lets the server pick based on the image dimensions: it defaults to `low` for small images and `high` for large ones.
- When using `MessageInputImageFileBlock`, the file must have been uploaded via `client.files.upload` with `purpose=FilePurpose.VISION` (or equivalent). The file ID is what you pass.
- `MessageInputImageUrlBlock` accepts any publicly accessible URL. The model will fetch and encode the image server-side. Private URLs behind auth are not supported via this path; upload the file first and use `MessageInputImageFileBlock` instead.
- `ThreadMessageOptions.content` is a plain `str` for text-only messages. Passing a list of `MessageInputContentBlock` enables multimodal input. The API accepts both shapes at the same `content` field.
Example 1: send a URL image alongside a question
    import os

    from azure.ai.agents import AgentsClient
    from azure.ai.agents.models import (
        MessageInputTextBlock,
        MessageInputImageUrlBlock,
        MessageImageUrlParam,
        ImageDetailLevel,
    )
    from azure.identity import DefaultAzureCredential

    client = AgentsClient(
        endpoint=os.environ["AZURE_AI_AGENTS_ENDPOINT"],
        credential=DefaultAzureCredential(),
    )

    agent = client.create_agent(
        model="gpt-4o",
        name="VisionBot",
        instructions="Describe what you see in images and answer follow-up questions.",
    )

    thread = client.threads.create()

    # Multimodal message: text + image URL
    client.messages.create(
        thread_id=thread.id,
        role="user",
        content=[
            MessageInputTextBlock(text="What type of chart is shown, and what are the main trends?"),
            MessageInputImageUrlBlock(
                image_url=MessageImageUrlParam(
                    url="https://example.com/sales_chart_2026.png",
                    detail=ImageDetailLevel.HIGH,  # full resolution for charts
                )
            ),
        ],
    )

    run = client.runs.create_and_process(thread_id=thread.id, agent_id=agent.id)

    for msg in client.messages.list(thread_id=thread.id):
        if msg.role == "assistant":
            print(msg.content[0].text.value)

    client.delete_agent(agent.id)

Example 2: upload a private image and reference it by file ID
    import os

    from azure.ai.agents.models import (
        FilePurpose,
        MessageInputTextBlock,
        MessageInputImageFileBlock,
        MessageImageFileParam,
        ImageDetailLevel,
    )

    # Upload the image (purpose must allow vision)
    with open("/tmp/diagram.png", "rb") as f:
        uploaded = client.files.upload(file=f, purpose=FilePurpose.ASSISTANTS)

    thread = client.threads.create()
    client.messages.create(
        thread_id=thread.id,
        role="user",
        content=[
            MessageInputTextBlock(text="Identify all components in this architecture diagram."),
            MessageInputImageFileBlock(
                image_file=MessageImageFileParam(
                    file_id=uploaded.id,
                    detail=ImageDetailLevel.HIGH,
                )
            ),
        ],
    )

Example 3: multiple images in one message

    client.messages.create(
        thread_id=thread.id,
        role="user",
        content=[
            MessageInputTextBlock(text="Compare these two product designs and give a preference."),
            MessageInputImageUrlBlock(
                image_url=MessageImageUrlParam(
                    url="https://example.com/design_a.png",
                    detail=ImageDetailLevel.AUTO,
                )
            ),
            MessageInputImageUrlBlock(
                image_url=MessageImageUrlParam(
                    url="https://example.com/design_b.png",
                    detail=ImageDetailLevel.AUTO,
                )
            ),
        ],
    )

Image token cost reference
| Detail level | Tiles | Base cost | Typical 1080×1080 cost |
|---|---|---|---|
| `low` | 1 (fixed crop) | 85 tokens | 85 tokens |
| `high` | depends on dimensions | 85 + 170/tile | ~765 tokens |
| `auto` | server-chosen | n/a | equivalent to low or high |
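The figures in the table can be reproduced with a rough estimator. This is a sketch under stated assumptions, not SDK behaviour: the base cost (85), the per-tile cost (170), and the 512×512 tile size come from this section, while the two resize steps (fit within 2048×2048, then shrink the shortest side to 768 px) are an assumption about how the service prepares high-detail images. The function name `estimate_image_tokens` is hypothetical.

```python
import math

BASE_TOKENS = 85       # base cost quoted in the table above
TOKENS_PER_TILE = 170  # per 512x512 tile, as quoted in the table above


def estimate_image_tokens(width: int, height: int, detail: str = "high") -> int:
    """Estimate image token cost. The resize steps are an assumption, not SDK behaviour."""
    if detail == "low":
        return BASE_TOKENS
    # Assumed step 1: shrink to fit inside 2048x2048, preserving aspect ratio.
    longest = max(width, height)
    if longest > 2048:
        width, height = round(width * 2048 / longest), round(height * 2048 / longest)
    # Assumed step 2: shrink so the shortest side is at most 768 px.
    shortest = min(width, height)
    if shortest > 768:
        width, height = round(width * 768 / shortest), round(height * 768 / shortest)
    # Count the 512x512 tiles needed to cover the resized image.
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return BASE_TOKENS + TOKENS_PER_TILE * tiles


print(estimate_image_tokens(1080, 1080))         # square image from the table
print(estimate_image_tokens(2048, 1024))         # wide image, six tiles after resizing
print(estimate_image_tokens(1080, 1080, "low"))  # low detail is a flat cost
```

Under these assumptions the 1080×1080 case resolves to four tiles and the 2048×1024 case to six, which matches the 765 and 1105 figures quoted in this section. Treat the result as a budgeting aid rather than a billing guarantee.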
Capstone: production pipeline combining all six volumes
The following example shows how the classes from Vols. 1–6 compose into a realistic production pipeline: a research agent that indexes Data Lake documents, forces file search on the run, streams results, handles function tool calls manually, reports token usage, and extracts source citations.
    import json
    import os

    from azure.ai.agents import AgentsClient
    from azure.ai.agents.models import (
        # Tool setup (Vols. 1–4)
        FileSearchTool, BingGroundingTool, FunctionTool,
        # Azure data sources (Vol. 6, section 4)
        VectorStoreDataSource, VectorStoreDataSourceAssetType,
        VectorStoreConfiguration, VectorStoreConfigurations,
        # Expiry + chunking (Vol. 6, section 5)
        VectorStoreExpirationPolicy, VectorStoreExpirationPolicyAnchor,
        VectorStoreStaticChunkingStrategyRequest, VectorStoreStaticChunkingStrategyOptions,
        # Tool choice (Vol. 6, section 6)
        AgentsNamedToolChoice, AgentsNamedToolChoiceType,
        # Streaming (Vols. 5–6)
        AgentEventHandler, ThreadRun, MessageDeltaChunk, RunStep,
        # File search results (Vol. 6, section 7)
        RunStepFileSearchToolCall, RunAdditionalFieldList,
        # Tool dispatch (Vol. 6, section 9)
        RequiredFunctionToolCall, ToolOutput, RunStatus,
        # Usage (Vol. 6, section 3)
        RunCompletionUsage,
        # Auth (Vol. 6, section 1)
        OpenApiAnonymousAuthDetails,
    )
    from azure.identity import DefaultAzureCredential

    client = AgentsClient(
        endpoint=os.environ["AZURE_AI_AGENTS_ENDPOINT"],
        credential=DefaultAzureCredential(),
    )

    # 1. Create vector store from Azure Data Lake (Vol. 6, sections 4–5)
    store = client.vector_stores.create_and_poll(
        name="research-docs",
        expires_after=VectorStoreExpirationPolicy(
            anchor=VectorStoreExpirationPolicyAnchor.LAST_ACTIVE_AT,
            days=30,
        ),
        chunking_strategy=VectorStoreStaticChunkingStrategyRequest(
            static=VectorStoreStaticChunkingStrategyOptions(
                max_chunk_size_tokens=1200,
                chunk_overlap_tokens=200,
            )
        ),
    )

    data_source = VectorStoreDataSource(
        asset_identifier="abfss://research@myaccount.dfs.core.windows.net/papers/",
        asset_type=VectorStoreDataSourceAssetType.URI_ASSET,
    )
    vs_configs = VectorStoreConfigurations(
        store_name="research-papers",
        store_configuration=VectorStoreConfiguration(data_sources=[data_source]),
    )

    file_search = FileSearchTool(
        vector_store_ids=[store.id],
        vector_store_configurations=[vs_configs],
    )

    bing = BingGroundingTool(
        connection_id=os.environ["BING_CONNECTION_ID"],
        freshness="Week",
        count=5,
    )

    def save_note(topic: str, content: str) -> str:
        """Save a research note."""
        print(f" [NOTE] {topic}: {content[:80]}…")
        return "Saved."

    note_tool = FunctionTool({save_note})

    agent = client.create_agent(
        model="gpt-4o",
        name="ResearchAgent",
        instructions=(
            "You are a thorough research assistant. Always search documents first, "
            "then supplement with web search for recent developments. "
            "Save key insights using save_note."
        ),
        tools=[*file_search.definitions, *bing.definitions, *note_tool.definitions],
        tool_resources=file_search.resources,
    )

    # 2. Streaming handler (Vol. 6, section 8) with manual tool dispatch (section 9)
    class ResearchHandler(AgentEventHandler):
        def __init__(self) -> None:
            super().__init__()
            self.run_id = None  # remembered so the run can be fetched after streaming

        def on_message_delta(self, delta: MessageDeltaChunk) -> None:
            for block in delta.delta.content or []:
                if hasattr(block, "text") and block.text:
                    print(block.text.value, end="", flush=True)

        def on_run_step(self, step: RunStep) -> None:
            print(f"\n[STEP] {step.type} → {step.status}")

        def on_thread_run(self, run: ThreadRun) -> None:
            self.run_id = run.id
            print(f"\n[RUN ] {run.status}")
            if run.status == RunStatus.REQUIRES_ACTION:
                outputs = []
                for tc in run.required_action.submit_tool_outputs.tool_calls:
                    if isinstance(tc, RequiredFunctionToolCall) and tc.function.name == "save_note":
                        args = json.loads(tc.function.arguments)
                        outputs.append(ToolOutput(tool_call_id=tc.id, output=save_note(**args)))
                client.runs.submit_tool_outputs_stream(
                    thread_id=run.thread_id,
                    run_id=run.id,
                    tool_outputs=outputs,
                    event_handler=self,
                )

        def on_done(self) -> None:
            print("\n[DONE]")

    # 3. Run with forced file search first (Vol. 6, section 6)
    thread = client.threads.create()
    client.messages.create(
        thread_id=thread.id,
        role="user",
        content="What does the research say about transformer attention patterns? Include recent 2026 findings.",
    )

    handler = ResearchHandler()
    with client.runs.stream(
        thread_id=thread.id,
        agent_id=agent.id,
        tool_choice=AgentsNamedToolChoice(type=AgentsNamedToolChoiceType.FILE_SEARCH),
        event_handler=handler,
    ) as stream:
        stream.until_done()

    # 4. Inspect sources and token usage (Vol. 6, sections 3 and 7)
    final_run = client.runs.get(thread_id=thread.id, run_id=handler.run_id)
    if final_run.usage:
        usage: RunCompletionUsage = final_run.usage
        print(
            f"\nTokens - prompt: {usage.prompt_tokens} "
            f"completion: {usage.completion_tokens} "
            f"total: {usage.total_tokens}"
        )

    steps = client.run_steps.list(
        thread_id=thread.id,
        run_id=final_run.id,
        include=[RunAdditionalFieldList.FILE_SEARCH_CONTENTS],
    )
    print("\nSources:")
    for step in steps:
        if hasattr(step.step_details, "tool_calls"):
            for tc in step.step_details.tool_calls:
                if isinstance(tc, RunStepFileSearchToolCall):
                    for r in tc.file_search.results:
                        print(f" [{r.score:.2f}] {r.file_name}")

    client.delete_agent(agent.id)
    client.vector_stores.delete(vector_store_id=store.id)

Quick-reference table
| Class | Module | First covered | Key use case |
|---|---|---|---|
| `OpenApiAnonymousAuthDetails` | models | Vol. 6 | Public API, no auth |
| `OpenApiManagedAuthDetails` | models | Vol. 6 | Azure-hosted API, managed identity |
| `OpenApiConnectionAuthDetails` | models | Vol. 6 | Third-party API, Foundry connection |
| `OpenApiManagedSecurityScheme` | models | Vol. 6 | AAD audience for managed identity |
| `OpenApiConnectionSecurityScheme` | models | Vol. 6 | Connection ID for Foundry auth |
| `BingGroundingSearchConfiguration` | models | Vol. 6 | Market, freshness, result count |
| `BingGroundingSearchToolParameters` | models | Vol. 6 | Container for search configs |
| `RunCompletionUsage` | models | Vol. 6 | Token cost per run |
| `RunStepCompletionUsage` | models | Vol. 6 | Token cost per step |
| `VectorStoreDataSource` | models | Vol. 6 | Azure asset URI or ID |
| `VectorStoreDataSourceAssetType` | models | Vol. 6 | uri_asset vs id_asset |
| `VectorStoreConfiguration` | models | Vol. 6 | List of data sources |
| `VectorStoreConfigurations` | models | Vol. 6 | Named store + config wrapper |
| `VectorStoreExpirationPolicy` | models | Vol. 6 | TTL: days since last use |
| `VectorStoreStaticChunkingStrategyRequest` | models | Vol. 6 | Fine-grained chunking |
| `VectorStoreStaticChunkingStrategyOptions` | models | Vol. 6 | max_chunk_size_tokens, overlap |
| `VectorStoreAutoChunkingStrategyRequest` | models | Vol. 6 | Default 800/400 chunking |
| `AgentsNamedToolChoice` | models | Vol. 6 | Force a specific tool |
| `AgentsNamedToolChoiceType` | models | Vol. 6 | Tool type enum |
| `AgentsToolChoiceOptionMode` | models | Vol. 6 | none / auto mode |
| `FunctionName` | models | Vol. 6 | Named function for forced call |
| `RunStepFileSearchToolCall` | models | Vol. 6 | File search step record |
| `RunStepFileSearchToolCallResults` | models | Vol. 6 | Results container + ranking |
| `RunStepFileSearchToolCallResult` | models | Vol. 6 | Score + filename + content |
| `FileSearchRankingOptions` | models | Vol. 6 | Ranker + threshold |
| `FileSearchToolCallContent` | models | Vol. 6 | Retrieved text passage |
| `RunAdditionalFieldList` | models | Vol. 6 | include= for file content |
| `AgentStreamEvent` | models | Vol. 6 | Master event taxonomy |
| `RunStreamEvent` | models | Vol. 6 | Run lifecycle events |
| `MessageStreamEvent` | models | Vol. 6 | Message lifecycle events |
| `RunStepStreamEvent` | models | Vol. 6 | Step lifecycle events |
| `ThreadStreamEvent` | models | Vol. 6 | Thread created event |
| `RequiredFunctionToolCall` | models | Vol. 6 | Tool call to dispatch |
| `RequiredFunctionToolCallDetails` | models | Vol. 6 | Name + arguments JSON |
| `SubmitToolOutputsDetails` | models | Vol. 6 | Pending calls container |
| `ThreadMessageOptions` | models | Vol. 6 | Rich / multimodal message |
| `MessageInputTextBlock` | models | Vol. 6 | Text block in message |
| `MessageInputImageFileBlock` | models | Vol. 6 | Uploaded image by file ID |
| `MessageInputImageUrlBlock` | models | Vol. 6 | External image by URL |
| `MessageImageFileParam` | models | Vol. 6 | File ID + detail level |
| `MessageImageUrlParam` | models | Vol. 6 | URL + detail level |
| `ImageDetailLevel` | models | Vol. 6 | auto / low / high |