Microsoft Agent Framework (Python) — Observability & Telemetry
`agent-framework-core` emits OpenTelemetry signals following the GenAI semantic conventions. Every agent run, model call, tool invocation, workflow executor, and edge group is a span; every chat completion emits a duration histogram and a token-usage histogram. Nothing is exported by default — you opt in by calling a helper, setting one env var, or wiring your own OTel providers.
Verified against agent-framework-core==1.2.2 (agent_framework.observability).
Three ways to turn it on
Pick exactly one. Mixing them causes duplicate providers.
1. configure_otel_providers() — batteries included
Auto-wires OTLP exporters from standard OTel env vars and enables instrumentation in one call.
```python
from agent_framework.observability import configure_otel_providers

configure_otel_providers()
```

Set the usual OTel env vars before calling:

```shell
export ENABLE_INSTRUMENTATION=true
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_SERVICE_NAME=my-agent
```

For local debugging, route everything to stdout:
```python
configure_otel_providers(enable_console_exporters=True)
```

2. Bring your own providers — just enable instrumentation
When your app already owns OTel setup (Azure Monitor, Datadog, Honeycomb agents), skip `configure_otel_providers` and call `enable_instrumentation()`:
```python
from azure.monitor.opentelemetry import configure_azure_monitor
from agent_framework.observability import enable_instrumentation

configure_azure_monitor(connection_string=AZURE_MONITOR_CONNECTION)
enable_instrumentation()
```

3. Custom exporters, keeping the framework’s providers
Pass your own exporters and optional Views; framework-managed providers are still created:
```python
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.exporter.otlp.proto.grpc._log_exporter import OTLPLogExporter
from opentelemetry.sdk.metrics.view import View
from agent_framework.observability import configure_otel_providers, create_metric_views

configure_otel_providers(
    exporters=[
        OTLPSpanExporter(endpoint="http://otel-collector:4317"),
        OTLPLogExporter(endpoint="http://otel-collector:4317"),
    ],
    views=create_metric_views(),  # default GenAI + agent_framework views
)
```

What you get
Section titled “What you get”| Span name | Source | Key attributes |
|---|---|---|
invoke_agent | Agent.run() | gen_ai.agent.name, gen_ai.agent.id, gen_ai.conversation.id |
chat | Every chat-client call | gen_ai.request.model, gen_ai.response.model, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, gen_ai.response.finish_reasons |
execute_tool | Each tool invocation | gen_ai.tool.name, gen_ai.tool.call.id, gen_ai.tool.call.arguments, gen_ai.tool.call.result |
workflow.build / workflow.run | Workflow lifecycle | workflow.name, workflow.id |
executor.process | Each executor run | executor.id, executor.type |
edge_group.process | Each edge traversal | edge_group.id, edge_group.type, edge_group.delivery_status |
message.send | Cross-executor messaging | message.source_id, message.target_id, message.type |
All attribute constants live in `agent_framework.observability.OtelAttr` — use them when building span processors and samplers instead of hand-typing attribute strings.
Metrics
Two histograms emit automatically once instrumentation is on:
| Metric | Unit | Dimensions |
|---|---|---|
| `gen_ai.client.operation.duration` | seconds | `gen_ai.operation.name`, `gen_ai.request.model`, `gen_ai.response.model`, `gen_ai.provider.name`, `error.type` |
| `gen_ai.client.token.usage` | tokens | Same, plus `gen_ai.token.type` (`input` \| `output`) |

Plus `agent_framework.function.invocation.duration` per tool call.
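To see how those dimensions slice in practice, here is a pure-Python sketch that aggregates exported token-usage data points by model and token type. `tokens_by_model` and the `data_points` shape are illustrative stand-ins for what an OTLP backend would store, not agent-framework or OTel APIs:

```python
from collections import defaultdict

def tokens_by_model(data_points: list[dict]) -> dict[tuple[str, str], int]:
    """Sum token counts keyed by (response model, token type)."""
    totals: dict[tuple[str, str], int] = defaultdict(int)
    for point in data_points:
        attrs = point["attributes"]
        key = (attrs["gen_ai.response.model"], attrs["gen_ai.token.type"])
        totals[key] += point["value"]
    return dict(totals)

# Three hypothetical histogram data points from two chat calls.
points = [
    {"value": 1200, "attributes": {"gen_ai.response.model": "gpt-4o", "gen_ai.token.type": "input"}},
    {"value": 300,  "attributes": {"gen_ai.response.model": "gpt-4o", "gen_ai.token.type": "output"}},
    {"value": 500,  "attributes": {"gen_ai.response.model": "gpt-4o", "gen_ai.token.type": "input"}},
]

print(tokens_by_model(points))
# {('gpt-4o', 'input'): 1700, ('gpt-4o', 'output'): 300}
```

In a real backend the same grouping is what a `group by gen_ai.response.model, gen_ai.token.type` query computes.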
Logs
The framework forwards its log records to the standard Python logging module. Configure them like any other library logger:
```python
import logging

logging.basicConfig(level=logging.INFO)
logging.getLogger("agent_framework").setLevel(logging.DEBUG)
```

When instrumentation is enabled, logs flow through OTel’s LoggingHandler and ship alongside traces/metrics.
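The same setup can be expressed declaratively with `logging.config.dictConfig`, which scales better once you have several handlers. The handler and formatter names below are arbitrary choices, not framework requirements:

```python
import logging
import logging.config

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "plain": {"format": "%(asctime)s %(name)s %(levelname)s %(message)s"},
    },
    "handlers": {
        "console": {"class": "logging.StreamHandler", "formatter": "plain"},
    },
    "loggers": {
        # Framework logs at DEBUG; everything else inherits root's INFO.
        "agent_framework": {"level": "DEBUG", "handlers": ["console"], "propagate": False},
    },
    "root": {"level": "INFO", "handlers": ["console"]},
}

logging.config.dictConfig(LOGGING)
```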
Sensitive-data events
Prompts and completions are redacted by default. Set `enable_sensitive_data=True` to include full message bodies as span events — useful in dev, avoid in production.
```python
configure_otel_providers(enable_sensitive_data=True)
# or: export ENABLE_SENSITIVE_DATA=true
```

VS Code AI Toolkit / Foundry extension
The AI Toolkit and Azure AI Foundry VS Code extensions listen on a local port for live traces. Point the framework at it:
```python
configure_otel_providers(vs_code_extension_port=4317)
# or: export VS_CODE_EXTENSION_PORT=4317
```

Open the “AI Toolkit: Traces” view — every agent run, tool call, and workflow edge shows up with inputs, outputs, and timing.
Custom spans around business logic
You can still wrap your own code in spans; they slot neatly into the framework’s tree.
```python
from opentelemetry import trace

tracer = trace.get_tracer("myapp.customer")

with tracer.start_as_current_span("process_order", attributes={"order.id": order_id}):
    response = await agent.run(f"Draft reply for order {order_id}")
```

For agent-level wrapping use AgentMiddleware instead — see the middleware page.
Use the framework’s tracer / meter helpers
`agent_framework.observability.get_tracer()` and `get_meter()` return a tracer and meter wired to the same providers the framework uses internally. Using them ensures your custom spans land alongside `invoke_agent` / `chat` / `execute_tool` and inherit the same resource attributes (service name, version):
```python
from agent_framework.observability import get_tracer, get_meter

tracer = get_tracer("myapp.pricing")
meter = get_meter("myapp.pricing")

cache_hits = meter.create_counter(
    "myapp.pricing.cache_hits",
    description="Pricing cache hit count",
    unit="{request}",
)

async def price_quote(sku: str, *, cache, agent):
    with tracer.start_as_current_span("price_quote", attributes={"sku": sku}):
        if cached := cache.get(sku):
            cache_hits.add(1, {"sku.tier": cached.tier})
            return cached
        return await agent.run(f"Price quote for {sku}")
```

Reading framework attributes with OtelAttr
OtelAttr is the single source of truth for every span attribute the framework emits. Use it to filter, sample, or post-process spans without typing string keys:
```python
from opentelemetry import trace
from opentelemetry.context import Context
from opentelemetry.sdk.trace import ReadableSpan, Span, SpanProcessor
from agent_framework.observability import OtelAttr

class TokenSpendProcessor(SpanProcessor):
    """Side-car processor that counts model spend per agent."""

    def __init__(self) -> None:
        self.totals: dict[str, int] = {}

    def on_start(self, span: Span, parent_context: Context | None = None) -> None:
        # No-op — counting happens once spans are finalised in `on_end`.
        return

    def on_end(self, span: ReadableSpan) -> None:
        if span.name != OtelAttr.CHAT_COMPLETION_OPERATION.value:
            return
        attrs = dict(span.attributes or {})
        agent = attrs.get(OtelAttr.AGENT_NAME.value, "unknown")
        in_tok = attrs.get(OtelAttr.INPUT_TOKENS.value, 0)
        out_tok = attrs.get(OtelAttr.OUTPUT_TOKENS.value, 0)
        self.totals[agent] = self.totals.get(agent, 0) + in_tok + out_tok

    def shutdown(self) -> None: ...

    def force_flush(self, timeout_millis: int = 30_000) -> bool:
        return True

# Register alongside the framework's exporters.
trace.get_tracer_provider().add_span_processor(TokenSpendProcessor())
```

Common attribute keys you’ll reach for (full list in `OtelAttr`):
| Use case | Attribute |
|---|---|
| Filter to one agent | `OtelAttr.AGENT_NAME` |
| Group by model | `OtelAttr.RESPONSE_MODEL`, `OtelAttr.REQUEST_MODEL` |
| Track conversation | `OtelAttr.CONVERSATION_ID` |
| Slice tool usage | `OtelAttr.TOOL_NAME`, `OtelAttr.TOOL_TYPE` |
| Workflow drilldown | `OtelAttr.WORKFLOW_NAME`, `OtelAttr.EXECUTOR_ID`, `OtelAttr.EDGE_GROUP_TYPE` |
Sampling — keep traces affordable
OTel samplers run before exporters. Build your own provider with a sampler, then call `enable_instrumentation()` (not `configure_otel_providers`) so the framework spans inherit it:
```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from agent_framework.observability import enable_instrumentation

# 5% head-based sampling for high-volume production; always trace if upstream did.
sampler = ParentBased(root=TraceIdRatioBased(0.05))
provider = TracerProvider(sampler=sampler)
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(provider)

enable_instrumentation()  # do NOT call configure_otel_providers — your provider wins
```

For tail-based sampling (keep every trace that errored or exceeded a latency budget), use the OTel Collector’s `tail_sampling` processor — agent-framework spans expose `error.type` and the duration histograms as standard inputs.
Filtering noisy spans
A wrapping SpanExporter can drop spans before they ever leave the process — useful when you want to capture model calls but not every fast internal `edge_group.process`:
```python
from collections.abc import Sequence

from opentelemetry.sdk.trace import ReadableSpan
from opentelemetry.sdk.trace.export import SpanExporter, SpanExportResult
from agent_framework.observability import OtelAttr

class DropInternalEdgeSpans(SpanExporter):
    """Wrap any exporter to drop edge-group spans below a configurable threshold."""

    def __init__(self, inner: SpanExporter, *, min_duration_ms: float = 50.0) -> None:
        self._inner = inner
        self._min_ns = min_duration_ms * 1_000_000

    def export(self, spans: Sequence[ReadableSpan]) -> SpanExportResult:
        keep = [
            s for s in spans
            if s.name != OtelAttr.EDGE_GROUP_PROCESS_SPAN.value
            or (s.end_time - s.start_time) > self._min_ns
        ]
        return self._inner.export(keep) if keep else SpanExportResult.SUCCESS

    def force_flush(self, timeout_millis: int = 30_000) -> bool:
        # Forward flush so graceful shutdown drains the wrapped exporter.
        return self._inner.force_flush(timeout_millis)

    def shutdown(self) -> None:
        self._inner.shutdown()
```

Wrap your real exporter in `BatchSpanProcessor(DropInternalEdgeSpans(real_exporter))`.
Capping metric cardinality
`create_metric_views()` returns the framework’s default views. Append your own to drop attributes you don’t query on — high-cardinality dimensions (per-user IDs, request IDs) blow up storage costs:
```python
from opentelemetry.sdk.metrics.view import View
from agent_framework.observability import (
    configure_otel_providers,
    create_metric_views,
    OtelAttr,
)

views = create_metric_views() + [
    # Aggregate token usage without conversation_id — keep model + agent dimensions only.
    View(
        instrument_name=OtelAttr.LLM_TOKEN_USAGE.value,
        attribute_keys={
            OtelAttr.RESPONSE_MODEL.value,
            OtelAttr.AGENT_NAME.value,
            OtelAttr.TOKEN_TYPE.value,
        },
    ),
]

configure_otel_providers(views=views)
```

Stamping your own context onto every span
Two ways to attach business attributes — pick the one that matches the lifetime:
- Per-`agent.run(...)` call — pass `additional_properties={"gen_ai.conversation.id": req_id}` so the conversation correlator uses your upstream request ID.
- Per-process resource attributes — set `OTEL_RESOURCE_ATTRIBUTES=team=payments,env=prod,deployment.version=1.4.2` before launching your service. The framework attaches them to every span/metric/log automatically.
```python
import os

os.environ["OTEL_RESOURCE_ATTRIBUTES"] = "team=payments,env=prod,deployment.version=1.4.2"
os.environ["OTEL_SERVICE_NAME"] = "checkout-agent"

from agent_framework.observability import configure_otel_providers

configure_otel_providers()
```

For request-scoped attributes use a span processor that reads from a contextvars.ContextVar you set at the API boundary — that way every framework-emitted span inherits them without changing call sites.
Correlating conversations
The framework stamps `gen_ai.conversation.id` on every span in a single `agent.run(...)` call and across session-scoped runs. In your tracing UI, filter by that attribute to see the full conversation on one trace.
Pass a custom ID through additional_properties on Agent(...) to use your own correlation key (e.g. a request ID from upstream).
Azure Monitor
One-liner for Application Insights:
```python
from azure.monitor.opentelemetry import configure_azure_monitor
from agent_framework.observability import enable_instrumentation

configure_azure_monitor(connection_string="InstrumentationKey=...;IngestionEndpoint=...")
enable_instrumentation()
```

Then query in Application Insights:
- All agent runs — `dependencies | where name == "invoke_agent"`
- Slow chat calls — `dependencies | where name == "chat" | where duration > 3000 | summarize count() by customDimensions.["gen_ai.response.model"]`
- Tool-call volume — `dependencies | where name == "execute_tool" | summarize count() by customDimensions.["gen_ai.tool.name"]`
Production checklist
- Set `OTEL_SERVICE_NAME` and `OTEL_RESOURCE_ATTRIBUTES` (team, env, version).
- Leave `ENABLE_SENSITIVE_DATA` off unless you’re scrubbing downstream.
- Use OTLP gRPC (`OTEL_EXPORTER_OTLP_PROTOCOL=grpc`) to an in-cluster collector; batch there rather than from each pod.
- Wire a `LogRecordExporter` too — tool stack traces and middleware failures land in logs, not spans.
- Filter noise with `create_metric_views()` and pass extra `View(instrument_name="...", aggregation=...)` entries to cap cardinality on high-traffic deployments.
- Disable the user-agent telemetry banner with `AGENT_FRAMEWORK_USER_AGENT_TELEMETRY_DISABLED=true` if corporate policy forbids it.
Disabling everything
`ENABLE_INSTRUMENTATION` defaults to false. Without either the env var or an explicit `enable_instrumentation()` call, no signals are produced — even if the OTel SDK is configured elsewhere.
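If your own startup code needs to branch on the same opt-in, a tiny guard mirroring the documented default looks like this — the helper name and the set of accepted truthy strings are assumptions, not framework code:

```python
import os

def instrumentation_enabled() -> bool:
    # Off unless the env var explicitly opts in, matching the documented default.
    return os.environ.get("ENABLE_INSTRUMENTATION", "false").strip().lower() in {
        "1", "true", "yes",
    }
```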
To hard-disable mid-run, set `OBSERVABILITY_SETTINGS.enable_instrumentation = False` from the same module — but prefer controlling it at startup.