Microsoft Agent Framework (Python) — Observability & Telemetry

agent-framework-core emits OpenTelemetry signals following the GenAI semantic conventions. Every agent run, model call, tool invocation, workflow executor, and edge group is a span; every chat completion emits a duration histogram and a token-usage histogram. Nothing is exported by default — you opt in by calling a helper, setting one env var, or wiring your own OTel providers.

Verified against agent-framework-core==1.2.2 (agent_framework.observability).

There are three setup options. Pick exactly one; mixing them causes duplicate providers.

1. configure_otel_providers() — batteries included

Auto-wires OTLP exporters from standard OTel env vars and enables instrumentation in one call.

```python
from agent_framework.observability import configure_otel_providers

configure_otel_providers()
```

Set the usual OTel env vars before calling:

```shell
export ENABLE_INSTRUMENTATION=true
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_SERVICE_NAME=my-agent
```

For local debugging, route everything to stdout:

```python
configure_otel_providers(enable_console_exporters=True)
```

2. Bring your own providers — just enable instrumentation

When your app already owns OTel setup (Azure Monitor, Datadog, Honeycomb agents), skip configure_otel_providers and call enable_instrumentation():

```python
from azure.monitor.opentelemetry import configure_azure_monitor
from agent_framework.observability import enable_instrumentation

configure_azure_monitor(connection_string=AZURE_MONITOR_CONNECTION)
enable_instrumentation()
```

3. Custom exporters, keeping the framework’s providers

Pass your own exporters and optional Views; framework-managed providers are still created:

```python
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.exporter.otlp.proto.grpc._log_exporter import OTLPLogExporter
from agent_framework.observability import configure_otel_providers, create_metric_views

configure_otel_providers(
    exporters=[
        OTLPSpanExporter(endpoint="http://otel-collector:4317"),
        OTLPLogExporter(endpoint="http://otel-collector:4317"),
    ],
    views=create_metric_views(),  # default GenAI + agent_framework views
)
```
| Span name | Source | Key attributes |
| --- | --- | --- |
| `invoke_agent` | `Agent.run()` | `gen_ai.agent.name`, `gen_ai.agent.id`, `gen_ai.conversation.id` |
| `chat` | Every chat-client call | `gen_ai.request.model`, `gen_ai.response.model`, `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`, `gen_ai.response.finish_reasons` |
| `execute_tool` | Each tool invocation | `gen_ai.tool.name`, `gen_ai.tool.call.id`, `gen_ai.tool.call.arguments`, `gen_ai.tool.call.result` |
| `workflow.build` / `workflow.run` | Workflow lifecycle | `workflow.name`, `workflow.id` |
| `executor.process` | Each executor run | `executor.id`, `executor.type` |
| `edge_group.process` | Each edge traversal | `edge_group.id`, `edge_group.type`, `edge_group.delivery_status` |
| `message.send` | Cross-executor messaging | `message.source_id`, `message.target_id`, `message.type` |

All attribute constants live in agent_framework.observability.OtelAttr — use them to build span processors and samplers rather than typing strings.

Two histograms emit automatically once instrumentation is on:

| Metric | Unit | Dimensions |
| --- | --- | --- |
| `gen_ai.client.operation.duration` | seconds | `gen_ai.operation.name`, `gen_ai.request.model`, `gen_ai.response.model`, `gen_ai.provider.name`, `error.type` |
| `gen_ai.client.token.usage` | tokens | Same, plus `gen_ai.token.type` (`input` or `output`) |

Plus agent_framework.function.invocation.duration per tool call.

Agent-framework forwards to the standard Python logging module. Configure it with the usual logging APIs (basicConfig, dictConfig, or per-logger levels):

```python
import logging

logging.basicConfig(level=logging.INFO)
logging.getLogger("agent_framework").setLevel(logging.DEBUG)
```

When instrumentation is enabled, logs flow through OTel’s LoggingHandler and ship alongside traces/metrics.
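If your app configures logging declaratively, the same setup can be expressed with dictConfig; a minimal sketch in which the handler and logger names are illustrative:

```python
import logging.config

# Equivalent declarative setup: console handler at INFO,
# the framework's logger bumped to DEBUG.
logging.config.dictConfig({
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {"class": "logging.StreamHandler", "level": "INFO"},
    },
    "loggers": {
        "agent_framework": {"level": "DEBUG", "handlers": ["console"]},
    },
})
```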

Prompts and completions are redacted by default. Flip enable_sensitive_data=True to include full message bodies as span events — useful in dev, avoid in production.

```python
configure_otel_providers(enable_sensitive_data=True)
# or: export ENABLE_SENSITIVE_DATA=true
```

The AI Toolkit and Azure AI Foundry VS Code extensions listen on a local port for live traces. Point the framework at it:

```python
configure_otel_providers(vs_code_extension_port=4317)
# or: export VS_CODE_EXTENSION_PORT=4317
```

Open the “AI Toolkit: Traces” view — every agent run, tool call, and workflow edge shows up with inputs, outputs, and timing.

You can still wrap your own code in spans; they slot neatly into the framework’s tree.

```python
from opentelemetry import trace

tracer = trace.get_tracer("myapp.customer")

with tracer.start_as_current_span("process_order", attributes={"order.id": order_id}):
    response = await agent.run(f"Draft reply for order {order_id}")
```

For agent-level wrapping use AgentMiddleware instead — see the middleware page.

Use the framework’s tracer / meter helpers

agent_framework.observability.get_tracer() and get_meter() return providers that share the same configuration the framework uses internally. Calling them ensures your custom spans land alongside invoke_agent / chat / execute_tool and inherit the same resource attributes (service name, version):

```python
from agent_framework.observability import get_tracer, get_meter

tracer = get_tracer("myapp.pricing")
meter = get_meter("myapp.pricing")

cache_hits = meter.create_counter(
    "myapp.pricing.cache_hits",
    description="Pricing cache hit count",
    unit="{request}",
)

async def price_quote(sku: str, *, cache, agent):
    with tracer.start_as_current_span("price_quote", attributes={"sku": sku}):
        if cached := cache.get(sku):
            cache_hits.add(1, {"sku.tier": cached.tier})
            return cached
        return await agent.run(f"Price quote for {sku}")
```

Reading framework attributes with OtelAttr

OtelAttr is the single source of truth for every span attribute the framework emits. Use it to filter, sample, or post-process spans without typing string keys:

```python
from opentelemetry import trace
from opentelemetry.context import Context
from opentelemetry.sdk.trace import ReadableSpan, Span, SpanProcessor
from agent_framework.observability import OtelAttr

class TokenSpendProcessor(SpanProcessor):
    """Side-car processor that counts model spend per agent."""

    def __init__(self) -> None:
        self.totals: dict[str, int] = {}

    def on_start(self, span: Span, parent_context: Context | None = None) -> None:
        # No-op — counting happens once spans are finalised in `on_end`.
        return

    def on_end(self, span: ReadableSpan) -> None:
        if span.name != OtelAttr.CHAT_COMPLETION_OPERATION.value:
            return
        attrs = dict(span.attributes or {})
        agent = attrs.get(OtelAttr.AGENT_NAME.value, "unknown")
        in_tok = attrs.get(OtelAttr.INPUT_TOKENS.value, 0)
        out_tok = attrs.get(OtelAttr.OUTPUT_TOKENS.value, 0)
        self.totals[agent] = self.totals.get(agent, 0) + in_tok + out_tok

    def shutdown(self) -> None: ...

    def force_flush(self, timeout_millis: int = 30_000) -> bool:
        return True

# Register alongside the framework's exporters.
trace.get_tracer_provider().add_span_processor(TokenSpendProcessor())
```

Common attribute keys you’ll reach for (full list in OtelAttr):

| Use case | Attribute |
| --- | --- |
| Filter to one agent | `OtelAttr.AGENT_NAME` |
| Group by model | `OtelAttr.RESPONSE_MODEL`, `OtelAttr.REQUEST_MODEL` |
| Track conversation | `OtelAttr.CONVERSATION_ID` |
| Slice tool usage | `OtelAttr.TOOL_NAME`, `OtelAttr.TOOL_TYPE` |
| Workflow drilldown | `OtelAttr.WORKFLOW_NAME`, `OtelAttr.EXECUTOR_ID`, `OtelAttr.EDGE_GROUP_TYPE` |

OTel samplers run before exporters; pair them with configure_otel_providers(...) so the framework spans inherit your sampler:

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from agent_framework.observability import enable_instrumentation

# 5% head-based sampling for high-volume production; always trace if upstream did.
sampler = ParentBased(root=TraceIdRatioBased(0.05))
provider = TracerProvider(sampler=sampler)
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(provider)

enable_instrumentation()  # do NOT call configure_otel_providers — your provider wins
```

For tail-based sampling (keep every trace that errored or exceeded a latency budget), use the OTel Collector’s tail_sampling processor — agent-framework spans expose error.type and the duration histograms as standard inputs.
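A minimal Collector fragment for that policy, keeping errored traces plus anything slower than 3 seconds (the threshold and wait values are illustrative):

```yaml
processors:
  tail_sampling:
    decision_wait: 10s
    policies:
      - name: keep-errors
        type: status_code
        status_code: {status_codes: [ERROR]}
      - name: keep-slow
        type: latency
        latency: {threshold_ms: 3000}
```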

A wrapper around your SpanExporter can drop spans before the backend ever sees them — useful when you want to capture model calls but not every internal edge_group.process:

```python
from opentelemetry.sdk.trace import ReadableSpan
from opentelemetry.sdk.trace.export import SpanExporter, SpanExportResult
from agent_framework.observability import OtelAttr

class DropInternalEdgeSpans(SpanExporter):
    """Wrap any exporter to drop edge-group spans below a configurable threshold."""

    def __init__(self, inner: SpanExporter, *, min_duration_ms: float = 50.0) -> None:
        self._inner = inner
        self._min_ns = min_duration_ms * 1_000_000

    def export(self, spans):
        keep = [
            s for s in spans
            if s.name != OtelAttr.EDGE_GROUP_PROCESS_SPAN.value
            or (s.end_time - s.start_time) > self._min_ns
        ]
        return self._inner.export(keep) if keep else SpanExportResult.SUCCESS

    def force_flush(self, timeout_millis: int = 30_000) -> bool:
        # Forward flush so graceful shutdown drains the wrapped exporter.
        return self._inner.force_flush(timeout_millis)

    def shutdown(self) -> None:
        self._inner.shutdown()
```

Wrap your real exporter in BatchSpanProcessor(DropInternalEdgeSpans(real_exporter)).

create_metric_views() returns the framework’s default views. Append your own to drop attributes you don’t query on — high-cardinality dimensions (per-user IDs, request IDs) blow up storage costs:

```python
from opentelemetry.sdk.metrics.view import View
from agent_framework.observability import (
    OtelAttr,
    configure_otel_providers,
    create_metric_views,
)

views = create_metric_views() + [
    # Aggregate token usage without conversation_id — keep model + agent dimensions only.
    View(
        instrument_name=OtelAttr.LLM_TOKEN_USAGE.value,
        attribute_keys={
            OtelAttr.RESPONSE_MODEL.value,
            OtelAttr.AGENT_NAME.value,
            OtelAttr.TOKEN_TYPE.value,
        },
    ),
]
configure_otel_providers(views=views)
```

Two ways to attach business attributes — pick the one that matches lifetime:

  1. Per-agent.run(...) call — pass additional_properties={"gen_ai.conversation.id": req_id} so the conversation correlator uses your upstream request ID.
  2. Per-process resource attributes — set OTEL_RESOURCE_ATTRIBUTES=team=payments,env=prod,deployment.version=1.4.2 before launching your service. The framework attaches them to every span/metric/log automatically.
```python
import os

os.environ["OTEL_RESOURCE_ATTRIBUTES"] = "team=payments,env=prod,deployment.version=1.4.2"
os.environ["OTEL_SERVICE_NAME"] = "checkout-agent"

from agent_framework.observability import configure_otel_providers

configure_otel_providers()
```

For request-scoped attributes use a span processor that reads from a contextvars.ContextVar you set at the API boundary — that way every framework-emitted span inherits them without changing call sites.

The framework stamps gen_ai.conversation.id on every span in a single agent.run(...) call and across session-scoped runs. In your tracing UI, filter by that attribute to see the full conversation on one trace.

Pass a custom ID through additional_properties on Agent(...) to use your own correlation key (e.g. a request ID from upstream).

One-liner for Application Insights:

```python
from azure.monitor.opentelemetry import configure_azure_monitor
from agent_framework.observability import enable_instrumentation

configure_azure_monitor(connection_string="InstrumentationKey=...;IngestionEndpoint=...")
enable_instrumentation()
```

Then query in Application Insights:

  • All agent runs: `dependencies | where name == "invoke_agent"`
  • Slow chat calls: `dependencies | where name == "chat" | where duration > 3000 | summarize count() by tostring(customDimensions["gen_ai.response.model"])`
  • Tool-call volume: `dependencies | where name == "execute_tool" | summarize count() by tostring(customDimensions["gen_ai.tool.name"])`

Production checklist:
  • Set OTEL_SERVICE_NAME and OTEL_RESOURCE_ATTRIBUTES (team, env, version).
  • Leave ENABLE_SENSITIVE_DATA off unless you’re scrubbing downstream.
  • Use OTLP gRPC (OTEL_EXPORTER_OTLP_PROTOCOL=grpc) to an in-cluster collector; batch there rather than from each pod.
  • Wire a LogRecordExporter too — tool stack traces and middleware failures land in logs, not spans.
  • Filter noise with create_metric_views() and pass extra View(instrument_name="...", aggregation=...) entries to cap cardinality on high-traffic deployments.
  • Disable the user-agent telemetry banner with AGENT_FRAMEWORK_USER_AGENT_TELEMETRY_DISABLED=true if corporate policy forbids it.

ENABLE_INSTRUMENTATION defaults to false. Without either env var or enable_instrumentation(), no signals are produced — even if OTel SDK is configured elsewhere.

To hard-disable mid-run, set OBSERVABILITY_SETTINGS.enable_instrumentation = False on the settings object in the same module, but prefer controlling this at startup.