Semantic Kernel Streaming Server (FastAPI, Python)
Latest: 1.41.2 | Updated: April 2026 | Last verified: 2025-11
This example streams staged events (invocation start, final result) from a Semantic Kernel workflow; true token-level streaming depends on whether your SK function and underlying chat service support it.
```python
import json
import os

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion

app = FastAPI()

kernel = sk.Kernel()
kernel.add_service(
    OpenAIChatCompletion(
        service_id="openai",
        ai_model_id="gpt-4o-mini",
        api_key=os.environ["OPENAI_API_KEY"],
    )
)
fn = kernel.add_function(
    plugin_name="summaries",
    function_name="summarize",
    prompt="Summarize: {{$input}} in 3 bullets",
)

@app.get("/stream")
async def stream(q: str):
    async def run():
        # Stage 1: tell the client the invocation has started
        yield 'data: {"event": "invoke"}\n\n'
        # Stage 2: run the kernel function and emit the final result
        result = await kernel.invoke(fn, input=q)
        yield f"data: {json.dumps({'final': str(result)})}\n\n"

    return StreamingResponse(run(), media_type="text/event-stream")
```
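On the client side, the stream above arrives as SSE frames: one or more `data:` lines terminated by a blank line. A minimal stdlib-only sketch of that parsing step (the helper name `iter_sse_events` is our own, not part of FastAPI or SK):

```python
import json

def iter_sse_events(lines):
    """Parse an iterable of SSE lines into decoded JSON payloads.

    Consecutive 'data:' lines belonging to one event are joined with
    newlines, per the SSE format; a blank line terminates an event.
    """
    buf = []
    for line in lines:
        if line.startswith("data:"):
            buf.append(line[len("data:"):].strip())
        elif line == "" and buf:
            yield json.loads("\n".join(buf))
            buf = []
    if buf:  # flush a trailing event with no final blank line
        yield json.loads("\n".join(buf))
```

Feed it lines from any streaming HTTP client (e.g. `httpx`'s `aiter_lines()`); it yields one decoded dict per event, such as `{"event": "invoke"}` followed by the `{"final": ...}` payload.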
Deployment

Dockerfile
```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8080
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8080"]
```
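The Dockerfile copies a requirements.txt that is not shown on this page; a minimal, hypothetical version for this app would list (pin exact versions in production):

```text
fastapi
uvicorn
semantic-kernel
```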
Kubernetes

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sk-stream
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sk-stream
  template:
    metadata:
      labels:
        app: sk-stream
    spec:
      containers:
        - name: app
          image: ghcr.io/yourorg/sk-stream:latest
          env:
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: openai-secrets
                  key: apiKey
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: sk-stream
spec:
  selector:
    app: sk-stream
  ports:
    - port: 80
      targetPort: 8080
```
Security Best Practices

- Authenticate SSE clients; implement rate limiting and timeouts
- Store API keys in secret managers; avoid printing model outputs in logs
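The rate-limiting advice above can be sketched as an in-memory token bucket keyed by client (the `TokenBucket` and `allow` names are illustrative, not from any library; a multi-replica deployment like the one above would need a shared store such as Redis instead):

```python
import time

class TokenBucket:
    """Allow at most `rate` requests per `per` seconds, per client."""

    def __init__(self, rate, per):
        self.rate = rate
        self.per = per
        self.state = {}  # client_id -> (tokens_remaining, last_check_time)

    def allow(self, client_id, now=None):
        """Return True if this client's request is within its budget."""
        now = time.monotonic() if now is None else now
        tokens, last = self.state.get(client_id, (float(self.rate), now))
        # Refill proportionally to elapsed time, capped at the bucket size.
        tokens = min(float(self.rate), tokens + (now - last) * (self.rate / self.per))
        if tokens < 1.0:
            self.state[client_id] = (tokens, now)
            return False
        self.state[client_id] = (tokens - 1.0, now)
        return True
```

In the FastAPI app this could back a dependency that returns HTTP 429 when `allow()` is False, with the client keyed by API token or source IP.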