Haystack Streaming Server (FastAPI)
Haystack Streaming Server (FastAPI)
Section titled “Haystack Streaming Server (FastAPI)”Latest: 2.27.0 | Updated: April 2026 Last verified: 2025-11
This example streams simple progress events and a final answer. Token-level streaming depends on the underlying generator.
from fastapi import FastAPIfrom fastapi.responses import StreamingResponse
app = FastAPI()
@app.get("/stream")def stream(q: str): def gen(): yield "data: {\"event\": \"start\"}\n\n" # TODO: integrate Haystack pipeline execution here yield "data: {\"event\": \"retrieval\"}\n\n" yield "data: {\"event\": \"generation\"}\n\n" yield f"data: {{\"final\": \"answer for {q}\"}}\n\n" return StreamingResponse(gen(), media_type="text/event-stream")Deployment
Section titled “Deployment”Dockerfile
Section titled “Dockerfile”FROM python:3.11-slimWORKDIR /appCOPY requirements.txt .RUN pip install -r requirements.txtCOPY . .EXPOSE 8080CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8080"]Kubernetes
Section titled “Kubernetes”apiVersion: apps/v1kind: Deploymentmetadata: { name: haystack-stream }spec: replicas: 2 selector: { matchLabels: { app: haystack-stream } } template: metadata: { labels: { app: haystack-stream } } spec: containers: - name: app image: ghcr.io/yourorg/haystack-stream:latest ports: [{ containerPort: 8080 }]---apiVersion: v1kind: Servicemetadata: { name: haystack-stream }spec: { selector: { app: haystack-stream }, ports: [{ port: 80, targetPort: 8080 }] }Security Best Practices
Section titled “Security Best Practices”- Authenticate clients, limit burst traffic, set server timeouts
- Avoid logging full prompts/responses; mask PII