LLM SDK Compatibility Guide (Late 2025)
This document details Agent-Gantry’s compatibility with major Python LLM SDKs as of late 2025.
Overview
Agent-Gantry is designed to work seamlessly with leading LLM providers. This guide covers:
- Installation and setup
- Client initialization patterns
- Key endpoint methods
- Integration examples with Agent-Gantry
- Known incompatibilities and workarounds
Supported LLM Providers
| Provider | Package | Status | Notes |
|---|---|---|---|
| OpenAI | openai | ✅ Full Support | Primary integration |
| Azure OpenAI | openai | ✅ Full Support | Uses AzureOpenAI client |
| Anthropic | anthropic | ✅ Full Support | Claude models |
| Google GenAI | google-genai | ✅ Full Support | For prototyping |
| Google Vertex AI | google-cloud-aiplatform | ✅ Full Support | Production recommended |
| Mistral | mistralai | ✅ Full Support | Including agents API |
| Groq | groq | ✅ Full Support | Fast inference |
| OpenRouter | openai | ✅ Full Support | Via base_url override |
Installation
Install Agent-Gantry with LLM provider support:
# Install with all LLM providers (quotes keep shells such as zsh from expanding the brackets)
pip install "agent-gantry[llm-providers]"
# Or install specific providers
pip install "agent-gantry[openai]"
pip install "agent-gantry[anthropic]"
pip install "agent-gantry[google-genai]"
pip install "agent-gantry[google-vertexai]"
pip install "agent-gantry[mistral]"
pip install "agent-gantry[groq]"
# Install all dependencies including LLM providers
pip install "agent-gantry[all]"
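After installing, it can help to confirm which provider SDKs are actually importable in the current environment. The following is a minimal sketch using only the standard library. The module names are the import names used later in this guide, and the helper names `is_importable` and `installed_providers` are hypothetical, not part of Agent-Gantry.

```python
from importlib.util import find_spec

# Import names used elsewhere in this guide, keyed by provider extra.
PROVIDER_MODULES = {
    "openai": "openai",
    "anthropic": "anthropic",
    "google-genai": "google.genai",
    "google-vertexai": "vertexai",
    "mistral": "mistralai",
    "groq": "groq",
}


def is_importable(module: str) -> bool:
    """Return True if `module` can be located on the import path."""
    try:
        return find_spec(module) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (for example `google`) is missing.
        return False


def installed_providers() -> dict[str, bool]:
    """Map each provider name to whether its SDK is importable."""
    return {name: is_importable(mod) for name, mod in PROVIDER_MODULES.items()}


if __name__ == "__main__":
    for name, ok in installed_providers().items():
        print(f"{name:16} {'installed' if ok else 'missing'}")
```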
OpenAI
Package
pip install "openai>=1.0.0"
Client Initialization
from openai import OpenAI
# Standard initialization
client = OpenAI(api_key="your-api-key")
# Or with environment variable (OPENAI_API_KEY)
client = OpenAI()
Key Methods
Chat Completions
response = client.chat.completions.create(
model="gpt-4o",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"}
]
)
print(response.choices[0].message.content)
Responses API (Modern)
# The newer Responses API with simpler tool format
response = client.responses.create(
model="gpt-4o",
instructions="You are a helpful assistant.",
input="Hello!"
)
print(response.output_text)
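Realtime API (Beta)
# WebSocket-based realtime conversations. `async with` needs the async
# client, and the websocket dependency comes from `pip install "openai[realtime]"`.
from openai import AsyncOpenAI
async_client = AsyncOpenAI()
async with async_client.beta.realtime.connect(model="gpt-4o-realtime-preview") as conn:
    # audio_data is expected to be base64-encoded audio
    await conn.send({"type": "input_audio_buffer.append", "audio": audio_data})
    async for event in conn:
        if event.type == "response.audio.delta":
            # Handle audio response
            pass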
Audio Transcriptions
with open("audio.mp3", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file
    )
print(transcription.text)
Agent-Gantry Integration
from openai import OpenAI
from agent_gantry import AgentGantry
# Initialize both
client = OpenAI()
gantry = AgentGantry()
@gantry.register(tags=["utility"])
def get_current_weather(location: str) -> str:
    """Get weather for a location."""
    return f"Sunny, 72°F in {location}"
await gantry.sync()
# Get tools in OpenAI Chat Completions format (default)
tools = await gantry.retrieve_tools("What's the weather?", limit=5)
# Use with OpenAI Chat Completions API
response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "What's the weather in SF?"}],
tools=tools
)
# Or get tools in OpenAI Responses API format
tools_responses = await gantry.retrieve_tools(
"What's the weather?",
limit=5,
dialect="openai_responses"
)
# Use with OpenAI Responses API
response = client.responses.create(
model="gpt-4o",
input="What's the weather in SF?",
tools=tools_responses
)
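The example above stops at sending tools to the model. When the model answers with tool calls, the application still has to run them and return the results as `tool` messages. The sketch below shows one way to do that for the Chat Completions format, working on tool calls in plain dict form (for example, from `response.choices[0].message.model_dump()["tool_calls"]`). The `TOOL_IMPLS` registry and `run_tool_calls` helper are hypothetical names for illustration. Agent-Gantry may provide its own execution path, which this does not attempt to reproduce.

```python
import json


def get_current_weather(location: str) -> str:
    """Same toy tool as in the integration example."""
    return f"Sunny, 72°F in {location}"


# Hypothetical registry mapping tool names to local callables.
TOOL_IMPLS = {"get_current_weather": get_current_weather}


def run_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Execute Chat Completions tool calls and build `tool` role messages."""
    messages = []
    for call in tool_calls:
        name = call["function"]["name"]
        impl = TOOL_IMPLS.get(name)
        if impl is None:
            content = f"Error: unknown tool {name!r}"
        else:
            try:
                # Arguments arrive as a JSON-encoded string.
                args = json.loads(call["function"]["arguments"] or "{}")
                content = str(impl(**args))
            except (json.JSONDecodeError, TypeError) as exc:
                content = f"Error: bad arguments for {name!r}: {exc}"
        messages.append(
            {"role": "tool", "tool_call_id": call["id"], "content": content}
        )
    return messages
```

The returned messages would be appended after the assistant message that requested the calls, followed by another `client.chat.completions.create` call so the model can use the results.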
Azure OpenAI
Package
pip install "openai>=1.0.0"
Client Initialization
from openai import AzureOpenAI
client = AzureOpenAI(
api_key="your-azure-api-key",
api_version="2024-10-21",
azure_endpoint="https://your-resource.openai.azure.com"
)
Key Methods
Chat Completions
response = client.chat.completions.create(
model="gpt-4o", # Your deployment name
messages=[
{"role": "user", "content": "Hello!"}
]
)
Responses API
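# The Responses API uses a different format for tools and responses.
# On Azure it may need a newer api_version than Chat Completions
# (2025-03-01-preview or later at the time of writing; check the Azure docs).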
response = client.responses.create(
model="gpt-4o", # Your deployment name
input="Analyze this document and extract key points",
tools=[
{
"type": "function",
"name": "extract_key_points",
"description": "Extract key points from text",
"parameters": {
"type": "object",
"properties": {
"text": {"type": "string"}
},
"required": ["text"]
}
}
]
)
# Handle function calls from Responses API
for output in response.output:
    if output.type == "function_call":
        print(f"Tool called: {output.name}")
        print(f"Arguments: {output.arguments}")
        # Execute tool and send result back
        result = client.responses.create(
            model="gpt-4o",
            previous_response_id=response.id,
            input=[{
                "type": "function_call_output",
                "call_id": output.call_id,
                "output": "your tool result here"
            }]
        )
Agent-Gantry Integration
from openai import AzureOpenAI
from agent_gantry import AgentGantry
client = AzureOpenAI(
api_key="your-key",
api_version="2024-10-21",
azure_endpoint="https://your-resource.openai.azure.com"
)
gantry = AgentGantry()
# Register tools and use with Azure OpenAI
tools = await gantry.retrieve_tools("your query")
response = client.chat.completions.create(
model="your-deployment",
messages=[{"role": "user", "content": "query"}],
tools=tools
)
Anthropic (Claude)
Package
pip install "anthropic>=0.40.0"
Client Initialization
from anthropic import Anthropic
# Standard initialization
client = Anthropic(api_key="your-api-key")
# Or with environment variable (ANTHROPIC_API_KEY)
client = Anthropic()
Key Methods
Messages
response = client.messages.create(
model="claude-sonnet-4-20250514",
max_tokens=1024,
messages=[
{"role": "user", "content": "Hello, Claude!"}
]
)
print(response.content[0].text)
Prompt Caching
# Cache long system prompts for efficiency. Recent SDK releases accept
# cache_control on the standard Messages API; older releases exposed it
# through client.beta.prompt_caching.messages.create instead.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system=[{
        "type": "text",
        "text": "Long system prompt...",
        "cache_control": {"type": "ephemeral"}
    }],
    messages=[{"role": "user", "content": "Question?"}]
)
Agent-Gantry Integration
from anthropic import Anthropic
from agent_gantry import AgentGantry
client = Anthropic()
gantry = AgentGantry()
@gantry.register
def search_database(query: str) -> str:
    """Search the database."""
    return f"Results for: {query}"
await gantry.sync()
# Retrieve tools in the default OpenAI format
openai_tools = await gantry.retrieve_tools("search for data")
# Transform OpenAI format to Anthropic format
anthropic_tools = [
    {
        "name": tool["function"]["name"],
        "description": tool["function"]["description"],
        "input_schema": tool["function"]["parameters"]
    }
    for tool in openai_tools
]
response = client.messages.create(
model="claude-sonnet-4-20250514",
max_tokens=1024,
tools=anthropic_tools,
messages=[{"role": "user", "content": "Search for user data"}]
)
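If Claude decides to use a tool, the response content contains `tool_use` blocks, and the results go back as `tool_result` blocks in a user message. The following is a minimal sketch of that step. The `TOOL_IMPLS` registry and `build_tool_results` helper are hypothetical, and `SimpleNamespace` objects stand in for `response.content` so the example runs without an API call.

```python
from types import SimpleNamespace


def search_database(query: str) -> str:
    """Same toy tool as in the integration example."""
    return f"Results for: {query}"


# Hypothetical registry mapping tool names to local callables.
TOOL_IMPLS = {"search_database": search_database}


def build_tool_results(content_blocks) -> list[dict]:
    """Run every `tool_use` block and return matching `tool_result` blocks."""
    results = []
    for block in content_blocks:
        if block.type != "tool_use":
            continue  # text blocks need no tool execution
        impl = TOOL_IMPLS.get(block.name)
        if impl is None:
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": f"Unknown tool: {block.name}",
                "is_error": True,
            })
            continue
        results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": str(impl(**block.input)),
        })
    return results


# Stand-in for response.content from client.messages.create(...).
fake_content = [
    SimpleNamespace(type="text", text="Let me look that up."),
    SimpleNamespace(type="tool_use", id="toolu_01", name="search_database",
                    input={"query": "user data"}),
]
follow_up = {"role": "user", "content": build_tool_results(fake_content)}
```

In a real exchange, the assistant turn (`response.content`) and then `follow_up` would be appended to `messages` before calling `client.messages.create` again with the same `tools`.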
Google GenAI
Note: For production workloads, use Vertex AI instead. Google GenAI is recommended for prototyping.
Package
pip install "google-genai>=1.0.0"
Client Initialization
from google import genai
# Standard initialization with API key
client = genai.Client(api_key="your-api-key")
# Or with environment variable (GOOGLE_API_KEY)
client = genai.Client()
Key Methods
Generate Content
response = client.models.generate_content(
model="gemini-2.0-flash",
contents="Hello, Gemini!"
)
print(response.text)
Streaming
for chunk in client.models.generate_content_stream(
    model="gemini-2.0-flash",
    contents="Write a story about a robot."
):
    print(chunk.text, end="")
Agent-Gantry Integration
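from google import genai
from google.genai import types
from agent_gantry import AgentGantry
client = genai.Client()
gantry = AgentGantry()
@gantry.register
def calculate(a: float, b: float, operation: str) -> str:
    """Perform a math operation on two numbers."""
    if operation == "add":
        return str(a + b)
    elif operation == "subtract":
        return str(a - b)
    elif operation == "multiply":
        return str(a * b)
    elif operation == "divide":
        return str(a / b) if b != 0 else "Error: Division by zero"
    return "Error: Unknown operation"
await gantry.sync()
# Get tools in the default OpenAI Chat Completions format
tools = await gantry.retrieve_tools("calculate something")
# Convert to Gemini function declarations. Schemas that use JSON Schema
# keywords Gemini does not support may need pruning first.
declarations = [
    types.FunctionDeclaration(
        name=tool["function"]["name"],
        description=tool["function"]["description"],
        parameters=tool["function"]["parameters"]
    )
    for tool in tools
]
# google-genai takes tools through the config argument, not a tools= keyword
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="What is 2 + 2?",
    config=types.GenerateContentConfig(
        tools=[types.Tool(function_declarations=declarations)]
    )
)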
Google Vertex AI
Recommended for production workloads.
Package
pip install "google-cloud-aiplatform>=1.70.0"
Setup
import vertexai
from vertexai.generative_models import GenerativeModel
# Initialize Vertex AI
vertexai.init(
project="your-project-id",
location="us-central1"
)
Key Methods
Generate Content
model = GenerativeModel("gemini-2.0-flash")
response = model.generate_content("Hello, Vertex AI!")
print(response.text)
Chat Sessions
model = GenerativeModel("gemini-2.0-flash")
chat = model.start_chat()
response = chat.send_message("What's the weather like?")
print(response.text)
# Continue the conversation
response = chat.send_message("And tomorrow?")
print(response.text)
Agent-Gantry Integration
import vertexai
from vertexai.generative_models import GenerativeModel, Tool, FunctionDeclaration
from agent_gantry import AgentGantry
vertexai.init(project="your-project", location="us-central1")
gantry = AgentGantry()
@gantry.register
def get_stock_price(symbol: str) -> str:
    """Get current stock price."""
    return f"${symbol}: $150.00"
await gantry.sync()
# Retrieve tools and convert for Vertex AI
openai_tools = await gantry.retrieve_tools("stock price")
# Convert to Vertex AI FunctionDeclaration format
vertex_functions = [
FunctionDeclaration(
name=tool["function"]["name"],
description=tool["function"]["description"],
parameters=tool["function"]["parameters"]
)
for tool in openai_tools
]
model = GenerativeModel(
"gemini-2.0-flash",
tools=[Tool(function_declarations=vertex_functions)]
)
response = model.generate_content("What's AAPL stock price?")
Mistral
Package
pip install "mistralai>=1.0.0"
Client Initialization
from mistralai import Mistral
client = Mistral(api_key="your-api-key")
Key Methods
Chat Complete
response = client.chat.complete(
model="mistral-large-latest",
messages=[
{"role": "user", "content": "Hello, Mistral!"}
]
)
print(response.choices[0].message.content)
Fill-in-the-Middle (FIM)
# Code completion
response = client.fim.complete(
model="codestral-latest",
prompt="def fibonacci(n):",
suffix=" return result"
)
print(response.choices[0].message.content)
Agents
# Using Mistral's agent capabilities
response = client.agents.complete(
agent_id="your-agent-id",
messages=[
{"role": "user", "content": "Help me with a task"}
]
)
Agent-Gantry Integration
from mistralai import Mistral
from agent_gantry import AgentGantry
client = Mistral(api_key="your-key")
gantry = AgentGantry()
@gantry.register
def send_notification(message: str, channel: str) -> str:
    """Send a notification."""
    return f"Sent '{message}' to {channel}"
await gantry.sync()
# Get tools in OpenAI-compatible format
tools = await gantry.retrieve_tools("send notification")
# Use with Mistral (OpenAI-compatible format)
response = client.chat.complete(
model="mistral-large-latest",
messages=[{"role": "user", "content": "Notify the team"}],
tools=tools
)
Groq
Package
pip install "groq>=0.13.0"
Client Initialization
from groq import Groq
client = Groq(api_key="your-api-key")
# Or with environment variable (GROQ_API_KEY)
client = Groq()
Key Methods
Chat Completions
response = client.chat.completions.create(
model="llama-3.3-70b-versatile",
messages=[
{"role": "user", "content": "Hello, Groq!"}
]
)
print(response.choices[0].message.content)
Agent-Gantry Integration
from groq import Groq
from agent_gantry import AgentGantry
client = Groq()
gantry = AgentGantry()
@gantry.register
def analyze_text(text: str) -> str:
    """Analyze text sentiment."""
    return "Positive sentiment detected"
await gantry.sync()
# Get tools in OpenAI-compatible format
tools = await gantry.retrieve_tools("analyze text")
# Use with Groq (fully OpenAI-compatible)
response = client.chat.completions.create(
model="llama-3.3-70b-versatile",
messages=[{"role": "user", "content": "Analyze: I love this!"}],
tools=tools
)
OpenRouter (and OpenAI-Compatible APIs)
OpenRouter and other OpenAI-compatible providers (DeepSeek, Perplexity, Together AI, etc.) can be used via the standard OpenAI client with a custom base_url.
Package
pip install "openai>=1.0.0"
Client Initialization
from openai import OpenAI
# OpenRouter
client = OpenAI(
api_key="your-openrouter-key",
base_url="https://openrouter.ai/api/v1"
)
# DeepSeek
client = OpenAI(
api_key="your-deepseek-key",
base_url="https://api.deepseek.com/v1"
)
# Perplexity
client = OpenAI(
api_key="your-perplexity-key",
base_url="https://api.perplexity.ai"
)
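Because these providers differ only in API key and `base_url`, the choice can be driven by a small lookup table instead of separate client blocks. The sketch below builds the keyword arguments for `OpenAI(**kwargs)`. The base URLs are the ones shown above. `OPENROUTER_API_KEY` appears in this guide's environment variable table, while `DEEPSEEK_API_KEY` and `PERPLEXITY_API_KEY` are assumed names for illustration, as is the `client_kwargs` helper.

```python
import os

# provider -> (base_url, environment variable holding the API key)
# Env var names other than OPENROUTER_API_KEY are assumptions.
COMPATIBLE_PROVIDERS = {
    "openrouter": ("https://openrouter.ai/api/v1", "OPENROUTER_API_KEY"),
    "deepseek": ("https://api.deepseek.com/v1", "DEEPSEEK_API_KEY"),
    "perplexity": ("https://api.perplexity.ai", "PERPLEXITY_API_KEY"),
}


def client_kwargs(provider: str) -> dict[str, str]:
    """Return keyword arguments suitable for `OpenAI(**kwargs)`."""
    try:
        base_url, env_var = COMPATIBLE_PROVIDERS[provider]
    except KeyError:
        raise ValueError(f"Unknown provider: {provider!r}") from None
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set {env_var} before creating the client")
    return {"api_key": api_key, "base_url": base_url}
```

With the `openai` package installed, usage would look like `client = OpenAI(**client_kwargs("openrouter"))`.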
Key Methods
# Standard OpenAI-compatible chat completions
response = client.chat.completions.create(
model="anthropic/claude-sonnet-4", # OpenRouter model format
messages=[
{"role": "user", "content": "Hello!"}
]
)
Agent-Gantry Integration
from openai import OpenAI
from agent_gantry import AgentGantry
# OpenRouter client
client = OpenAI(
api_key="your-openrouter-key",
base_url="https://openrouter.ai/api/v1"
)
gantry = AgentGantry()
@gantry.register
def web_search(query: str) -> str:
    """Search the web."""
    return f"Results for: {query}"
await gantry.sync()
# Tools work with any OpenAI-compatible provider
tools = await gantry.retrieve_tools("search the web")
response = client.chat.completions.create(
model="openai/gpt-4o",
messages=[{"role": "user", "content": "Search for Python tutorials"}],
tools=tools
)
Tool Format Conversion
Agent-Gantry provides OpenAI Chat Completions compatible tool schemas by default. Here’s how to convert them for other providers:
OpenAI Chat Completions Format (Default)
{
"type": "function",
"function": {
"name": "my_tool",
"description": "Tool description",
"parameters": {
"type": "object",
"properties": {...},
"required": [...]
}
}
}
OpenAI Responses API Format
# Use dialect="openai_responses" when retrieving tools
tools = await gantry.retrieve_tools("query", dialect="openai_responses")
# Format structure:
{
"type": "function",
"name": "my_tool",
"description": "Tool description",
"parameters": {
"type": "object",
"properties": {...},
"required": [...]
}
}
# Tool call format (from response.output):
{
"type": "function_call",
"call_id": "call_xxx",
"name": "my_tool",
"arguments": "{\"arg\": \"value\"}"
}
# Tool result format (to send back):
{
"type": "function_call_output",
"call_id": "call_xxx",
"output": "result string"
}
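The two OpenAI formats above differ only in nesting: the Responses API lifts the contents of the `function` object to the top level. For tool schemas that were obtained in Chat Completions format and cannot be re-retrieved with `dialect="openai_responses"`, a conversion along these lines should work. This is a sketch with a hypothetical helper name, assuming each `function` object holds only fields the Responses API accepts (such as `name`, `description`, and `parameters`).

```python
def to_responses_tools(chat_tools: list[dict]) -> list[dict]:
    """Flatten Chat Completions tool schemas into the Responses API shape."""
    return [{"type": "function", **tool["function"]} for tool in chat_tools]


chat_tools = [{
    "type": "function",
    "function": {
        "name": "my_tool",
        "description": "Tool description",
        "parameters": {"type": "object", "properties": {}, "required": []},
    },
}]
responses_tools = to_responses_tools(chat_tools)
```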
Anthropic Format
def to_anthropic_tools(openai_tools):
    return [
        {
            "name": tool["function"]["name"],
            "description": tool["function"]["description"],
            "input_schema": tool["function"]["parameters"]
        }
        for tool in openai_tools
    ]
Vertex AI Format
from vertexai.generative_models import FunctionDeclaration
def to_vertex_functions(openai_tools):
    return [
        FunctionDeclaration(
            name=tool["function"]["name"],
            description=tool["function"]["description"],
            parameters=tool["function"]["parameters"]
        )
        for tool in openai_tools
    ]
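The converters above index `description` and `parameters` directly, which raises `KeyError` if a tool schema omits them (both are optional in the OpenAI format, while Anthropic requires `input_schema`). Whether Agent-Gantry always emits both fields is not stated here, so the following is a defensive sketch with hypothetical helper names that fills in defaults before converting.

```python
def empty_schema() -> dict:
    """A JSON Schema for a tool that takes no arguments."""
    return {"type": "object", "properties": {}}


def normalized_function(tool: dict) -> dict:
    """Return name/description/parameters with defaults filled in."""
    fn = tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        "parameters": fn.get("parameters") or empty_schema(),
    }


def to_anthropic_tools_safe(openai_tools: list[dict]) -> list[dict]:
    """Like to_anthropic_tools, but tolerant of missing optional fields."""
    converted = []
    for tool in openai_tools:
        fn = normalized_function(tool)
        converted.append({
            "name": fn["name"],
            "description": fn["description"],
            "input_schema": fn["parameters"],
        })
    return converted
```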
Known Incompatibilities and Workarounds
1. Google GenAI vs Legacy google-generativeai
Issue: The google-generativeai package is deprecated in favor of google-genai.
Workaround: Use google-genai for new projects:
# Old (deprecated)
# import google.generativeai as genai
# New (recommended)
from google import genai
2. OpenAI Chat Completions vs Responses API
Issue: OpenAI has two APIs with different tool formats:
- Chat Completions (client.chat.completions.create): Traditional API with nested function key
- Responses API (client.responses.create): Newer API with flattened tool schema
Workaround: Agent-Gantry supports both via the dialect parameter:
# For Chat Completions API (default)
tools = await gantry.retrieve_tools("query", dialect="openai")
# For Responses API
tools = await gantry.retrieve_tools("query", dialect="openai_responses")
3. Tool Schema Differences
Issue: Different providers have slightly different tool schema formats.
Workaround: Use Agent-Gantry’s OpenAI-compatible output and transform as needed (see Tool Format Conversion section above).
4. Streaming Differences
Issue: Streaming implementations vary across providers.
Workaround: Normalize streaming handling in your application layer:
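# OpenAI/Azure/Groq (pass stream=True to chat.completions.create).
# Mistral streams via client.chat.stream and wraps each chunk in .data.
def openai_style_text(stream):
    for chunk in stream:
        # Some chunks carry no choices, so guard before indexing
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content
# Anthropic (pass stream=True to messages.create)
def anthropic_text(stream):
    for event in stream:
        if event.type == "content_block_delta" and event.delta.type == "text_delta":
            yield event.delta.text
# Google GenAI (client.models.generate_content_stream)
def genai_text(stream):
    for chunk in stream:
        if chunk.text:
            yield chunk.text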
Environment Variables
| Provider | Environment Variable |
|---|---|
| OpenAI | OPENAI_API_KEY |
| Azure OpenAI | AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT |
| Anthropic | ANTHROPIC_API_KEY |
| Google GenAI | GOOGLE_API_KEY |
| Google Vertex AI | GOOGLE_APPLICATION_CREDENTIALS (service account) |
| Mistral | MISTRAL_API_KEY |
| Groq | GROQ_API_KEY |
| OpenRouter | OPENROUTER_API_KEY |
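A start-up check against the table above can surface missing credentials before the first API call fails. This is a minimal sketch. The provider keys and the `missing_env` helper are hypothetical names, and the variable lists simply mirror the table.

```python
import os
from collections.abc import Mapping

# Mirrors the environment variable table above.
REQUIRED_ENV = {
    "openai": ["OPENAI_API_KEY"],
    "azure-openai": ["AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT"],
    "anthropic": ["ANTHROPIC_API_KEY"],
    "google-genai": ["GOOGLE_API_KEY"],
    "google-vertexai": ["GOOGLE_APPLICATION_CREDENTIALS"],
    "mistral": ["MISTRAL_API_KEY"],
    "groq": ["GROQ_API_KEY"],
    "openrouter": ["OPENROUTER_API_KEY"],
}


def missing_env(provider: str, environ: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset or empty for `provider`."""
    return [name for name in REQUIRED_ENV[provider] if not environ.get(name)]
```

For example, `missing_env("azure-openai")` returns an empty list only when both the key and the endpoint are set.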
Best Practices
- Use Agent-Gantry for Tool Management: Let Agent-Gantry handle tool registration, semantic routing, and execution while using LLM providers for inference.
- Prefer OpenAI-Compatible Format: Agent-Gantry outputs OpenAI-compatible tool schemas which work with most providers directly.
- Feature Detection: Check for provider-specific features before using them to maintain portability.
- Error Handling: Implement proper error handling for API calls, as error formats vary across providers.
- Rate Limiting: Be aware of different rate limits across providers and implement appropriate backoff strategies.
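As one illustration of the backoff advice, the sketch below retries a callable with exponential backoff plus jitter. The `with_backoff` helper and its parameters are hypothetical. In practice `retry_on` would be narrowed to the provider's rate-limit exception (for example `openai.RateLimitError` or `anthropic.RateLimitError`), and some SDK clients also offer a built-in `max_retries` option that may be enough on its own.

```python
import random
import time


def with_backoff(call, *, retries=4, base_delay=0.5, max_delay=8.0,
                 retry_on=(Exception,), sleep=time.sleep):
    """Call `call()` and retry on `retry_on` with exponential backoff and jitter."""
    for attempt in range(retries + 1):
        try:
            return call()
        except retry_on:
            if attempt == retries:
                raise  # out of retries: surface the last error
            # Delay doubles each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter of up to 50% spreads out retries from concurrent clients.
            sleep(delay + random.uniform(0, delay / 2))
```

A call site might look like `with_backoff(lambda: client.chat.completions.create(...), retry_on=(RateLimitError,))`, with the exception class imported from the SDK in use.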