Skip to main content

Dynamic MCP Server Selection

Status: βœ… Functional (using pseudo-tool storage approach)

Overview

Dynamic MCP Server Selection enables Agent-Gantry to intelligently route queries to the most relevant MCP servers using semantic search, similar to how tools are selected. This feature allows you to register multiple MCP servers with rich metadata and have Agent-Gantry automatically determine which servers to connect to based on the user’s query.

Implementation Note: MCP servers are stored as pseudo-tools in the vector store with special metadata tags for entity type discrimination. This approach allows us to leverage the existing vector store infrastructure while maintaining full functionality. A future enhancement could add native multi-entity support to the vector store layer.

Key Benefits

🎯 Intelligent Server Selection

  • Semantic Routing: Uses vector embeddings to find servers relevant to the query
  • Context-Aware: Considers tags, examples, capabilities, and descriptions
  • Flexible Filtering: Filter by capabilities, namespaces, or health status

⚑ Performance & Efficiency

  • Lazy Loading: Connects to servers only when needed
  • No Upfront Cost: Register servers without immediate connections
  • Reduced Initialization Time: Skip connecting to unused servers

πŸ”’ Security & Resource Management

  • Minimal Attack Surface: Only connects to necessary servers
  • Health Tracking: Monitors server availability and connection success
  • Capability-Based Access: Ensures servers have required capabilities

🧩 Plug & Play Philosophy

  • Simple Registration: Register servers with intuitive metadata
  • Backward Compatible: Existing add_mcp_server() continues to work
  • Consistent API: Follows same patterns as tool registration

Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    User Query                                β”‚
β”‚              "I need to read a file"                         β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                       β”‚
                       β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              retrieve_mcp_servers()                          β”‚
β”‚           (Semantic Search on Metadata)                      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                       β”‚
                       β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚           Ranked MCP Servers                                 β”‚
β”‚    1. filesystem (0.92 relevance)                           β”‚
β”‚    2. storage (0.78 relevance)                              β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                       β”‚
                       β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚        discover_tools_from_server()                         β”‚
β”‚          (Lazy Connect & Import)                            β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                       β”‚
                       β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              Available Tools                                 β”‚
β”‚       - read_file                                           β”‚
β”‚       - write_file                                          β”‚
β”‚       - list_directory                                      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Quick Start

1. Register MCP Servers

Register servers with rich metadata instead of immediately connecting:

from agent_gantry import AgentGantry

gantry = AgentGantry()

# Register filesystem server
gantry.register_mcp_server(
    name="filesystem",
    command=["npx", "-y", "@modelcontextprotocol/server-filesystem"],
    description="Provides tools for reading and writing files on the local filesystem",
    args=["--path", "/home/user/documents"],
    tags=["filesystem", "files", "io", "local"],
    examples=[
        "read a file",
        "write to a file",
        "list directory contents",
    ],
    capabilities=["read_files", "write_files", "list_directory"],
)

# Register database server
gantry.register_mcp_server(
    name="postgresql",
    command=["python", "-m", "mcp_postgresql"],
    description="Access PostgreSQL databases for querying and data manipulation",
    args=["--connection-string", "postgresql://localhost/mydb"],
    tags=["database", "sql", "data"],
    examples=["query database", "insert record", "update data"],
    capabilities=["read_data", "write_data", "execute_sql"],
)

2. Sync Server Metadata

Sync server metadata to enable semantic search:

# Embed server metadata for semantic routing
await gantry.sync_mcp_servers()

3. Discover Relevant Servers

Find servers relevant to your query:

# Semantic search for relevant servers
servers = await gantry.retrieve_mcp_servers(
    query="I need to read a configuration file",
    limit=2,
    score_threshold=0.5,
)

for server in servers:
    print(f"Server: {server.name}")
    print(f"Description: {server.description}")
    print(f"Capabilities: {server.capabilities}")

4. Load Tools On-Demand

Connect to selected servers and load their tools:

# Discover tools from the selected server
count = await gantry.discover_tools_from_server("filesystem")
print(f"Discovered {count} tools from filesystem server")

# Now use the tools
tools = await gantry.retrieve_tools("read my config.yaml")
result = await gantry.execute(...)

Complete Example

import asyncio
from agent_gantry import AgentGantry

async def main():
    # Initialize
    gantry = AgentGantry()
    
    # Register servers (no immediate connection)
    gantry.register_mcp_server(
        name="filesystem",
        command=["npx", "-y", "@modelcontextprotocol/server-filesystem"],
        description="Local filesystem operations",
        args=["--path", "/tmp"],
        tags=["files", "io"],
        examples=["read file", "write file"],
        capabilities=["read_files", "write_files"],
    )
    
    gantry.register_mcp_server(
        name="api_client",
        command=["node", "mcp-rest-api.js"],
        description="REST API client for external services",
        tags=["api", "http", "rest"],
        examples=["call API", "fetch data"],
        capabilities=["network_access", "external_api"],
    )
    
    # Sync for semantic search
    await gantry.sync_mcp_servers()
    
    # User query determines which servers to use
    user_query = "I need to fetch data from an API"
    
    # Find relevant servers
    servers = await gantry.retrieve_mcp_servers(user_query, limit=1)
    
    if servers:
        # Connect and load tools only from relevant server
        server = servers[0]
        print(f"Selected: {server.name}")
        count = await gantry.discover_tools_from_server(server.name)
        print(f"Loaded {count} tools")
    
    # Continue with tool retrieval and execution
    tools = await gantry.retrieve_tools(user_query)
    # ... use tools

asyncio.run(main())

API Reference

register_mcp_server()

Register an MCP server with metadata for semantic routing.

gantry.register_mcp_server(
    name: str,                      # Unique server name
    command: list[str],             # Command to start server
    description: str,               # What the server provides (min 10 chars)
    namespace: str = "default",     # Namespace for organization
    args: list[str] | None = None,  # Command-line arguments
    env: dict[str, str] | None = None,  # Environment variables
    tags: list[str] | None = None,  # Categorization tags
    examples: list[str] | None = None,  # Example queries
    capabilities: list[str] | None = None,  # Server capabilities
) -> None

Parameters:

  • name: Unique identifier for the server
  • command: Command to execute the MCP server process
  • description: Detailed description of server functionality (used for semantic search)
  • namespace: Logical grouping namespace (default: β€œdefault”)
  • args: Additional command-line arguments to pass
  • env: Environment variables for the server process
  • tags: List of tags for categorization and filtering
  • examples: Example queries that this server handles well
  • capabilities: List of capabilities (e.g., β€œread_files”, β€œwrite_data”)

Example:

gantry.register_mcp_server(
    name="weather_api",
    command=["python", "-m", "mcp_weather"],
    description="Access weather data from multiple providers including forecasts and alerts",
    namespace="apis",
    args=["--api-key", "xyz"],
    env={"WEATHER_PROVIDER": "openweathermap"},
    tags=["weather", "api", "forecast"],
    examples=["get weather", "forecast tomorrow", "weather alerts"],
    capabilities=["read_weather", "forecasts"],
)

retrieve_mcp_servers()

Find relevant MCP servers using semantic search.

await gantry.retrieve_mcp_servers(
    query: str,                          # Search query
    limit: int = 3,                      # Max servers to return
    score_threshold: float | None = None,  # Min similarity score
    namespaces: list[str] | None = None,  # Filter by namespaces
) -> list[MCPServerDefinition]

Parameters:

  • query: Natural language query describing needed functionality
  • limit: Maximum number of servers to return (default: 3)
  • score_threshold: Minimum similarity score (0-1) to include a server
  • namespaces: Filter results to specific namespaces

Returns: List of MCPServerDefinition objects, ordered by relevance

Example:

servers = await gantry.retrieve_mcp_servers(
    query="send email with attachments",
    limit=2,
    score_threshold=0.6,
    namespaces=["communication"],
)

discover_tools_from_server()

Connect to a server and load its tools on-demand.

await gantry.discover_tools_from_server(
    server_name: str,             # Server name
    namespace: str = "default",   # Server namespace
) -> int

Parameters:

  • server_name: Name of the registered MCP server
  • namespace: Namespace of the server (default: β€œdefault”)

Returns: Number of tools discovered and registered

Raises:

  • ValueError: If the server is not registered

Example:

count = await gantry.discover_tools_from_server("filesystem")
print(f"Loaded {count} tools from filesystem server")

sync_mcp_servers()

Sync server metadata to vector store for semantic search.

await gantry.sync_mcp_servers(
    batch_size: int = 100,   # Batch size for embedding
    force: bool = False,     # Force re-embedding
) -> int

Parameters:

  • batch_size: Number of servers to process per batch (default: 100)
  • force: Re-embed all servers regardless of changes (default: False)

Returns: Number of servers synced

Example:

synced = await gantry.sync_mcp_servers()
print(f"Synced {synced} servers")

Metadata Schema

MCPServerDefinition

Server metadata model for semantic routing:

class MCPServerDefinition(BaseModel):
    # Identity
    name: str                           # Server name
    namespace: str = "default"          # Namespace
    
    # Discovery (used for semantic search)
    description: str                    # Detailed description (min 10 chars)
    extended_description: str | None    # Additional details
    tags: list[str] = []               # Categorization tags
    examples: list[str] = []           # Example queries
    
    # Connection
    command: list[str]                 # Start command
    args: list[str] = []              # Command arguments
    env: dict[str, str] = {}          # Environment variables
    
    # Capabilities & Cost
    capabilities: list[str] = []       # Server capabilities
    cost: MCPServerCost                # Cost/latency model
    
    # Health (runtime)
    health: MCPServerHealth            # Health metrics
    
    # Metadata
    metadata: dict[str, Any] = {}     # Additional metadata
    created_at: datetime               # Registration time
    deprecated: bool = False           # Deprecation flag

Key Fields for Semantic Search:

  • description: Primary text for semantic matching
  • tags: Keywords for categorization and filtering
  • examples: Sample queries the server handles well
  • capabilities: Functional capabilities for filtering

Comparison: Traditional vs Dynamic

Traditional Approach (add_mcp_server)

# Connect to ALL servers at startup
count1 = await gantry.add_mcp_server(config1)
count2 = await gantry.add_mcp_server(config2)
count3 = await gantry.add_mcp_server(config3)
# All 3 servers connected, all tools loaded upfront

Characteristics:

  • ❌ Connects to all servers immediately
  • ❌ Loads all tools from all servers
  • ❌ High initialization cost
  • ❌ No semantic server selection
  • βœ… All tools always available

Dynamic Approach (register_mcp_server)

# Register servers (no connection)
gantry.register_mcp_server(name="server1", ...)
gantry.register_mcp_server(name="server2", ...)
gantry.register_mcp_server(name="server3", ...)
await gantry.sync_mcp_servers()

# Find relevant servers for query
servers = await gantry.retrieve_mcp_servers(query)

# Connect only to selected server
await gantry.discover_tools_from_server(servers[0].name)

Characteristics:

  • βœ… No upfront connections
  • βœ… Loads tools only when needed
  • βœ… Fast initialization
  • βœ… Semantic server selection
  • βœ… Minimal resource usage

Use Cases

1. Multi-Domain Agent Systems

Register servers for different domains and let semantic routing select the right one:

# Register domain-specific servers
gantry.register_mcp_server(name="filesystem", ...)
gantry.register_mcp_server(name="database", ...)
gantry.register_mcp_server(name="email", ...)
gantry.register_mcp_server(name="calendar", ...)

# Query determines which domain to use
servers = await gantry.retrieve_mcp_servers("schedule a meeting")
# β†’ Returns calendar server

2. Resource-Constrained Environments

Minimize connections and memory usage:

# Register 20+ servers
for server_config in all_servers:
    gantry.register_mcp_server(**server_config)

# Only connect to 2-3 most relevant servers
servers = await gantry.retrieve_mcp_servers(user_query, limit=2)
for server in servers:
    await gantry.discover_tools_from_server(server.name)

3. Capability-Based Selection

Filter servers by required capabilities:

# Find servers with specific capabilities
from agent_gantry.core.mcp_router import MCPRouter

servers = await gantry.retrieve_mcp_servers("data operations")
filtered = await gantry._mcp_router.filter_by_capabilities(
    servers,
    required_capabilities=["write_data", "execute_sql"]
)

4. Health-Aware Routing

Avoid unavailable servers automatically:

# Retrieval automatically excludes unhealthy servers
servers = await gantry.retrieve_mcp_servers(query)
# Only returns servers with health.available == True

Advanced Features

Health Tracking

Server health is automatically tracked:

# Access server health
server = gantry._mcp_registry.get_server("filesystem")
if server:
    print(f"Success rate: {server.health.success_rate}")
    print(f"Available: {server.health.available}")
    print(f"Consecutive failures: {server.health.consecutive_failures}")

Manual Health Updates

# Update server health manually
gantry._mcp_registry.update_health(
    "filesystem",
    namespace="default",
    available=False,
    consecutive_failures=3,
)

Lazy Client Management

Clients are created only when needed:

# Get or create client for a server
client = gantry._mcp_registry.get_client("filesystem")
# First call creates client, subsequent calls return same instance

Server Listing

# List all registered servers
all_servers = gantry._mcp_registry.list_servers()

# Filter by namespace
ns_servers = gantry._mcp_registry.list_servers(namespace="apis")

# Check server count
count = gantry._mcp_registry.server_count

Future Enhancements

Phase 5: Vector Store Integration

Current Status: Placeholder implementation
Goal: Full semantic search with vector embeddings

What’s Needed:

  1. Extend vector store to support multiple entity types (tools + servers)
  2. Implement metadata-based filtering in vector store
  3. Complete MCPRouter.route() with actual search results

When Complete:

# Will work with actual semantic search
servers = await gantry.retrieve_mcp_servers("read files")
# Returns servers ranked by semantic similarity

Planned Features

  • Auto-discovery: Automatically detect available MCP servers
  • Connection Pooling: Reuse connections across queries
  • Hybrid Mode: Mix static and dynamic server loading
  • Server Dependencies: Express dependencies between servers
  • Cost Optimization: Prefer low-latency servers

Troubleshooting

Server Not Found

# Check if server is registered
server = gantry._mcp_registry.get_server("myserver")
if not server:
    print("Server not registered!")

Connection Failures

try:
    count = await gantry.discover_tools_from_server("myserver")
except Exception as e:
    print(f"Failed to connect: {e}")
    # Check server health
    server = gantry._mcp_registry.get_server("myserver")
    if server and not server.health.available:
        print("Server marked as unavailable")

No Servers Returned

servers = await gantry.retrieve_mcp_servers("query", score_threshold=0.8)
if not servers:
    # Try lower threshold
    servers = await gantry.retrieve_mcp_servers("query", score_threshold=0.5)

Best Practices

  1. Rich Descriptions: Provide detailed, keyword-rich descriptions
  2. Relevant Examples: Include example queries that match your use cases
  3. Meaningful Tags: Use consistent, searchable tags
  4. Capability Modeling: Define capabilities that match your security model
  5. Health Monitoring: Check server health before critical operations
  6. Sync Regularly: Call sync_mcp_servers() after batch registrations
  7. Namespace Organization: Group related servers in namespaces
  8. Command Security: ⚠️ CRITICAL - Only use trusted commands and arguments
    • Never pass unsanitized user input to command or args fields
    • Validate all inputs to prevent command injection attacks
    • Consider using allowlists for approved commands
  9. Environment Variables: Mark sensitive data in env as confidential
    • Avoid logging environment variables that contain secrets
    • Use secure credential management systems when possible
  10. Timeout Configuration: Use appropriate timeouts for discover_tools_from_server()
    • Default is 30 seconds; adjust based on your server’s typical response time
    • Monitor timeout failures and adjust as needed

Migration Guide

From add_mcp_server() to register_mcp_server()

Before:

config = MCPServerConfig(
    name="filesystem",
    command=["npx", "-y", "@modelcontextprotocol/server-filesystem"],
    args=["--path", "/tmp"],
    namespace="default",
)
count = await gantry.add_mcp_server(config)

After:

gantry.register_mcp_server(
    name="filesystem",
    command=["npx", "-y", "@modelcontextprotocol/server-filesystem"],
    description="Local filesystem operations for reading and writing files",
    args=["--path", "/tmp"],
    namespace="default",
    tags=["filesystem", "files"],
    examples=["read file", "write file"],
    capabilities=["read_files", "write_files"],
)
await gantry.sync_mcp_servers()

# Later, discover tools on-demand
servers = await gantry.retrieve_mcp_servers("need to read a file")
if servers:
    await gantry.discover_tools_from_server(servers[0].name)

Benefits:

  • No upfront connection overhead
  • Semantic server selection
  • Better resource management

Note: Both methods work and can be used together!

See Also