The OpenRouter MCP Server

OpenRouter ·

The OpenRouter MCP Server
On this page

Your coding agent is incredible at writing code.

But when it comes to choosing the right model for, say, coding without blowing through your monthly budget in one day, or the best model for designing a landing page, it really struggles.

Your agent can make an approximate guess of the “best” model, but it’s guessing from training data that is months stale, with no knowledge of how much it costs, how well it performs for a given task, which provider you should pin it to, etc.

No more.

Today, we’re very excited to announce the release of the OpenRouter MCP.

The OpenRouter MCP server puts live model data, benchmark rankings, pricing, docs, and test inference directly to help you and your agent to make the right decisions on the best model to use. Install in one command, and your favorite agent can answer “which model is the best at coding without bankrupting me” with the most up-to-date data Artificial Analysis, Design Arena, and OpenRouter’s own model rankings. Hint: it’s GLM-5.2.

Connect now | Docs

Install in one command

Claude Code:

claude mcp add --transport http openrouter https://mcp.openrouter.ai/mcp
claude mcp login openrouter

Codex CLI:

codex mcp add openrouter --url https://mcp.openrouter.ai/mcp
codex mcp login openrouter

Cursor: Add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "openrouter": { "url": "https://mcp.openrouter.ai/mcp" }
  }
}

See the connect guide for OAuth login and every supported client.

Pick the right model without tab-switching

You’re building a feature that needs structured JSON output, and you want a model that’s fast, cheap, and actually good at it. Normally you’d open the OpenRouter website, browse the model list, compare benchmarks, check pricing, maybe run a few test prompts in the playground. That’s 15 minutes of context-switching before you write a single line of integration code.

With the MCP server connected, you can do this all in your coding agent:

You: "I need a model for structured JSON extraction from legal documents.
     Fast, under $1/M input tokens, good at following schemas."

Agent: [calls models-list with filters] → [calls benchmarks] → [calls model-endpoints]

Agent: "google/gemini-3-flash-preview fits well: $0.10/M input,
       138k context, strong structured output support. Here's the
       endpoint with the lowest latency..."

The agent pulls from the live model catalog, cross-references Artificial Analysis intelligence scores and Design Arena ELO rankings, and checks per-provider pricing and latency. The recommendation is fully grounded in current data, not whatever was true when the model was last trained 6 months ago.

Test before you commit

chat-send lets your agent fire off a test prompt to any model and see the response, cost, and which provider served it. Your agent can compare answers across models side-by-side:

You: "Compare how Claude Opus 4.8, GPT-5.5, and DeepSeek V4 Pro
     handle this structured extraction prompt."

Agent: [sends the same prompt to all three via chat-send]
       [calls generation-get for each to get cost breakdowns]

Agent: "All three produced valid JSON. Opus 4.8 nailed the edge
       case in row 12. GPT-5.5 was 40% cheaper. DeepSeek V4 Pro
       was fastest at 180ms TTFB."

Model slugs support suffixes: :online for web search, :nitro for speed, :floor for the lowest price, :free for free endpoints. Your agent can test across variants without you memorizing the syntax.

Search the docs without leaving your editor

Your agent has docs-search, which runs a full-text search across the OpenRouter documentation. “How do I pin a model to a specific provider?” “What’s the format for tool calling?” “How does prompt caching work?” Your agent finds the answer and applies it, all in one flow.

This is where the MCP server earns its keep as a development assistant. Your agent can look up the exact API parameter it needs, check the right request format, and wire it into your code without you having to find and read the docs page yourself.

A dedicated, capped key

The server is remote (nothing installed locally), and the first login runs an OAuth flow that mints a dedicated API key with a 7-day expiry and a $10 spend cap (editable on the approval screen). It’s separate from your other keys and shows up on your keys dashboard. You can revoke it any time.

See the connect guide for setup in OpenCode, Claude Desktop, and every other supported client.

What’s in the toolbox

ToolWhat it does
models-listSearch the live model catalog with filters: price range, context length, modality, provider, model family, and more
model-getFull details for one model: capabilities, pricing, context window, supported parameters
model-endpointsPer-provider breakdown: price, latency, throughput, data policy
benchmarksThird-party quality scores from Artificial Analysis and Design Arena
rankings-dailyWhich models are most used and trending by token volume
chat-sendSend a test prompt to any model, get the response and cost
generation-getCost, token counts, and serving provider for a specific generation
docs-searchFull-text search across OpenRouter docs
credits-getYour remaining account credit
providers-listAvailable providers for routing preferences
app-rankingsWhich apps drive the most OpenRouter traffic, by category

All tools except chat-send are read-only lookups. chat-send makes a billable inference call using your MCP key’s balance.

FAQ

Does this replace the OpenRouter API?

No. The MCP server is a development assistant for your coding agent. It pulls live OpenRouter data and can send test messages so your agent makes informed decisions while you build. Your app should still call the OpenRouter API directly.

How does authentication work?

Your MCP client triggers an OAuth flow that opens an OpenRouter consent page in your browser. You approve a dedicated API key with a 7-day expiry and a $10 spend cap. The key is separate from your other keys and can be disconnected anytime from your dashboard.

Does my source code get sent anywhere?

No. The tools are read-only lookups against the OpenRouter API. The only exception is chat-send, which sends the message you explicitly pass to it to a model. No source code leaves your machine unless you include it in a chat-send call.


Try it now: connect your agent and ask “what’s the best model for my use case?”