One API endpoint to route requests across hundreds of AI models

OpenRouter is a model gateway that exposes many third-party AI models through one OpenAI-compatible API. Teams can compare providers, set routing preferences, and switch models without rewriting core client logic for each vendor SDK. The service publishes per-model pricing and supports pay-as-you-go usage.

Category Developer Tools

Pricing Free tier + Pay-as-you-go

Platforms Web / API

llm-gatewayapirouting

Use cases

Benchmarking prompts across multiple model vendors with one integration
Failing over to alternate providers when one endpoint is degraded
Controlling model spend by selecting cheaper routes per task
Shipping prototypes quickly without committing to a single model vendor
Running internal eval pipelines over a shared model catalog

Key features

OpenAI-compatible API endpoint for model calls
Model catalog spanning text, image, and other modalities
Per-model pricing visibility before sending requests
Provider routing controls for latency, cost, and availability
Single-key integration that reduces per-vendor SDK wiring

Who Is It For?

Application developers shipping multi-model AI products
Platform engineers responsible for LLM reliability and cost controls
Startups that need flexible model sourcing during rapid iteration

Frequently Asked Questions

Does OpenRouter expose only one model family?: No. OpenRouter lists models from many providers and lets you call them through one API surface.
Can I use OpenAI SDK-style calls with OpenRouter?: OpenRouter documents an OpenAI-compatible API, so existing OpenAI-style client patterns can often be adapted with endpoint and key changes.
How is pricing presented?: OpenRouter publishes pricing by model on its pricing and models pages; actual cost depends on the model and token usage you select.

3 Indexed items

Groq Cloud API

Developer ToolsFree tier + Pay-as-you-go (published USD rates)

GroqCloud exposes hosted language, speech, and compound workloads through Groq’s HTTP APIs. Documentation highlights compatibility with OpenAI client libraries when you point `base_url` at Groq’s OpenAI-compatible endpoint and supply a Groq API key, alongside first-party Groq SDKs for Python and JavaScript. Pricing pages publish per-model token rates (USD) for on-demand inference.

Replicate

Developer ToolsPay-per-prediction billing + prepaid credits (see Replicate billing docs)

Replicate is a hosted platform for executing third-party and custom machine-learning models over HTTP without provisioning GPUs yourself. Official documentation explains how to authenticate with API tokens, create asynchronous predictions, stream outputs, retrieve model metadata, wire webhooks for completion events, and optionally deploy or fine-tune checkpoints (for example FLUX image workflows) published to the Replicate catalog.

Together AI

Developer ToolsUsage-based inference + optional dedicated endpoints / fine-tuning (see Together pricing docs)

Together AI operates a developer platform for running prominent open-source and vendor-weight models from Together-hosted GPUs. Documentation centers on issuing API keys, installing the Together Python (`together`) or npm (`together-ai`) SDKs, or calling HTTPS endpoints such as `https://api.together.ai/v1/chat/completions` with Bearer authentication. Guides cover streaming chat completions, function calling, structured outputs, model catalog browsing, GPU reservations for steady traffic, and fine-tuning or dedicated cluster offerings published in the broader docs hierarchy.

OpenRouter