openrouter
OpenRouter unified API for 400+ models — chat completions, streaming, tool calling, structured output, embeddings, multimodal.
| name | openrouter |
| description | OpenRouter unified API for 400+ models — chat completions, streaming, tool calling, structured output, embeddings, multimodal. |
Expert guidance for AI agents integrating with OpenRouter API - unified access to 400+ models from 90+ providers.
When to use this skill:
Endpoint: POST https://openrouter.ai/api/v1/chat/completions
Headers (required):
{
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
// Optional: for app attribution
'HTTP-Referer': 'https://your-app.com',
'X-Title': 'Your App Name'
}
Minimal request structure:
const response = await fetch('https://openrouter.ai/api/v1/chat/completions', {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'anthropic/claude-3.5-sonnet',
messages: [
{ role: 'user', content: 'Your prompt here' }
]
})
});
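The endpoint and headers above can be wrapped in a small helper so every call builds its request the same way. This is a minimal sketch rather than an official client: `buildChatRequest` and its `attribution` option are hypothetical names introduced here for illustration.

```javascript
// Hypothetical helper: assembles the URL and fetch options for a chat completion.
const OPENROUTER_URL = 'https://openrouter.ai/api/v1/chat/completions';

function buildChatRequest(apiKey, body, attribution = {}) {
  if (!apiKey) throw new Error('Missing OpenRouter API key');
  const headers = {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json',
  };
  // Optional app attribution headers, only sent when provided
  if (attribution.referer) headers['HTTP-Referer'] = attribution.referer;
  if (attribution.title) headers['X-Title'] = attribution.title;
  return {
    url: OPENROUTER_URL,
    init: { method: 'POST', headers, body: JSON.stringify(body) },
  };
}
```

The result can be passed straight to `fetch(url, init)`. Keeping the builder free of network calls makes it easy to unit test.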
Non-streaming response:
{
"id": "gen-abc123",
"choices": [{
"message": {
"role": "assistant",
"content": "Response text here"
},
"finish_reason": "stop"
}],
"usage": {
"prompt_tokens": 10,
"completion_tokens": 20,
"total_tokens": 30
},
"model": "anthropic/claude-3.5-sonnet"
}
Key fields:
choices[0].message.content - The assistant's response
choices[0].finish_reason - Why generation stopped (stop, length, tool_calls, etc.)
usage - Token counts and cost information
model - Actual model used (may differ from requested)
Use streaming (stream: true) when:
Use non-streaming when:
Streaming basics:
const response = await fetch('https://openrouter.ai/api/v1/chat/completions', {
  method: 'POST',
  headers: { /* ... */ },
  body: JSON.stringify({
    model: 'anthropic/claude-3.5-sonnet',
    messages: [{ role: 'user', content: '...' }],
    stream: true
  })
});
const decoder = new TextDecoder();
let buffer = '';
// Async iteration over response.body works in Node 18+; use getReader() elsewhere
for await (const chunk of response.body) {
  // stream: true keeps multi-byte characters intact across chunk boundaries
  buffer += decoder.decode(chunk, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop(); // Keep the unfinished last line for the next chunk
  for (const line of lines) {
    if (!line.startsWith('data: ')) continue; // Skip blank and comment lines
    const data = line.slice(6); // Remove 'data: '
    if (data === '[DONE]') break;
    const parsed = JSON.parse(data);
    const content = parsed.choices?.[0]?.delta?.content;
    if (content) {
      // Accumulate or display content
    }
  }
}
Format: provider/model-name[:variant]
Examples:
anthropic/claude-3.5-sonnet - Specific model
openai/gpt-4o:online - With web search enabled
google/gemini-2.0-flash:free - Free tier variant

| Variant | Use When | Tradeoffs |
|---|---|---|
| :free | Cost is primary concern, testing, prototyping | Rate limits, lower quality models |
| :online | Need current information, real-time data | Higher cost, web search latency |
| :extended | Large context window needed | May be slower, higher cost |
| :thinking | Complex reasoning, multi-step problems | Higher token usage, slower |
| :nitro | Speed is critical | May have quality tradeoffs |
| :exacto | Tool-calling accuracy matters | Routes to a curated subset of providers, so availability may be lower |
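To make the `provider/model-name[:variant]` format concrete, here is a small sketch that splits a model ID into its parts. `parseModelId` and `KNOWN_VARIANTS` are hypothetical names for this example, and the variant set simply mirrors the table above, so treat it as illustrative rather than exhaustive.

```javascript
// Variants listed in the table above; illustrative, not an exhaustive list.
const KNOWN_VARIANTS = new Set(['free', 'online', 'extended', 'thinking', 'nitro', 'exacto']);

// Hypothetical helper: splits "provider/model-name[:variant]" into its parts.
function parseModelId(modelId) {
  const slash = modelId.indexOf('/');
  if (slash <= 0 || slash === modelId.length - 1) {
    throw new Error(`Expected provider/model-name[:variant], got "${modelId}"`);
  }
  const provider = modelId.slice(0, slash);
  const rest = modelId.slice(slash + 1);
  // The variant, if any, follows the last colon
  const colon = rest.lastIndexOf(':');
  const name = colon === -1 ? rest : rest.slice(0, colon);
  const variant = colon === -1 ? null : rest.slice(colon + 1);
  return {
    provider,
    name,
    variant,
    knownVariant: variant === null || KNOWN_VARIANTS.has(variant),
  };
}
```

A check like this can catch typos such as a missing provider prefix before a request is sent.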
General purpose: anthropic/claude-3.5-sonnet or openai/gpt-4o
Coding: anthropic/claude-3.5-sonnet or openai/gpt-4o
Complex reasoning: anthropic/claude-opus-4:thinking or openai/o3
Fast responses: openai/gpt-4o-mini:nitro or google/gemini-2.0-flash
Cost-sensitive: google/gemini-2.0-flash:free or meta-llama/llama-3.1-70b:free
Current information: anthropic/claude-3.5-sonnet:online or google/gemini-2.5-pro:online
Large context: anthropic/claude-3.5-sonnet:extended or google/gemini-2.5-pro:extended
Default behavior: OpenRouter automatically selects best provider
Explicit provider order:
{
provider: {
order: ['anthropic', 'openai', 'google'],
allow_fallbacks: true,
sort: 'price' // 'price', 'latency', or 'throughput'
}
}
When to set provider order:
Automatic fallback - try multiple models in order:
{
models: [
'anthropic/claude-3.5-sonnet',
'openai/gpt-4o',
'google/gemini-2.0-flash'
]
}
When to use fallbacks:
Fallback behavior:
The model field of the response reports which model actually served the request.
model (string, optional)
messages (Message[], required)
{ role: 'user'|'assistant'|'system', content: string | ContentPart[] }
stream (boolean, default: false)
temperature (float, 0.0-2.0, default: 1.0)
max_tokens (integer, optional)
top_p (float, 0.0-1.0, default: 1.0)
top_k (integer, 0+, default: 0/disabled)
For code generation: temperature: 0.1-0.3, top_p: 0.95
For factual responses: temperature: 0.0-0.2
For creative writing: temperature: 0.8-1.2
For brainstorming: temperature: 1.0-1.5
For chat: temperature: 0.6-0.8
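The temperature guidance above can be captured as named presets. The sketch below picks one value from inside each suggested range; the specific numbers, the task names, and the `withSampling` helper are assumptions for illustration and should be tuned per model.

```javascript
// Illustrative presets chosen from inside the ranges above; tune per model and task.
const SAMPLING_PRESETS = {
  code: { temperature: 0.2, top_p: 0.95 },
  factual: { temperature: 0.1 },
  chat: { temperature: 0.7 },
  creative: { temperature: 1.0 },
  brainstorm: { temperature: 1.2 },
};

// Hypothetical helper: merges a preset into a request body without mutating it.
function withSampling(body, task) {
  const preset = SAMPLING_PRESETS[task];
  if (!preset) throw new Error(`Unknown task "${task}"`);
  return { ...body, ...preset };
}
```

For example, `withSampling({ model, messages }, 'code')` returns a new body with `temperature: 0.2` and `top_p: 0.95` added.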
tools (Tool[], default: [])
{
type: 'function',
function: {
name: 'function_name',
description: 'What it does',
parameters: { /* JSON Schema */ }
}
}
tool_choice (string | object, default: 'auto')
'auto': Model decides (default)
'none': Never call tools
'required': Must call a tool
{ type: 'function', function: { name: 'specific_tool' } }: Force specific tool
parallel_tool_calls (boolean, default: true)
Set to false for sequential execution
When to use tools:
response_format (object, optional)
JSON object mode:
{ type: 'json_object' }
JSON Schema mode (strict):
{
type: 'json_schema',
json_schema: {
name: 'schema_name',
strict: true,
schema: { /* JSON Schema */ }
}
}
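As a concrete instance of JSON Schema mode, the sketch below fills in the placeholder schema and parses the returned content defensively. The `weather_report` schema, its field names, and the `parseStructured` helper are hypothetical; the check covers only required keys, and a full schema validator would be more thorough.

```javascript
// Hypothetical schema for illustration: a weather report with two required fields.
const weatherFormat = {
  type: 'json_schema',
  json_schema: {
    name: 'weather_report',
    strict: true,
    schema: {
      type: 'object',
      properties: {
        city: { type: 'string' },
        temperature_c: { type: 'number' },
      },
      required: ['city', 'temperature_c'],
      additionalProperties: false,
    },
  },
};

// Parses the assistant's content and checks that the required keys are present.
function parseStructured(content, format) {
  const value = JSON.parse(content); // throws SyntaxError on malformed JSON
  if (value === null || typeof value !== 'object') {
    throw new Error('Expected a JSON object');
  }
  const { required = [] } = format.json_schema.schema;
  const missing = required.filter(key => !(key in value));
  if (missing.length > 0) {
    throw new Error(`Missing required fields: ${missing.join(', ')}`);
  }
  return value;
}
```

Pass `weatherFormat` as `response_format` in the request body, then call `parseStructured(data.choices[0].message.content, weatherFormat)` on the response.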
When to use structured outputs:
Enable via model variant (simplest):
{ model: 'anthropic/claude-3.5-sonnet:online' }
Enable via plugin:
{
plugins: [{
id: 'web',
enabled: true,
max_results: 5
}]
}
When to use web search:
user (string, optional)
session_id (string, optional)
metadata (Record<string, string>, optional)
stop (string | string[], optional)
['\n\n', '###', 'END']
Extract content:
const response = await fetch(/* ... */);
const data = await response.json();
const content = data.choices[0].message.content;
const finishReason = data.choices[0].finish_reason;
const usage = data.usage;
Check for tool calls:
const toolCalls = data.choices[0].message.tool_calls;
if (toolCalls) {
// Model wants to call tools
for (const toolCall of toolCalls) {
const { name, arguments: args } = toolCall.function;
const parsedArgs = JSON.parse(args);
// Execute tool...
}
}
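After tool calls are detected, each result goes back to the model as a `tool` role message keyed by `tool_call_id`, following the OpenAI-compatible convention. The sketch below shows one way to do that; `runToolCalls` and the `handlers` map are hypothetical names for this example.

```javascript
// Hypothetical helper: runs each requested tool and returns the follow-up messages.
// `handlers` maps tool names to functions that take the parsed arguments.
async function runToolCalls(assistantMessage, handlers) {
  const results = [];
  for (const toolCall of assistantMessage.tool_calls ?? []) {
    const { name, arguments: args } = toolCall.function;
    let content;
    try {
      const handler = handlers[name];
      if (!handler) throw new Error(`Unknown tool: ${name}`);
      const output = await handler(JSON.parse(args));
      content = JSON.stringify(output ?? null);
    } catch (error) {
      // Report failures back to the model instead of aborting the loop
      content = JSON.stringify({ error: error.message });
    }
    results.push({ role: 'tool', tool_call_id: toolCall.id, content });
  }
  // The assistant message must precede its tool results in the next request
  return [assistantMessage, ...results];
}
```

Append the returned messages to the conversation and send another request. Repeat until the model answers without `tool_calls`.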
Process SSE stream:
let fullContent = '';
let buffer = '';
const response = await fetch(/* ... */);
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters intact across chunk boundaries
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop(); // Keep the unfinished last line for the next read
  for (const line of lines) {
    if (!line.startsWith('data: ')) continue; // Skip blank and comment lines
    const data = line.slice(6);
    if (data === '[DONE]') break;
    const parsed = JSON.parse(data);
    const content = parsed.choices?.[0]?.delta?.content;
    if (content) {
      fullContent += content;
      // Process incrementally...
    }
    // Handle usage in final chunk
    if (parsed.usage) {
      console.log('Usage:', parsed.usage);
    }
  }
}
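Network chunks do not always end on line boundaries, so a single `data:` event may arrive split across two reads. A minimal sketch of a stateful parser that buffers partial lines follows; `createSSEParser` is a hypothetical helper, and it assumes every `data:` payload other than `[DONE]` is JSON.

```javascript
// Hypothetical helper: a stateful parser that tolerates events split across chunks.
function createSSEParser() {
  let buffer = '';
  let done = false;
  return {
    // Feed decoded text; returns the JSON payloads completed by this chunk.
    feed(text) {
      const events = [];
      if (done) return events;
      buffer += text;
      const lines = buffer.split('\n');
      buffer = lines.pop(); // keep the unfinished trailing line for the next chunk
      for (const rawLine of lines) {
        const line = rawLine.trimEnd(); // drop a trailing '\r' if present
        if (!line.startsWith('data: ')) continue; // skip blank and comment lines
        const data = line.slice(6);
        if (data === '[DONE]') {
          done = true;
          break;
        }
        events.push(JSON.parse(data));
      }
      return events;
    },
    get finished() {
      return done;
    },
  };
}
```

Inside a read loop, call `parser.feed(decoder.decode(value, { stream: true }))` and handle each returned event. Because the parser does no I/O, it can be tested with plain strings.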
Handle streaming tool calls:
// Tool calls stream across multiple chunks
let currentToolCall = null;
let toolArgs = '';
for (const parsed of chunks) {
const toolCallChunk = parsed.choices?.[0]?.delta?.tool_calls?.[0];
if (toolCallChunk?.function?.name) {
currentToolCall = { id: toolCallChunk.id, ...toolCallChunk.function };
}
if (toolCallChunk?.function?.arguments) {
toolArgs += toolCallChunk.function.arguments;
}
if (parsed.choices?.[0]?.finish_reason === 'tool_calls' && currentToolCall) {
// Complete tool call
currentToolCall.arguments = toolArgs;
// Execute tool...
}
}
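The loop above tracks only the first tool call. When the model requests several tools in one turn, fragments can be grouped by position instead. The sketch below assumes each streamed fragment carries an `index` field, as in the OpenAI-compatible streaming format; `accumulateToolCalls` is a hypothetical helper name.

```javascript
// Hypothetical helper: merges streamed tool-call fragments into complete calls.
// Assumes each fragment carries an `index` identifying which call it belongs to.
function accumulateToolCalls(chunks) {
  const calls = [];
  for (const parsed of chunks) {
    for (const fragment of parsed.choices?.[0]?.delta?.tool_calls ?? []) {
      const slot = (calls[fragment.index ?? 0] ??= { id: '', name: '', arguments: '' });
      if (fragment.id) slot.id = fragment.id;
      if (fragment.function?.name) slot.name = fragment.function.name;
      if (fragment.function?.arguments) slot.arguments += fragment.function.arguments;
    }
  }
  // Arguments are only valid JSON once every fragment has arrived
  return calls
    .filter(Boolean)
    .map(call => ({ ...call, arguments: JSON.parse(call.arguments || '{}') }));
}
```

Run it over the parsed chunks once `finish_reason` is `'tool_calls'`, then execute each returned call.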
const { usage } = data;
console.log(`Prompt: ${usage.prompt_tokens}`);
console.log(`Completion: ${usage.completion_tokens}`);
console.log(`Total: ${usage.total_tokens}`);
// Cost (if available)
if (usage.cost) {
console.log(`Cost: $${usage.cost.toFixed(6)}`);
}
// Detailed breakdown
console.log(usage.prompt_tokens_details);
console.log(usage.completion_tokens_details);
400 Bad Request
401 Unauthorized
403 Forbidden
402 Payment Required
408 Request Timeout
429 Rate Limited
502 Bad Gateway
503 Service Unavailable
Exponential backoff:
// Returns parsed JSON on success, or the Response object when the request fails
async function requestWithRetry(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      const response = await fetch(url, options);
      if (response.ok) {
        return await response.json();
      }
      // Retry on timeout, rate limit, or server errors
      const retryable = response.status === 408 || response.status === 429 || response.status >= 500;
      // Don't retry other errors, and hand back the last response once retries run out
      if (!retryable || attempt === maxRetries - 1) {
        return response;
      }
    } catch (error) {
      // Network failure: rethrow once retries run out
      if (attempt === maxRetries - 1) throw error;
    }
    // Exponential backoff: 1s, 2s, 4s, capped at 10s
    const delay = Math.min(1000 * Math.pow(2, attempt), 10000);
    await new Promise(resolve => setTimeout(resolve, delay));
  }
}
Retryable status codes: 408, 429, 502, 503
Do not retry: 400, 401, 403, 402
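The retry rules can be factored into two small pure functions, which keeps the policy testable without network calls. This is a sketch: `isRetryable` and `backoffDelay` are hypothetical names, and the `jitter` option is an addition not described in the text above.

```javascript
// Status codes listed above as retryable
const RETRYABLE_STATUS = new Set([408, 429, 502, 503]);

function isRetryable(status) {
  return RETRYABLE_STATUS.has(status);
}

// Exponential backoff: baseMs, 2x, 4x, ... capped at maxMs.
// `jitter` (0..1) is an assumption added here; the default of 0 gives deterministic delays.
function backoffDelay(attempt, { baseMs = 1000, maxMs = 10000, jitter = 0 } = {}) {
  const exponential = Math.min(baseMs * 2 ** attempt, maxMs);
  // Subtract up to `jitter` of the delay so clients do not all retry in lockstep
  return Math.round(exponential * (1 - jitter * Math.random()));
}
```

With the defaults, attempts 0, 1, and 2 wait 1000 ms, 2000 ms, and 4000 ms. Later attempts are capped at 10000 ms.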
Use model fallbacks:
{
models: [
'anthropic/claude-3.5-sonnet', // Primary
'openai/gpt-4o', // Fallback 1
'google/gemini-2.0-flash' // Fallback 2
]
}
Handle partial failures:
Good use cases:
Implementation pattern:
Define the available tools in the tools array
Check whether tool_calls is present in the response
See: references/ADVANCED_PATTERNS.md for complete agentic loop implementation
Good use cases:
Implementation pattern:
Set response_format: { type: 'json_schema', json_schema: { ... } }
Add response healing for robustness:
{
response_format: { /* ... */ },
plugins: [{ id: 'response-healing' }]
}
Good use cases:
Simple implementation (variant):
{
model: 'anthropic/claude-3.5-sonnet:online'
}
Advanced implementation (plugin):
{
model: 'openrouter/auto',
plugins: [{
id: 'web',
enabled: true,
max_results: 5,
engine: 'exa' // or 'native'
}]
}
Images (vision):
openai/gpt-4o, anthropic/claude-3.5-sonnet, google/gemini-2.5-pro
Audio:
Video:
PDFs:
file-parser plugin
Implementation: See references/ADVANCED_PATTERNS.md for multimodal patterns
Start with: anthropic/claude-3.5-sonnet or openai/gpt-4o
Switch based on needs:
Fast responses: openai/gpt-4o-mini:nitro or google/gemini-2.0-flash
Complex reasoning: anthropic/claude-opus-4:thinking
Current information: :online variant
Large context: :extended variant
Cost-sensitive: :free variant
{
model: 'anthropic/claude-3.5-sonnet',
messages: [...],
temperature: 0.6, // Balanced creativity
max_tokens: 1000, // Reasonable length
top_p: 0.95 // Common for quality
}
Adjust based on task:
temperature: 0.2
temperature: 1.0
temperature: 0.0-0.3
Always prefer streaming when:
Use non-streaming when:
Tools: Enable when you need external data or actions
Structured outputs: Enable when response format matters
Web search: Enable when current information needed
Streaming: Enable for user-facing, real-time responses
Model fallbacks: Enable when reliability critical
Provider routing: Enable when you have preferences or constraints
Use free models for:
Use routing to optimize:
{
provider: {
order: ['openai', 'anthropic'],
sort: 'price', // Optimize for cost
allow_fallbacks: true
}
}
Set max_tokens to prevent runaway responses
Use caching via user and session_id parameters
Enable prompt caching when supported
Reduce latency:
Use :nitro variants for speed
Set a user ID for caching benefits
Increase throughput:
Use sort: 'throughput'
Optimize for specific metrics:
{
provider: {
sort: 'latency' // or 'price' or 'throughput'
}
}
For detailed reference information, consult:
File: references/PARAMETERS.md
File: references/ERROR_CODES.md
File: references/MODEL_SELECTION.md
File: references/ROUTING_STRATEGIES.md
File: references/ADVANCED_PATTERNS.md
File: references/EXAMPLES.md
Directory: templates/
basic-request.ts - Minimal working request
streaming-request.ts - SSE streaming with cancellation
tool-calling.ts - Complete agentic loop with tools
structured-output.ts - JSON Schema enforcement
error-handling.ts - Robust retry logic
{
model: 'anthropic/claude-3.5-sonnet',
messages: [{ role: 'user', content: 'Your prompt' }]
}
{
model: 'anthropic/claude-3.5-sonnet',
messages: [{ role: 'user', content: '...' }],
stream: true
}
{
model: 'anthropic/claude-3.5-sonnet',
messages: [{ role: 'user', content: '...' }],
tools: [{ type: 'function', function: { name, description, parameters } }],
tool_choice: 'auto'
}
{
model: 'anthropic/claude-3.5-sonnet',
messages: [{ role: 'system', content: 'Output JSON only...' }],
response_format: { type: 'json_object' }
}
{
model: 'anthropic/claude-3.5-sonnet:online',
messages: [{ role: 'user', content: '...' }]
}
{
models: ['anthropic/claude-3.5-sonnet', 'openai/gpt-4o'],
messages: [{ role: 'user', content: '...' }]
}
Remember: OpenRouter is OpenAI-compatible. Use the OpenAI SDK with baseURL: 'https://openrouter.ai/api/v1' for a familiar experience.