| name | free-llm-apis |
| description | Guide users through obtaining and configuring free API keys for LLM providers. Use when the user wants to set up a free LLM API, get a free API key, connect to a free model provider, configure an OpenAI-compatible endpoint at no cost, or asks about free tiers for AI models. Triggers on "free API key", "free LLM", "set up Gemini/Groq/Mistral/etc.", "which free provider", "how to get an API key", "free model access", "configure LLM for free". |
Free LLM API Setup
Help users pick a free LLM provider and configure their API key. Every provider here has a permanent free tier, no credit card needed.
Provider Selection
Ask the user what matters most, then recommend accordingly:
| Priority | Best picks |
|---|
| Highest rate limits | Groq (30 RPM, 14.4K RPD), Cerebras (30 RPM, 14.4K RPD) |
| Largest model selection | Cloudflare Workers AI (49+ models), OpenRouter (32+ models) |
| Strongest proprietary models | Google Gemini (Gemini 2.5 Pro), GitHub Models (GPT-4o) |
| Fastest inference | Groq, Cerebras (both optimized for speed) |
| Highest token budget | Mistral AI (1B tokens/month) |
| European provider | Mistral AI (EU), LLM7.io (UK) |
| No signup required | LLM7.io (basic tier works without token) |
Provider categories
Provider APIs -- run by the companies that train the models:
Inference providers -- third-party platforms hosting open-weight models:
- GitHub Models, NVIDIA NIM, Groq, Cerebras, Cloudflare Workers AI, LLM7.io, Kluster AI, OpenRouter, Hugging Face
- See references/inference-providers.md for setup instructions.
Workflow
- Ask what models, rate limits, or features the user cares about. If they already know which provider they want, skip to step 3.
- Match their priorities against the table above. Suggest 1-2 options with a short reason.
- Load the right reference file and walk through the setup steps for that provider (API key, code example, env var).
- Offer a quick test script or curl command so they can confirm the key works.
Quick Test Template
After setup, use this to verify any provider:
from openai import OpenAI
client = OpenAI(api_key="KEY", base_url="BASE_URL")
response = client.chat.completions.create(
model="MODEL_NAME",
messages=[{"role": "user", "content": "Say hello in one sentence."}],
max_tokens=50
)
print(response.choices[0].message.content)
Replace KEY, BASE_URL, and MODEL_NAME with values from the provider's setup guide.
Key Notes
- All endpoints work with the OpenAI SDK unless noted.
- RPM = requests per minute. RPD = requests per day.
- Google Gemini's free tier is blocked in the EEA, UK, and Switzerland.
- "Limits undocumented" = the provider doesn't publish rate limits.
- Don't hardcode API keys. Use environment variables or a secrets manager.