Ollama API Documentation
npx skills add https://github.com/rawveg/skillsforge-marketplace --skill ollama
Copy and paste this command into Claude Code to install the skill.
| name | ollama |
| description | Ollama API Documentation |
Comprehensive assistance with Ollama development - the local AI model runtime for running and interacting with large language models programmatically.
This skill should be triggered when:
Generate a simple chat response:
curl http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [
    {
      "role": "user",
      "content": "Why is the sky blue?"
    }
  ]
}'
Generate a text response from a prompt:
curl http://localhost:11434/api/generate -d '{
  "model": "gemma3",
  "prompt": "Why is the sky blue?"
}'
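Both calls above stream their reply by default as newline-delimited JSON, one object per line, with a final object whose `done` field is true. A minimal sketch of reassembling such a stream in Python follows; the helper name `collect_stream` and the sample lines are illustrative assumptions, not part of Ollama.

```python
import json
from typing import Iterable


def collect_stream(lines: Iterable[str]) -> str:
    """Join the text fragments from a streamed Ollama reply.

    Assumes each line is one JSON object: /api/generate chunks carry text in
    "response", /api/chat chunks carry it in message.content.
    """
    parts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        chunk = json.loads(line)
        if "error" in chunk:
            raise RuntimeError(chunk["error"])
        if "message" in chunk:
            parts.append(chunk["message"].get("content", ""))
        else:
            parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)


# Hand-written sample chunks in the /api/generate shape (not real model output).
sample = [
    '{"model": "gemma3", "response": "Rayleigh ", "done": false}',
    '{"model": "gemma3", "response": "scattering.", "done": false}',
    '{"model": "gemma3", "response": "", "done": true}',
]
print(collect_stream(sample))  # prints: Rayleigh scattering.
```

With a live server, the lines could come from iterating over the HTTP response body instead of a list.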
Use Ollama with the OpenAI Python library:
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',  # required but ignored
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': 'Say this is a test',
        }
    ],
    model='llama3.2',
)
Ask questions about images:
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1/", api_key="ollama")

response = client.chat.completions.create(
    model="llava",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": "data:image/png;base64,iVBORw0KG...",
                },
            ],
        }
    ],
    max_tokens=300,
)
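The `image_url` value above is a base64 data URL. A small sketch of building one from raw bytes, using only the standard library; the helper name `to_data_url` is hypothetical, and the PNG signature stands in for a real image file.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URL for the image_url field."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The 8-byte PNG file signature, used here only as a stand-in for a real image.
png_signature = b"\x89PNG\r\n\x1a\n"
print(to_data_url(png_signature))  # prints: data:image/png;base64,iVBORw0KGgo=
```

For a file on disk, the bytes could be read with `open(path, "rb").read()` before encoding.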
Create vector embeddings for text:
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

embeddings = client.embeddings.create(
    model="all-minilm",
    input=["why is the sky blue?", "why is the grass green?"],
)
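Embedding vectors like these are commonly compared with cosine similarity. The sketch below shows one way to compute it in plain Python; the function name is an assumption for illustration, and the choice of cosine similarity is a common convention rather than something this skill prescribes.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# With the response from the call above, the two inputs could be compared as:
# cosine_similarity(embeddings.data[0].embedding, embeddings.data[1].embedding)
```

Values near 1 indicate vectors pointing in nearly the same direction; values near 0 indicate unrelated directions.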
Get structured JSON responses:
from pydantic import BaseModel
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# Schema that the model's reply must follow
class FriendInfo(BaseModel):
    name: str
    age: int
    is_available: bool

class FriendList(BaseModel):
    friends: list[FriendInfo]

completion = client.beta.chat.completions.parse(
    temperature=0,
    model="llama3.1:8b",
    messages=[
        {"role": "user", "content": "Return a list of friends in JSON format"}
    ],
    response_format=FriendList,
)

# parsed holds a FriendList instance when the reply matched the schema
friends_response = completion.choices[0].message
if friends_response.parsed:
    print(friends_response.parsed)
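Where pydantic is not available, JSON text in the same shape can still be checked by hand. The following is a rough standard-library sketch, assuming the reply follows the FriendList layout above; the `Friend` dataclass, `parse_friends` helper, and sample string are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Friend:
    name: str
    age: int
    is_available: bool


def parse_friends(json_text: str) -> List[Friend]:
    """Parse {"friends": [...]} JSON and check each field's type."""
    data = json.loads(json_text)
    friends = []
    for item in data["friends"]:
        name, age, available = item["name"], item["age"], item["is_available"]
        # bool is a subclass of int in Python, so exclude it explicitly for age.
        age_ok = isinstance(age, int) and not isinstance(age, bool)
        if not (isinstance(name, str) and age_ok and isinstance(available, bool)):
            raise ValueError(f"unexpected field types in {item!r}")
        friends.append(Friend(name, age, available))
    return friends


# Hand-written sample reply, not real model output.
sample_reply = '{"friends": [{"name": "Ada", "age": 36, "is_available": true}]}'
print(parse_friends(sample_reply))
```

A missing key raises `KeyError` here, which may be acceptable for a quick check but would need friendlier handling in an application.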
Use Ollama with the OpenAI JavaScript library:
import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: "http://localhost:11434/v1/",
  apiKey: "ollama", // required but ignored
});

const chatCompletion = await openai.chat.completions.create({
  messages: [{ role: "user", content: "Say this is a test" }],
  model: "llama3.2",
});
Sign in to use cloud models:
# Sign in from CLI
ollama signin
# Then use cloud models
ollama run gpt-oss:120b-cloud
Or use API keys for direct cloud access:
export OLLAMA_API_KEY=your_api_key
curl https://ollama.com/api/generate \
  -H "Authorization: Bearer $OLLAMA_API_KEY" \
  -d '{
    "model": "gpt-oss:120b",
    "prompt": "Why is the sky blue?",
    "stream": false
  }'
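The same request can be prepared from Python with the standard library. This is a sketch under assumptions: the helper name `build_generate_request` is hypothetical, and it simply switches between the local and cloud base URLs depending on whether an API key is supplied.

```python
import json
import os
import urllib.request
from typing import Optional

LOCAL_API = "http://localhost:11434/api"
CLOUD_API = "https://ollama.com/api"


def build_generate_request(
    model: str, prompt: str, api_key: Optional[str] = None
) -> urllib.request.Request:
    """Prepare (but do not send) a non-streaming /api/generate request.

    With an api_key the request targets the cloud API and carries a Bearer
    token; without one it targets the local server and sends no auth header.
    """
    base = CLOUD_API if api_key else LOCAL_API
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{base}/generate", data=body.encode("utf-8"), headers=headers, method="POST"
    )


request = build_generate_request(
    "gpt-oss:120b", "Why is the sky blue?", os.environ.get("OLLAMA_API_KEY")
)
print(request.full_url)

# Sending it requires network access, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["response"])
```

Keeping request construction separate from sending makes the URL and headers easy to inspect before any network call is made.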
Set environment variables for server configuration:
macOS:
# Set environment variable
launchctl setenv OLLAMA_HOST "0.0.0.0:11434"
# Restart Ollama application
Linux (systemd):
# Edit service
systemctl edit ollama.service
# Add under [Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
# Reload and restart
systemctl daemon-reload
systemctl restart ollama
Windows:
1. Quit Ollama from task bar
2. Search "environment variables" in Settings
3. Edit or create OLLAMA_HOST variable
4. Set value: 0.0.0.0:11434
5. Restart Ollama from Start menu
Verify if your model is using GPU:
ollama ps
Output shows:
100% GPU - Fully loaded on GPU
100% CPU - Fully loaded in system memory
48%/52% CPU/GPU - Split between both
Local API base URL: http://localhost:11434/api
Cloud API base URL: https://ollama.com/api
OpenAI-compatible: /v1/ endpoints for OpenAI libraries
Local access: no authentication needed for http://localhost:11434
Cloud models: sign in (ollama signin) or use an API key
Direct cloud API: https://ollama.com/api
Local models (e.g., gemma3, llama3.2, qwen3)
Cloud models: names ending in -cloud (e.g., gpt-oss:120b-cloud, qwen3-coder:480b-cloud)
Vision models (e.g., llava)
OLLAMA_HOST - Change bind address (default: 127.0.0.1:11434)
OLLAMA_CONTEXT_LENGTH - Context window size (default: 2048 tokens)
OLLAMA_MODELS - Model storage directory
OLLAMA_ORIGINS - Allow additional web origins for CORS
HTTPS_PROXY - Proxy server for model downloads
Status Codes:
200 - Success
400 - Bad Request (invalid parameters)
404 - Not Found (model doesn't exist)
429 - Too Many Requests (rate limit)
500 - Internal Server Error
502 - Bad Gateway (cloud model unreachable)
Error Format:
{
"error": "the model failed to generate a response"
}
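A client can combine the status code with the `error` field to produce a readable message. The sketch below is one possible approach; the helper name `describe_error` is hypothetical, and the meanings are copied from the status code list above.

```python
import json

# Meanings copied from the status code list in this document.
STATUS_MEANINGS = {
    200: "Success",
    400: "Bad Request (invalid parameters)",
    404: "Not Found (model doesn't exist)",
    429: "Too Many Requests (rate limit)",
    500: "Internal Server Error",
    502: "Bad Gateway (cloud model unreachable)",
}


def describe_error(status: int, body: str) -> str:
    """Build a one-line description from an HTTP status and response body."""
    meaning = STATUS_MEANINGS.get(status, "Unknown status")
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = None  # body was not JSON; fall back to the status alone
    detail = payload.get("error") if isinstance(payload, dict) else None
    if detail:
        return f"{status} {meaning}: {detail}"
    return f"{status} {meaning}"


print(describe_error(500, '{"error": "the model failed to generate a response"}'))
# prints: 500 Internal Server Error: the model failed to generate a response
```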
Set "stream": false to get the complete response in one object.
This skill includes comprehensive documentation in references/:
llms-txt.md - Complete API reference covering:
API endpoints (/api/generate, /api/chat, /api/embed, etc.)
llms.md - Documentation index listing all available topics:
Use the reference files when you need:
Start with these common patterns:
Use the /api/generate endpoint with a prompt
Use /api/chat with a messages array
Use base_url='http://localhost:11434/v1/' with the OpenAI libraries
Run ollama ps to verify model loading
Read the "Introduction" and "Quickstart" sections of llms-txt.md for foundational concepts.
Focus on:
Check the specific API endpoints in llms-txt.md for detailed parameter options.
Explore:
Refer to platform-specific sections in llms.md and configuration details in llms-txt.md.
Building a chatbot:
Use the /api/chat endpoint
Creating embeddings for search:
Use the /api/embed endpoint
Running behind a firewall:
Set the HTTPS_PROXY environment variable
Using cloud models:
Run ollama signin once
Use model names with the -cloud suffix
Check:
ollama ps
Solutions:
Problem: Ollama only accessible from localhost
Solution:
# Set OLLAMA_HOST to bind to all interfaces
export OLLAMA_HOST="0.0.0.0:11434"
See "How do I configure Ollama server?" in llms-txt.md for platform-specific instructions.
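A client script can read the same OLLAMA_HOST variable to decide where to connect. The sketch below is an assumption-laden illustration: the helper name `client_base_url` is hypothetical, and it treats 0.0.0.0 as a bind-only address by connecting to 127.0.0.1 instead. It does not handle IPv6 literals.

```python
import os
from urllib.parse import urlsplit


def client_base_url(ollama_host: str = "") -> str:
    """Turn an OLLAMA_HOST-style value into a URL a client can connect to."""
    value = ollama_host.strip() or "127.0.0.1:11434"
    if "://" not in value:
        value = "http://" + value  # bare host:port values get a scheme
    parts = urlsplit(value)
    host = parts.hostname or "127.0.0.1"
    if host == "0.0.0.0":
        host = "127.0.0.1"  # 0.0.0.0 means "all interfaces" when binding
    default_port = 443 if parts.scheme == "https" else 11434
    port = parts.port or default_port
    return f"{parts.scheme}://{host}:{port}"


print(client_base_url(os.environ.get("OLLAMA_HOST", "")))
```

From another machine on the network, the server's actual address would be used in place of the loopback address.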
Problem: Cannot download models behind proxy
Solution:
# Set proxy (HTTPS only, not HTTP)
export HTTPS_PROXY=https://proxy.example.com
# Restart Ollama
See "How do I use Ollama behind a proxy?" in llms-txt.md.
Problem: Browser extension or web app cannot access Ollama
Solution:
# Allow specific origins
export OLLAMA_ORIGINS="chrome-extension://*,moz-extension://*"
See "How can I allow additional web origins?" in llms-txt.md.
# CLI Commands
ollama signin # Sign in to ollama.com
ollama run gemma3 # Run a model interactively
ollama pull gemma3 # Download a model
ollama ps # List running models
ollama list # List installed models
# Check API Status
curl http://localhost:11434/api/version
# Environment Variables (Common)
export OLLAMA_HOST="0.0.0.0:11434"
export OLLAMA_CONTEXT_LENGTH=8192
export OLLAMA_ORIGINS="*"
export HTTPS_PROXY="https://proxy.example.com"