| name | firecrawl |
| description | Web scraping and crawling for AI agents via Firecrawl MCP. Scrape URLs to markdown, crawl websites, search the web, and extract structured data. Supports cloud API and self-hosted deployments, including CRW (a Rust alternative). |
| version | 1.0.0 |
| allowed-tools | Read, Write, Edit, Grep, Glob, Bash, mcp_firecrawl_* |
What This Skill Does
Enables AI agents to interact with web content through the Firecrawl MCP server. Core capabilities:
- Scrape: Convert any URL to clean markdown, HTML, or structured JSON
- Crawl: Multi-page extraction with BFS traversal
- Search: Web search with optional content scraping
- Map: Discover all URLs on a website
- Extract: LLM-powered structured data extraction
- Agent: Autonomous multi-source research
When It Activates
Activate this skill when you see phrases like:
- "scrape this website" or "get content from this URL"
- "crawl the documentation" or "extract all pages from..."
- "search the web for..." with content extraction
- "find all URLs on this site"
- "extract structured data from this page"
- "what does this website say about..."
- "gather information from multiple pages"
Installation
Option A: Cloud (Fastest)
b00t mcp add firecrawl -- npx -y firecrawl-mcp
Option B: Self-Hosted (CRW - Recommended)
curl -fsSL https://raw.githubusercontent.com/us/crw/main/install.sh | CRW_BINARY=crw sh
b00t mcp add crw -- npx crw-mcp
Available Tools
Core Scraping
| Tool | Best For | Returns |
|---|---|---|
| firecrawl_scrape | Single URL extraction | markdown/JSON/HTML |
| firecrawl_batch_scrape | Multiple known URLs | markdown[] |
| firecrawl_map | Discover URLs on site | URL[] |
Web Discovery
| Tool | Best For | Returns |
|---|---|---|
| firecrawl_search | Find info across web | results[] |
| firecrawl_crawl | Multi-page extraction | markdown[] |
Advanced
| Tool | Best For | Returns |
|---|---|---|
| firecrawl_extract | Structured data extraction | JSON (schema-defined) |
| firecrawl_agent | Autonomous research | JSON (async) |
| firecrawl_interact | Click/navigate pages | execution result |
Usage Patterns
Quick Scrape (Markdown)
{
  "name": "firecrawl_scrape",
  "arguments": {
    "url": "https://docs.example.com/api",
    "formats": ["markdown"],
    "onlyMainContent": true
  }
}
Structured Extraction (JSON)
{
  "name": "firecrawl_scrape",
  "arguments": {
    "url": "https://example.com/product",
    "formats": [{
      "type": "json",
      "prompt": "Extract product details",
      "schema": {
        "type": "object",
        "properties": {
          "name": {"type": "string"},
          "price": {"type": "number"},
          "inStock": {"type": "boolean"}
        }
      }
    }]
  }
}
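When several pages need the same kind of structured extraction, it can help to generate the tool call rather than hand-write the schema each time. The following is a minimal sketch, not part of Firecrawl: `build_json_scrape_call` and the `_JSON_TYPES` mapping are hypothetical helpers, and the payload shape simply mirrors the example above.

```python
import json

# Hypothetical mapping from Python types to JSON Schema type names.
_JSON_TYPES = {str: "string", float: "number", int: "integer", bool: "boolean"}


def build_json_scrape_call(url, prompt, fields):
    """Build a firecrawl_scrape tool call that requests schema-guided JSON.

    `fields` maps each property name to a Python type, e.g. {"price": float}.
    """
    properties = {
        name: {"type": _JSON_TYPES[py_type]} for name, py_type in fields.items()
    }
    return {
        "name": "firecrawl_scrape",
        "arguments": {
            "url": url,
            "formats": [{
                "type": "json",
                "prompt": prompt,
                "schema": {"type": "object", "properties": properties},
            }],
        },
    }


call = build_json_scrape_call(
    "https://example.com/product",
    "Extract product details",
    {"name": str, "price": float, "inStock": bool},
)
print(json.dumps(call, indent=2))
```

Keeping the schema small is the point of the exercise: only the listed properties come back, which keeps the response compact.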
Web Search with Content
{
  "name": "firecrawl_search",
  "arguments": {
    "query": "best practices for async Rust 2025",
    "limit": 5,
    "scrapeOptions": {
      "formats": ["markdown"],
      "onlyMainContent": true
    }
  }
}
Site Crawling
{
  "name": "firecrawl_crawl",
  "arguments": {
    "url": "https://docs.example.com/*",
    "maxDepth": 2,
    "limit": 50
  }
}
Decision Guide
Know exact URL? → scrape (single) or batch_scrape (multiple)
Need to find URLs? → search (web) or map (site discovery)
Need all pages? → crawl (with limits!)
Want specific data? → scrape with JSON format + schema
Complex research? → agent (async, poll for results)
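The guide above can be sketched as a small lookup. This is only an illustration: the goal keywords (`known_url`, `find_on_web`, and so on) are hypothetical labels invented for this example, while the tool names come from the tables in Available Tools.

```python
def choose_tool(goal, url_count=1):
    """Map a goal keyword to a tool name, following the decision guide."""
    if goal == "known_url":
        # One known URL: scrape. Several known URLs: batch scrape.
        return "firecrawl_scrape" if url_count == 1 else "firecrawl_batch_scrape"

    table = {
        "find_on_web": "firecrawl_search",   # need to find URLs across the web
        "find_on_site": "firecrawl_map",     # need to discover URLs on one site
        "all_pages": "firecrawl_crawl",      # remember to set a limit
        "specific_data": "firecrawl_scrape",  # use JSON format + schema
        "research": "firecrawl_agent",       # async, poll for results
    }
    if goal not in table:
        raise ValueError(f"unknown goal: {goal!r}")
    return table[goal]


print(choose_tool("known_url", url_count=3))  # firecrawl_batch_scrape
print(choose_tool("find_on_site"))            # firecrawl_map
```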
Self-Hosted Options
CRW (Recommended)
| Metric | CRW | Firecrawl Self-Host |
|---|---|---|
| RAM | 6 MB | 4 GB+ |
| Containers | 0 | 5+ |
| Cold Start | 85ms | 30-60s |
| Setup | Single binary | Docker Compose |
# Embedded mode (no CRW_API_URL or CRW_API_KEY set)
npx crw-mcp
# Against a hosted CRW API endpoint
CRW_API_URL=https://fastcrw.com/api CRW_API_KEY=xxx npx crw-mcp
Firecrawl Self-Hosted
# Start a local Firecrawl instance
git clone https://github.com/firecrawl/firecrawl
cd firecrawl
docker compose up -d
# Point the MCP server at the local instance
FIRECRAWL_API_URL=http://localhost:3002 npx -y firecrawl-mcp
Important Notes
Format Selection
- JSON format (preferred): Use with schema to extract only needed data
- Markdown format: Only when full content is required (articles, summaries)
Token Management
- Use onlyMainContent: true to skip navigation/footer
- Set limit on crawls to avoid context overflow
- Use JSON extraction instead of full markdown when possible
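One way to make these habits hard to forget is to pass every tool call through a small guard before sending it. The sketch below is an assumption-laden illustration: `apply_token_guards` and its `max_pages` default of 50 are hypothetical, and only the `onlyMainContent` and `limit` arguments shown in the usage patterns are touched.

```python
def apply_token_guards(call, max_pages=50):
    """Return a copy of a tool call with conservative, token-saving arguments.

    The input call is not modified.
    """
    name = call["name"]
    args = dict(call.get("arguments", {}))  # shallow copy of the arguments

    if name == "firecrawl_scrape":
        # Skip navigation/footer unless the caller chose otherwise.
        args.setdefault("onlyMainContent", True)
    elif name == "firecrawl_crawl":
        # Always set a limit, and never exceed the cap.
        args["limit"] = min(args.get("limit", max_pages), max_pages)

    return {"name": name, "arguments": args}


crawl = {
    "name": "firecrawl_crawl",
    "arguments": {"url": "https://docs.example.com/*", "limit": 500},
}
print(apply_token_guards(crawl)["arguments"]["limit"])  # 50
```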
Rate Limits
- Built-in exponential backoff (configurable)
- Batch operations are async - poll for status
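To make the two points above concrete, here is a rough sketch of an exponential backoff schedule combined with a polling loop. Everything in it is illustrative: the parameter names and default values are assumptions rather than the server's actual settings, and `check_status` is a hypothetical callable standing in for whatever status tool your deployment exposes.

```python
import time


def backoff_delays(max_attempts=3, initial_delay=1.0, factor=2.0, max_delay=10.0):
    """Return the wait in seconds before each retry, growing up to a cap."""
    return [min(initial_delay * factor ** i, max_delay) for i in range(max_attempts)]


def poll_until_done(check_status, delays, sleep=time.sleep):
    """Call check_status() until it reports completion or delays run out.

    check_status must return a dict with a "status" key (assumed shape).
    The last result is returned even if the job never completed.
    """
    for delay in delays:
        result = check_status()
        if result.get("status") == "completed":
            return result
        sleep(delay)
    return check_status()


print(backoff_delays(max_attempts=5))  # [1.0, 2.0, 4.0, 8.0, 10.0]
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; in normal use the default `time.sleep` applies.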
Troubleshooting
"API key required" - Set FIRECRAWL_API_KEY env var or use CRW in embedded mode
"Rate limited" - Wait and retry; if it persists, increase FIRECRAWL_RETRY_MAX_ATTEMPTS
"Context too large" - Use JSON format with schema, reduce crawl depth
"Self-hosted connection refused" - Verify Docker containers are running, check FIRECRAWL_API_URL
Related Skills
- context7: Live library documentation
- browser-use: Interactive browser automation
- playwright: UI testing and scraping