firecrawl
Web scraping and crawling for AI agents via Firecrawl MCP. Scrape URLs to markdown, crawl websites, search the web, and extract structured data. Supports cloud API and self-hosted deployments including CRW (Rust alternative).
| Field | Value |
|---|---|
| name | firecrawl |
| description | Web scraping and crawling for AI agents via Firecrawl MCP. Scrape URLs to markdown, crawl websites, search the web, and extract structured data. Supports cloud API and self-hosted deployments including CRW (Rust alternative). |
| version | 1.0.0 |
| allowed-tools | Read, Write, Edit, Grep, Glob, Bash, mcp_firecrawl_* |
Enables AI agents to interact with web content through the Firecrawl MCP server. Core capabilities: scraping URLs to markdown, crawling websites, searching the web, and extracting structured data.
Activate this skill when you see phrases like:
# Get API key from firecrawl.dev/app/api-keys
# Add to MCP config:
b00t mcp add firecrawl -- npx -y firecrawl-mcp
# Set env: FIRECRAWL_API_KEY=fc-YOUR_KEY
# Single binary, 6MB RAM, no server needed
curl -fsSL https://raw.githubusercontent.com/us/crw/main/install.sh | CRW_BINARY=crw sh
# Add to MCP config:
b00t mcp add crw -- npx crw-mcp
# No API key required in embedded mode
| Tool | Best For | Returns |
|---|---|---|
| firecrawl_scrape | Single URL extraction | markdown/JSON/HTML |
| firecrawl_batch_scrape | Multiple known URLs | markdown[] |
| firecrawl_map | Discover URLs on site | URL[] |
| Tool | Best For | Returns |
|---|---|---|
| firecrawl_search | Find info across web | results[] |
| firecrawl_crawl | Multi-page extraction | markdown[] |
| Tool | Best For | Returns |
|---|---|---|
| firecrawl_extract | Structured data extraction | JSON (schema-defined) |
| firecrawl_agent | Autonomous research | JSON (async) |
| firecrawl_interact | Click/navigate pages | execution result |
{
"name": "firecrawl_scrape",
"arguments": {
"url": "https://docs.example.com/api",
"formats": ["markdown"],
"onlyMainContent": true
}
}
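A client that speaks MCP directly wraps a tool call like the one above in a JSON-RPC 2.0 `tools/call` request. The following is a minimal sketch of that envelope; the helper name `build_tool_call` and the request id are illustrative assumptions, and the transport (stdio or HTTP) is left out.

```python
import json

def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Wrap a tool name and its arguments in an MCP JSON-RPC 2.0 request.

    build_tool_call is a hypothetical helper, not part of firecrawl-mcp.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

request = build_tool_call(
    "firecrawl_scrape",
    {
        "url": "https://docs.example.com/api",
        "formats": ["markdown"],
        "onlyMainContent": True,
    },
)
print(json.dumps(request, indent=2))
```

Most agent frameworks build this envelope for you, so the sketch is mainly useful when debugging raw traffic to the server.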
{
"name": "firecrawl_scrape",
"arguments": {
"url": "https://example.com/product",
"formats": [{
"type": "json",
"prompt": "Extract product details",
"schema": {
"type": "object",
"properties": {
"name": {"type": "string"},
"price": {"type": "number"},
"inStock": {"type": "boolean"}
}
}
}]
}
}
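Model-driven extraction can return values that do not fit the schema, so it may be worth checking the result before using it. The sketch below hand-rolls a check for the three primitive types used in the schema above. `check_against_schema` is a hypothetical helper, and it treats every listed property as required, which is stricter than the schema itself.

```python
PRODUCT_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "inStock": {"type": "boolean"},
    },
}

# Only the primitive types that appear in the schema above are covered.
TYPE_MAP = {"string": str, "number": (int, float), "boolean": bool}

def check_against_schema(data: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the data fits."""
    problems = []
    for key, spec in schema.get("properties", {}).items():
        if key not in data:
            # Assumption: every listed property is treated as required.
            problems.append(f"missing: {key}")
            continue
        value = data[key]
        # bool is a subclass of int in Python, so reject it for "number".
        is_bool_as_number = spec["type"] == "number" and isinstance(value, bool)
        if is_bool_as_number or not isinstance(value, TYPE_MAP[spec["type"]]):
            problems.append(f"wrong type: {key}")
    return problems

print(check_against_schema({"name": "Widget", "price": "19.99"}, PRODUCT_SCHEMA))
```

For anything beyond flat objects, a full JSON Schema validator is a better fit than this hand-written check.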
{
"name": "firecrawl_search",
"arguments": {
"query": "best practices for async Rust 2025",
"limit": 5,
"scrapeOptions": {
"formats": ["markdown"],
"onlyMainContent": true
}
}
}
{
"name": "firecrawl_crawl",
"arguments": {
"url": "https://docs.example.com/*",
"maxDepth": 2,
"limit": 50
}
}
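Because crawl output grows quickly with depth and page count, one option is to clamp the arguments before sending them. This is a sketch only: `bounded_crawl_args` and its caps (`depth_cap`, `page_cap`) are hypothetical values chosen for illustration, not limits enforced by Firecrawl.

```python
def bounded_crawl_args(url: str, max_depth: int = 2, limit: int = 50,
                       depth_cap: int = 3, page_cap: int = 100) -> dict:
    """Build firecrawl_crawl arguments with depth and page count clamped.

    The caps are illustrative assumptions; tune them to your context budget.
    """
    return {
        "url": url,
        "maxDepth": max(1, min(max_depth, depth_cap)),
        "limit": max(1, min(limit, page_cap)),
    }

# An over-eager request is pulled back inside the caps.
args = bounded_crawl_args("https://docs.example.com/*", max_depth=10, limit=5000)
print(args)
```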
Know exact URL? → scrape (single) or batch_scrape (multiple)
Need to find URLs? → search (web) or map (site discovery)
Need all pages? → crawl (with limits!)
Want specific data? → scrape with JSON format + schema
Complex research? → agent (async, poll for results)
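The decision guide above can be sketched as a small routing function. The function name, its parameters, and the order in which the rules are checked are assumptions made for illustration. Structured extraction is not a separate branch because it uses `firecrawl_scrape` with a JSON format and schema.

```python
from typing import Optional, Sequence

def choose_tool(urls: Sequence[str] = (), discover: Optional[str] = None,
                all_pages: bool = False, research: bool = False) -> str:
    """Pick a Firecrawl MCP tool name from a rough description of the task.

    Hypothetical helper: the precedence of the checks is an assumption.
    """
    if research:
        return "firecrawl_agent"         # async, poll for results
    if all_pages:
        return "firecrawl_crawl"         # always pair with limits
    if len(urls) == 1:
        return "firecrawl_scrape"        # exact URL known
    if len(urls) > 1:
        return "firecrawl_batch_scrape"  # several known URLs
    if discover == "site":
        return "firecrawl_map"           # find URLs on one site
    return "firecrawl_search"            # find URLs across the web

print(choose_tool(urls=["https://example.com/product"]))
```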
| Metric | CRW | Firecrawl Self-Host |
|---|---|---|
| RAM | 6 MB | 4 GB+ |
| Containers | 0 | 5+ |
| Cold Start | 85ms | 30-60s |
| Setup | Single binary | Docker Compose |
# CRW embedded mode - no server, no config
npx crw-mcp
# CRW cloud mode - adds web search
CRW_API_URL=https://fastcrw.com/api CRW_API_KEY=xxx npx crw-mcp
git clone https://github.com/firecrawl/firecrawl
cd firecrawl
docker compose up -d
# MCP config:
FIRECRAWL_API_URL=http://localhost:3002 npx -y firecrawl-mcp
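The deployment options above differ only in the command and the environment variables passed to it. The sketch below generates a server entry in the `command` / `args` / `env` shape that many MCP clients use in their config files. The mode names and the helper `mcp_server_entry` are hypothetical; check your client's documentation for the exact file layout it expects.

```python
import json

def mcp_server_entry(mode: str, api_key: str = "", api_url: str = "") -> dict:
    """Return a command/args/env entry for one deployment mode.

    Mode names are illustrative assumptions, not Firecrawl or CRW terminology.
    """
    if mode == "firecrawl-cloud":
        return {"command": "npx", "args": ["-y", "firecrawl-mcp"],
                "env": {"FIRECRAWL_API_KEY": api_key}}
    if mode == "firecrawl-self-hosted":
        return {"command": "npx", "args": ["-y", "firecrawl-mcp"],
                "env": {"FIRECRAWL_API_URL": api_url}}
    if mode == "crw-embedded":
        # Embedded mode needs no API key and no server.
        return {"command": "npx", "args": ["crw-mcp"], "env": {}}
    if mode == "crw-cloud":
        return {"command": "npx", "args": ["crw-mcp"],
                "env": {"CRW_API_URL": api_url, "CRW_API_KEY": api_key}}
    raise ValueError(f"unknown mode: {mode}")

entry = mcp_server_entry("firecrawl-self-hosted", api_url="http://localhost:3002")
print(json.dumps({"mcpServers": {"firecrawl": entry}}, indent=2))
```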
Use onlyMainContent: true to skip navigation and footer content.
Set a limit on crawls to avoid context overflow.
"API key required" - Set FIRECRAWL_API_KEY env var or use CRW in embedded mode
"Rate limited" - Increase FIRECRAWL_RETRY_MAX_ATTEMPTS, wait and retry
"Context too large" - Use JSON format with schema, reduce crawl depth
"Self-hosted connection refused" - Verify Docker containers are running, check FIRECRAWL_API_URL
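For the "Rate limited" case, a caller can also wait and retry on its own side. This is a minimal sketch of exponential backoff under stated assumptions: `RateLimited` is a hypothetical exception that your client code would raise on a rate-limit response, and the delays are placeholders. The `sleep` parameter is injectable so the demo runs without actually waiting.

```python
import time

class RateLimited(Exception):
    """Hypothetical error raised by the caller's client when rate limited."""

def call_with_backoff(call, max_attempts: int = 3, initial_delay: float = 1.0,
                      factor: float = 2.0, sleep=time.sleep):
    """Run call(), retrying on RateLimited with exponentially growing waits."""
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts:
                raise  # out of attempts, surface the error
            sleep(delay)
            delay *= factor

# Demo: a stand-in scrape that is rate limited twice, then succeeds.
attempts = {"count": 0}

def flaky_scrape():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimited("Rate limited")
    return "# Page title"

waits = []
result = call_with_backoff(flaky_scrape, sleep=waits.append)
print(result, waits)
```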