firecrawl-knowledge-base
Build a knowledge base from web content with Firecrawl. Use for local reference docs, RAG-ready chunks, fine-tuning datasets, documentation mirrors, topic corpora, or LLM-ready markdown organized from web sources.
| name | firecrawl-knowledge-base |
| description | Build a knowledge base from web content with Firecrawl. Use for local reference docs, RAG-ready chunks, fine-tuning datasets, documentation mirrors, topic corpora, or LLM-ready markdown organized from web sources. |
| license | ISC |
| metadata | {"author":"firecrawl","version":"0.1.0","homepage":"https://www.firecrawl.dev","source":"https://github.com/firecrawl/firecrawl-workflows"} |
| inputs | [{"name":"FIRECRAWL_API_KEY","description":"Firecrawl API key for hosted Firecrawl requests.","required":true}] |
Use this to turn URLs or topics into organized LLM-ready content.
Infer the source, goal, depth, and output location from context. If the source and goal are clear, proceed immediately.
Ask at most three concise questions, and only if blocked: for example, the source URL or topic, whether the output is meant for reference, RAG, training, or docs, or the training format if training is requested.
Use Firecrawl map to discover pages on documentation sites and Firecrawl search to find sources for topic-based corpora. Scrape each page into markdown, preserving code examples and tables.
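The choice between map and search described above can be sketched as a small dispatcher. This is an illustration only: `collect`, `is_url`, and the three callables `map_site`, `search_topic`, and `scrape_markdown` are hypothetical stand-ins for whichever Firecrawl client or CLI is in use, not names from the Firecrawl SDK, and the `limit` default is an assumption.

```python
from typing import Callable, Dict, List
from urllib.parse import urlsplit

# Hypothetical stand-ins for real Firecrawl calls (assumed, not SDK names).
ListUrls = Callable[[str], List[str]]    # source -> candidate page URLs
ScrapeMarkdown = Callable[[str], str]    # page URL -> markdown text


def is_url(source: str) -> bool:
    """Treat the source as a site when it parses as an http(s) URL."""
    parts = urlsplit(source.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def collect(
    source: str,
    map_site: ListUrls,
    search_topic: ListUrls,
    scrape_markdown: ScrapeMarkdown,
    limit: int = 50,
) -> Dict[str, str]:
    """Return {url: markdown}: map for sites, search for topics, then scrape."""
    discover = map_site if is_url(source) else search_topic
    pages: Dict[str, str] = {}
    for url in discover(source)[:limit]:
        if url not in pages:  # skip duplicate URLs from discovery
            pages[url] = scrape_markdown(url)
    return pages
```

Passing the Firecrawl calls in as arguments keeps the routing logic testable offline; real map, search, and scrape calls would be wrapped to match these signatures.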
For files, follow the Firecrawl download-style convention:

    .firecrawl/
      <hostname>/
        <path>/
          index.md
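One way to derive that location from a page URL is sketched below. The helper name `local_path_for` is hypothetical, and its handling of edge cases is an assumption rather than documented Firecrawl behavior: query strings and fragments are ignored, empty or dot segments are dropped, and a site root maps directly to `<hostname>/index.md`.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def local_path_for(url: str, root: str = ".firecrawl") -> PurePosixPath:
    """Map a page URL to <root>/<hostname>/<path>/index.md."""
    parts = urlsplit(url)
    if not parts.hostname:
        raise ValueError(f"URL has no hostname: {url!r}")
    # Drop empty and dot segments so trailing slashes or "../" cannot
    # create blank directories or escape the output root.
    segments = [s for s in parts.path.split("/") if s and s not in (".", "..")]
    return PurePosixPath(root, parts.hostname, *segments, "index.md")


print(local_path_for("https://docs.example.com/guides/auth/"))
# .firecrawl/docs.example.com/guides/auth/index.md
```

Giving every page its own directory with an `index.md` means a page and its child pages can coexist without filename collisions.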
If appropriate, use sub-agents or equivalent parallel task runners.

Output files include index.md and sources.json; manifest.json; and training-data.jsonl and training-metadata.json.

# Knowledge Base: [Source]
## Summary
[What was collected and why]
## Output Structure
[Files/directories created]
## Coverage
[Sections, source types, counts]
## Usage Notes
[How to use in RAG, docs, training, or agent context]
## Sources
[URLs collected]
## Rerun Inputs
workflow: firecrawl-knowledge-base
source: [url/topic]
goal: [reference/rag/train/docs]
depth: [quick/thorough/exhaustive]
output_dir: [.firecrawl/]
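If a later run needs to read these inputs back, a small parser along the following lines could validate them. This is a sketch under assumptions: the function name `parse_rerun_inputs` is hypothetical, it expects each bracketed placeholder to have been replaced by one concrete value, and the allowed values for `goal` and `depth` are taken from the options listed in the template above.

```python
from typing import Dict

# Allowed values mirror the bracketed options in the Rerun Inputs template.
ALLOWED = {
    "goal": {"reference", "rag", "train", "docs"},
    "depth": {"quick", "thorough", "exhaustive"},
}
REQUIRED = ("workflow", "source", "goal", "depth", "output_dir")


def parse_rerun_inputs(text: str) -> Dict[str, str]:
    """Parse `key: value` lines and check required keys and allowed values."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank lines and headings such as "## Rerun Inputs"
        # Split on the first colon only, so URLs in the value stay intact.
        key, _, value = line.partition(":")
        values[key.strip()] = value.strip()
    missing = [key for key in REQUIRED if not values.get(key)]
    if missing:
        raise ValueError(f"missing rerun inputs: {', '.join(missing)}")
    for key, allowed in ALLOWED.items():
        if values[key] not in allowed:
            raise ValueError(f"{key} must be one of {sorted(allowed)}, got {values[key]!r}")
    return values
```

For example, a block with `source: https://docs.example.com`, `goal: rag`, and `depth: quick` parses into a plain dictionary, while an unknown goal raises `ValueError`.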