serving-the-api

Updated June 25, 2026 at 12:51

Use when the user wants a long-running HTTP service for scrape/crawl/map instead of one-shot CLI calls or the MCP server — for example wiring crawlberg into other apps over REST. Covers `crawlberg serve`, the Firecrawl-v1-compatible endpoints, `--host`/`--port`, and when to prefer it.


Source: xberg-io/plugins

SKILL.md

readonly


Serving the API

`crawlberg serve` starts a long-running REST API server that exposes the crawl engine over HTTP. The endpoints are Firecrawl v1-compatible, so existing Firecrawl clients and SDKs can point at a local crawlberg instance.

This subcommand requires a binary built with the optional `api` feature, which is not enabled by default. The Homebrew tap is built with all features, so `serve` is available there; a from-source build needs `cargo install --git https://github.com/xberg-io/crawlberg crawlberg-cli --features api` (or `--features all`) to include it.

Quick recipe

crawlberg serve --host 127.0.0.1 --port 3000

It logs `Starting REST API server on <host>:<port>` to stderr and blocks until terminated. Send `GET /health` to confirm it is up.
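When the server is started from a script, the health check can be turned into a small wait loop. The following is a minimal sketch, not part of crawlberg: `wait_for_health`, its arguments, and the retry defaults are illustrative names chosen here, and it assumes `curl` is installed.

```shell
#!/bin/sh
# Sketch: block until GET /health answers, or give up after a fixed number of tries.
# wait_for_health and its defaults are illustrative, not part of crawlberg.
# Arguments: base URL, number of attempts (default 30), delay in seconds (default 1).
wait_for_health() (
  base_url=$1
  attempts=${2:-30}
  delay=${3:-1}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP error statuses; output is discarded.
    if curl -fsS --max-time 2 "$base_url/health" >/dev/null 2>&1; then
      exit 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  exit 1
)
```

A typical use would be to start `crawlberg serve --host 127.0.0.1 --port 3000 &` in the background and then run `wait_for_health http://127.0.0.1:3000 && echo ready` before sending the first request.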

Flags

| Flag | Default | Purpose |
| --- | --- | --- |
| `--host` | `0.0.0.0` | Address to bind to. The default listens on all interfaces. |
| `--port` | `3000` | Port to listen on. |

`serve` does not take the per-request crawl flags. It boots with the default `CrawlConfig` and accepts per-request options in each HTTP request body.

Endpoints

Firecrawl v1 surface:

| Method + path | Purpose |
| --- | --- |
| `POST /v1/scrape` | Scrape a single URL (synchronous). |
| `POST /v1/crawl` | Start an asynchronous crawl job; returns a job `id`. |
| `GET /v1/crawl/{id}` | Poll crawl job status and collect results. |
| `DELETE /v1/crawl/{id}` | Cancel a running crawl job. |
| `POST /v1/map` | Discover URLs on a site (synchronous). |
| `POST /v1/batch/scrape` | Start an asynchronous batch-scrape job; returns a job `id`. |
| `GET /v1/batch/scrape/{id}` | Poll batch-scrape job status and results. |
| `POST /v1/download` | Download a document from a URL. |
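The table lists a cancel endpoint for crawl jobs. A minimal sketch of calling it from a shell follows; `BASE_URL` and `cancel_crawl` are names made up for this example, and because the shape of the cancel response is not described here, the function simply prints whatever the server returns.

```shell
#!/bin/sh
# Sketch: cancel a running crawl job via DELETE /v1/crawl/{id}.
# BASE_URL and cancel_crawl are illustrative names, not part of crawlberg.
BASE_URL=${BASE_URL:-http://127.0.0.1:3000}

# Send the DELETE request for job id $1 and print the raw response body.
cancel_crawl() {
  curl -s -X DELETE "$BASE_URL/v1/crawl/$1"
}
```

One way to use it is `trap 'cancel_crawl "$id"' INT TERM` in a script that started a crawl, so that interrupting the script also cancels the job on the server.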

Operational:

| Method + path | Purpose |
| --- | --- |
| `GET /health` | Health check (`{ status, version }`). |
| `GET /version` | Version information. |
| `GET /openapi.json` | OpenAPI schema for the whole surface. |
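The schema endpoint can be used to check which routes a given build actually serves. This is a small sketch under two assumptions: `jq` is installed, and the served document follows the standard OpenAPI layout with a top-level `paths` object. `BASE_URL` and `list_paths` are illustrative names.

```shell
#!/bin/sh
# Sketch: list the route paths advertised by GET /openapi.json.
# Assumes the standard OpenAPI top-level "paths" object; names are illustrative.
BASE_URL=${BASE_URL:-http://127.0.0.1:3000}

# Print one path per line, sorted by jq's keys function.
list_paths() {
  curl -s "$BASE_URL/openapi.json" | jq -r '.paths | keys[]'
}
```

Running `list_paths | grep '^/v1/'` would then show only the Firecrawl v1 surface.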

Request bodies use camelCase fields that mirror the CLI:

- scrape: `url`, `formats`, `onlyMainContent`, `includeTags`/`excludeTags`, `timeout`
- crawl: `url`, `maxDepth`, `maxPages`, `includePaths`/`excludePaths`
- map: `url`, `limit`, `search`

Responses wrap their payload as `{ success, data | error }`. Async jobs return `{ success, id }` and are then polled for `{ status, total, completed, data, error }`.
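The request and response conventions above can be wrapped in two small helpers. This is a sketch, not crawlberg tooling: `BASE_URL`, `api_post`, and `unwrap` are hypothetical names, `jq` is assumed to be installed, and the server is assumed to be the one from the quick recipe.

```shell
#!/bin/sh
# Sketch: post a JSON body and unwrap the { success, data | error } envelope.
# BASE_URL, api_post and unwrap are illustrative names, not part of crawlberg.
BASE_URL=${BASE_URL:-http://127.0.0.1:3000}

# POST the JSON body $2 to path $1 and print the raw response envelope.
api_post() {
  curl -s "$BASE_URL$1" -H 'content-type: application/json' -d "$2"
}

# Read an envelope on stdin. Print .data when .success is true;
# otherwise print .error to stderr and return 1.
unwrap() {
  envelope=$(cat)
  if [ "$(printf '%s' "$envelope" | jq -r '.success')" = true ]; then
    printf '%s' "$envelope" | jq '.data'
  else
    printf '%s' "$envelope" | jq -r '.error' >&2
    return 1
  fi
}
```

For example, `api_post /v1/map '{"url":"https://example.com","limit":500}' | unwrap` prints only the mapped data and exits non-zero when the server reports an error, which makes the call safe to use in `&&` chains.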

Examples

Scrape via HTTP

curl -s http://127.0.0.1:3000/v1/scrape \
  -H 'content-type: application/json' \
  -d '{"url":"https://example.com","formats":["markdown"]}' | jq '.data'

Start a crawl job and poll it

id=$(curl -s http://127.0.0.1:3000/v1/crawl \
  -H 'content-type: application/json' \
  -d '{"url":"https://example.com","maxDepth":2,"maxPages":100}' | jq -r '.id')

curl -s "http://127.0.0.1:3000/v1/crawl/$id" | jq '{status,completed,total}'

Map URLs

curl -s http://127.0.0.1:3000/v1/map \
  -H 'content-type: application/json' \
  -d '{"url":"https://example.com","limit":500,"search":"docs"}' | jq '.data'

Operational notes

- The server accepts CORS from any origin and ships no built-in authentication, so do not expose it on a public interface. Bind to `127.0.0.1` for local use, or put it behind a reverse proxy or gateway that enforces auth and rate limiting before exposing it more widely.
- Request bodies are capped at 10 MB, and each request times out after 5 minutes.
- Crawl and batch-scrape are asynchronous: the `POST` returns a job `id` immediately, and you poll the matching `{id}` endpoint until `status` is `completed`, `failed`, or `cancelled`.
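The polling rule in the last note can be sketched as a loop that stops on any of the three terminal statuses. `BASE_URL`, `crawl_status`, `is_terminal`, and `wait_for_crawl` are names invented for this example, the interval and poll limit are arbitrary defaults, and `jq` is assumed to be installed.

```shell
#!/bin/sh
# Sketch: poll a crawl job until it reaches a terminal status.
# All function names and defaults here are illustrative, not part of crawlberg.
BASE_URL=${BASE_URL:-http://127.0.0.1:3000}

# Print the current status string of crawl job $1.
crawl_status() {
  curl -s "$BASE_URL/v1/crawl/$1" | jq -r '.status'
}

# Succeed when $1 is one of the terminal statuses: completed, failed, cancelled.
is_terminal() {
  case $1 in
    completed|failed|cancelled) return 0 ;;
    *) return 1 ;;
  esac
}

# Poll job $1 every $2 seconds (default 2), at most $3 times (default 150).
# Prints the final status (or "timeout") and returns 0 only for "completed".
wait_for_crawl() (
  id=$1
  interval=${2:-2}
  max_polls=${3:-150}
  n=0
  while [ "$n" -lt "$max_polls" ]; do
    status=$(crawl_status "$id")
    if is_terminal "$status"; then
      printf '%s\n' "$status"
      if [ "$status" = completed ]; then exit 0; else exit 1; fi
    fi
    n=$((n + 1))
    sleep "$interval"
  done
  printf 'timeout\n'
  exit 1
)
```

With the job id captured as in the crawl example, `wait_for_crawl "$id" && curl -s "$BASE_URL/v1/crawl/$id" | jq '.data'` fetches the results only once the job has completed. The same loop would apply to batch-scrape jobs by pointing `crawl_status` at `/v1/batch/scrape/{id}`.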

When to run serve vs. CLI vs. MCP

- CLI (`scrape`/`crawl`/`map`/`interact`): one-shot jobs, shell pipelines, scripting. No long-running process.
- MCP (`crawlberg mcp`): driving the crawler from an AI agent harness (Claude Code, Codex, Cursor, Gemini, opencode). Auto-registered by this plugin; prefer it inside an agent.
- serve (`crawlberg serve`): a persistent HTTP service for other applications or non-agent clients that speak REST, or when you want Firecrawl-v1 compatibility. Choose it when many callers need the crawler over the network rather than one agent or one shell.
