nimble-crawl-reference

Reference for the nimble crawl command. Load when bulk-crawling many pages asynchronously. Contains: async workflow (create → status → tasks results), all flags, polling guidelines, the CRITICAL rule to use task_id (not crawl_id) for results, and a crawl vs map comparison.


Updated: March 30, 2026 at 09:38

Source: Nimbleway/agent-skills (GitHub repository)


nimble crawl — reference

Async bulk crawling: nimble crawl starts a crawl job and returns a crawl_id. You then poll for status and retrieve per-page content via the individual task_ids.

Tip: For LLM use, prefer nimble map + nimble extract --format markdown (faster, cleaner). Use crawl when you need raw HTML archives of many pages at once.

1. Create crawl (async)
2. Get crawl status
3. List crawls
4. Cancel crawl
Polling guidelines
Retrieving page content
Crawl vs Map

1. Create crawl (async)

Parameters:

| Parameter | CLI flag | Type | Default | Description |
|---|---|---|---|---|
| `url` | `--url` | string | required | Starting URL |
| `limit` | `--limit` | int | `5000` | Max pages (1–10,000); always set this |
| `name` | `--name` | string | - | Label for tracking |
| `sitemap` | `--sitemap` | string | `include` | `include` \| `only` \| `skip` |
| `include_path` | `--include-path` | string[] | - | URL path regex to include (repeatable) |
| `exclude_path` | `--exclude-path` | string[] | - | URL path regex to exclude (repeatable) |
| `max_discovery_depth` | `--max-discovery-depth` | int | `5` | Max link depth from start (1–20) |
| `crawl_entire_domain` | `--crawl-entire-domain` | bool | `false` | Follow sibling/parent paths |
| `allow_subdomains` | `--allow-subdomains` | bool | `false` | Follow links to subdomains |
| `allow_external_links` | `--allow-external-links` | bool | `false` | Follow links to external domains |
| `ignore_query_parameters` | `--ignore-query-parameters` | bool | `false` | Deduplicate query-param variants |
| `callback` | `--callback` | string | - | Webhook URL for real-time page notifications |
| `extract_options` | `--extract-options` | object | - | Extraction config applied to each page |
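
The ranges and allowed values in the table above can be checked locally before a crawl is submitted. The following is a minimal sketch and not part of the Nimble SDK: the `CrawlOptions` dataclass and the `validate` function are hypothetical names introduced here for illustration, and only a subset of the parameters is covered.

```python
from dataclasses import dataclass

# Allowed values for `sitemap`, taken from the parameter table.
SITEMAP_MODES = {"include", "only", "skip"}


@dataclass
class CrawlOptions:
    """Hypothetical container for a few crawl parameters (not an SDK type)."""
    url: str
    limit: int = 5000
    sitemap: str = "include"
    max_discovery_depth: int = 5


def validate(options: CrawlOptions) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not options.url:
        problems.append("url is required")
    if not 1 <= options.limit <= 10_000:
        problems.append("limit must be between 1 and 10,000")
    if options.sitemap not in SITEMAP_MODES:
        problems.append("sitemap must be one of: include, only, skip")
    if not 1 <= options.max_discovery_depth <= 20:
        problems.append("max_discovery_depth must be between 1 and 20")
    return problems


ok = validate(CrawlOptions(url="https://docs.example.com", limit=50))
bad = validate(CrawlOptions(url="", limit=0, sitemap="all", max_discovery_depth=21))
print(ok)        # []
print(len(bad))  # 4
```

Checking locally only catches obvious mistakes early; the service remains the authority on what it accepts.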

CLI:

```bash
nimble crawl run --url "https://docs.example.com" --limit 50 --name "docs-crawl"
```

Python SDK:

```python
import os

from nimble_python import Nimble

# Read the API key from the environment rather than hard-coding it.
nimble = Nimble(api_key=os.environ["NIMBLE_API_KEY"])

resp = nimble.crawl.run(url="https://docs.example.com", limit=50, name="docs-crawl")
crawl_id = resp.crawl_id
```

Response fields: crawl_id, status (initial: queued), url, name, crawl_options, created_at

Crawl status values: queued → running → succeeded / failed / canceled

2. Get crawl status

Parameters:

| Parameter | CLI flag | Type | Description |
|---|---|---|---|
| `id` | `--id` | string | `crawl_id` (required) |

CLI:

```bash
nimble crawl status --id "abc-123"
```

Python SDK:

```python
crawl = nimble.crawl.status(id=crawl_id)
print(crawl.status, crawl.completed, crawl.total)
```

Response adds: total, pending, completed, failed, tasks[]

| Field | Description |
|---|---|
| `tasks[].task_id` | Use with `nimble tasks results` to fetch page content |
| `tasks[].status` | `pending` / `processing` / `completed` / `failed` |

CRITICAL: Use tasks[].task_id — NOT crawl_id — to fetch page content. Using crawl_id with tasks returns 404.
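
To make the distinction concrete, the sketch below pulls the per-page `task_id` values out of a status response. It assumes the response has been parsed into a plain dict with the `crawl_id` and `tasks[]` fields described above. The function name `completed_task_ids` and all sample IDs are hypothetical.

```python
def completed_task_ids(status_response: dict) -> list[str]:
    """Collect task_ids of completed tasks; these are what results are fetched by."""
    return [
        task["task_id"]
        for task in status_response.get("tasks", [])
        if task.get("status") == "completed"
    ]


# Hypothetical status payload, shaped after the fields listed above.
sample = {
    "crawl_id": "abc-123",  # identifies the crawl job, NOT usable for results
    "status": "running",
    "tasks": [
        {"task_id": "task-456", "status": "completed"},
        {"task_id": "task-457", "status": "processing"},
        {"task_id": "task-789", "status": "completed"},
    ],
}

print(completed_task_ids(sample))  # ['task-456', 'task-789']
```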

3. List crawls

Parameters:

| Parameter | CLI flag | Type | Description |
|---|---|---|---|
| `status` | `--status` | string | Filter: `queued` \| `running` \| `completed` \| `failed` \| `canceled` |
| `limit` | `--limit` | int | Results per page |
| `cursor` | `--cursor` | string | Pagination cursor |

CLI:

```bash
nimble crawl list --status running
```

Python SDK:

```python
result = nimble.crawl.list()
# result.data = list of crawl objects
# result.pagination = { has_next, next_cursor, total }
```
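
The `has_next` / `next_cursor` fields suggest a cursor loop for walking all pages. Here is a minimal sketch of that loop, written against an injected `fetch_page` callable so it runs without network access. `iter_crawls`, `fetch_page`, and the fake pages are hypothetical; whether the SDK's `crawl.list` accepts a `cursor` keyword is an assumption based on the parameter table, not something confirmed here.

```python
from typing import Callable, Iterator, Optional


def iter_crawls(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield crawl objects from every page, following next_cursor until has_next is false."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["data"]
        pagination = page["pagination"]
        if not pagination.get("has_next"):
            break
        cursor = pagination["next_cursor"]


# Fake two-page listing standing in for the real API (hypothetical data).
_PAGES = {
    None: {
        "data": [{"crawl_id": "a"}, {"crawl_id": "b"}],
        "pagination": {"has_next": True, "next_cursor": "p2", "total": 3},
    },
    "p2": {
        "data": [{"crawl_id": "c"}],
        "pagination": {"has_next": False, "next_cursor": None, "total": 3},
    },
}


def fake_fetch_page(cursor: Optional[str]) -> dict:
    return _PAGES[cursor]


crawl_ids = [crawl["crawl_id"] for crawl in iter_crawls(fake_fetch_page)]
print(crawl_ids)  # ['a', 'b', 'c']
```

In real use, `fetch_page` would wrap whatever call returns one page of results for a given cursor.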

4. Cancel crawl

Parameters:

| Parameter | CLI flag | Type | Description |
|---|---|---|---|
| `id` | `--id` | string | `crawl_id` (required) |

CLI:

```bash
nimble crawl terminate --id "abc-123"
```

Python SDK:

```python
nimble.crawl.terminate(id=crawl_id)
# Returns: {"status": "canceled"}
```

Polling guidelines

```bash
# Start the crawl and capture its crawl_id from the JSON response.
CRAWL_ID=$(nimble crawl run --url "https://example.com" --limit 100 --name "my-crawl" \
  | python3 -c "import json,sys; print(json.load(sys.stdin)['crawl_id'])")

# Poll until the crawl reaches a terminal status.
while true; do
  STATUS=$(nimble crawl status --id "$CRAWL_ID" \
    | python3 -c "import json,sys; print(json.load(sys.stdin).get('status',''))")
  echo "Status: $STATUS"
  case "$STATUS" in
    succeeded|failed|canceled) break ;;
  esac
  sleep 30
done
```

| Crawl size | Poll interval |
|---|---|
| < 50 pages | every 15–30s |
| 50–500 pages | every 30–60s |
| 500+ pages | every 60–120s |
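
The same loop can be sketched in Python with the interval chosen from the table and a timeout added, so a stuck crawl does not poll forever. This is an illustration under assumptions: `poll_interval_seconds` and `wait_for_crawl` are hypothetical helpers, the lower bound of each range is used, a crawl of exactly 500 pages is treated as "500+", and `get_status` is any callable returning the current status string.

```python
import time
from typing import Callable

# Terminal crawl statuses, as listed under "Crawl status values".
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def poll_interval_seconds(page_limit: int) -> int:
    """Pick the lower bound of the suggested polling range for a crawl size."""
    if page_limit < 50:
        return 15
    if page_limit < 500:
        return 30
    return 60


def wait_for_crawl(
    get_status: Callable[[], str],
    page_limit: int,
    timeout_seconds: float = 3600,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until a terminal status is seen or the timeout would be exceeded."""
    interval = poll_interval_seconds(page_limit)
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"crawl still {status!r} after {timeout_seconds}s")
        sleep(interval)


# Demo with canned statuses and a recording sleep, so nothing actually waits.
statuses = iter(["queued", "running", "running", "succeeded"])
slept = []
result = wait_for_crawl(lambda: next(statuses), page_limit=100, sleep=slept.append)
print(result, slept)  # succeeded [30, 30, 30]
```

Injecting `sleep` keeps the loop testable; in real use the default `time.sleep` applies and `get_status` would wrap the status call shown earlier.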

Retrieving page content

After the crawl has succeeded, fetch each completed task by its task_id (see the nimble-tasks reference):

```bash
nimble tasks results --task-id "task-456"
```
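
When many pages completed, the per-task command can be generated rather than typed. The sketch below only builds the argument lists for the CLI call shown above and does not execute anything. `results_command` and the task IDs are hypothetical names for illustration.

```python
import shlex


def results_command(task_id: str) -> list[str]:
    """Build the argv for `nimble tasks results` for a single task."""
    if not task_id:
        raise ValueError("task_id must be a non-empty string")
    return ["nimble", "tasks", "results", "--task-id", task_id]


task_ids = ["task-456", "task-789"]  # hypothetical IDs from tasks[].task_id
commands = [results_command(task_id) for task_id in task_ids]

for command in commands:
    print(shlex.join(command))
# nimble tasks results --task-id task-456
# nimble tasks results --task-id task-789
```

Each list could be passed to `subprocess.run(command, capture_output=True, text=True, check=True)` to run the CLI and capture its output.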

Crawl vs Map

| | `map` | `crawl` |
|---|---|---|
| Speed | Seconds (sync) | Minutes (async) |
| Output | URL list with metadata | Full page content per URL |
| Best for | Discover URLs, then selectively extract | Archive all pages at once |
| LLM use | ✅ Combine with `extract --format markdown` | ⚠️ Returns raw HTML |
