site-extract
// Extract operator-grade source facts from a website or manual brief: claims, proof, CTAs, mechanism language, trust cues, visual patterns, and page structure with provenance. First stage of Page Factory.
| name | site-extract |
| description | Extract operator-grade source facts from a website or manual brief: claims, proof, CTAs, mechanism language, trust cues, visual patterns, and page structure with provenance. First stage of Page Factory. |
| metadata | {"openclaw":{"emoji":"🔍","user-invocable":true,"requires":{"env":["FIRECRAWL_API_KEY"]}}} |
Use this when a user gives a site URL and wants a landing page built without a bloated intake process.
This stage does not synthesize. It extracts.
If you cannot trace a claim, proof point, or trust cue back to the source, label it missing or inferred. Do not quietly upgrade it into fact.
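The labeling rule above can be sketched as a small check. This is a minimal illustration, not part of the skill itself: the function name `label_provenance` and the whitespace-insensitive matching are assumptions, and real extraction would also need to record the source URL and section.

```python
import re


def _normalize(text: str) -> str:
    """Collapse whitespace and lowercase so line wraps do not break matching."""
    return re.sub(r"\s+", " ", text).strip().lower()


def label_provenance(candidate: str, source_text: str) -> str:
    """Label a candidate fact against the scraped source text.

    Hypothetical helper: returns 'observed' only when the candidate
    appears verbatim (ignoring case and whitespace) in the source,
    'missing' when there is nothing to check, and 'inferred' otherwise.
    """
    if not candidate.strip():
        return "missing"
    if _normalize(candidate) in _normalize(source_text):
        return "observed"
    # Not traceable to the source: never upgrade it to fact.
    return "inferred"
```

Anything that comes back as `inferred` or `missing` belongs in the "Missing / weak areas" section rather than in the claim or proof tables as fact.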
--deep
Build a source record that downstream skills can trust.
That means extracting:
Write:
- workspace/brand/extract.md
- workspace/brand/extract.json
- workspace/brand/palette.json

# Brand Extract: [Brand]
Source URLs:
- [url 1]
- [url 2]
Extracted: [date]
Confidence: high | medium | low
## 1. Brand + category
- Brand name:
- Product category:
- Primary offer:
- Business model:
- Geography / market hints:
## 2. Exact hero language
| Element | Exact text | Source URL | Provenance |
|---|---|---|---|
| Headline | | | observed |
| Subheadline | | | observed |
| Primary CTA | | | observed |
| Secondary CTA | | | observed |
## 3. Claim inventory
| Claim | Type | Source URL | Section | Provenance | Proof required? |
|---|---|---|---|---|---|
| | benefit / mechanism / superiority / social proof / guarantee | | | observed / inferred | yes / no |
## 4. Proof inventory
| Proof item | Type | Class | Exact wording / value | Source URL | Section | Notes |
|---|---|---|---|---|---|---|
| | testimonial / stat / review count / logo / press / guarantee | verified_source / derived_source / missing | | | | |
## 5. CTA inventory
| CTA text | Destination guess | Source URL | Above fold? | Notes |
|---|---|---|---|---|
## 6. Mechanism language inventory
- Unique mechanism phrases repeated by the brand
- Product explanation phrases worth preserving
- Failed-alternative language
- Objection-handling language
## 7. Trust cue inventory
- guarantees
- shipping / returns / warranty
- certifications / press / logos
- compliance / safety / authority cues
- category conventions that signal legitimacy
## 8. Page pattern inventory
| Section order | What appears | What it is doing | Worth preserving? |
|---|---|---|---|
## 9. Visual motif inventory
- dominant colors
- font families / classes
- product photography style
- UI / screenshot style if present
- shape language
- density / spacing feel
- trust-critical design conventions
## 10. Audience signals
- who this appears to target
- awareness stage clues
- sophistication clues
- explicit pain language
- explicit desire language
## 11. Missing / weak areas
- missing mechanism details
- unsupported claims
- thin proof
- inaccessible pages
- anything requiring operator decision
extract.json must include machine-readable keys for:
- brand
- category
- offer
- hero_language
- claims[]
- proof_inventory[]
- cta_inventory[]
- mechanism_language[]
- trust_cues[]
- page_patterns[]
- visual_motifs[]
- audience_signals[]
- missing_flags[]
- confidence

Provider selection is mandatory here.
If FIRECRAWL_API_KEY is available, you must use Firecrawl for site extraction.
Do not silently substitute generic browsing, WebFetch, or single-page scraping when Firecrawl is configured or expected.
If Firecrawl is blocked by the runtime, unavailable, or the key is missing:
Only use downgraded extraction without asking when the user has already explicitly approved that downgrade.
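The provider rule can be read as a three-way decision. The sketch below is one possible reading, not the skill's actual logic: the function name, the `firecrawl_reachable` flag, and the return labels are hypothetical, and the `ask_operator` outcome is inferred from the "without asking" wording above.

```python
import os
from typing import Mapping, Optional


def choose_provider(
    env: Optional[Mapping[str, str]] = None,
    firecrawl_reachable: bool = True,
    downgrade_approved: bool = False,
) -> str:
    """Pick an extraction path without ever downgrading silently.

    Returns one of three hypothetical labels:
    'firecrawl', 'downgraded', or 'ask_operator'.
    """
    env = os.environ if env is None else env
    has_key = bool(env.get("FIRECRAWL_API_KEY", "").strip())

    if has_key and firecrawl_reachable:
        # Firecrawl is configured and usable, so it is mandatory.
        return "firecrawl"
    if downgrade_approved:
        # The user has already explicitly approved the downgrade.
        return "downgraded"
    # Key missing or Firecrawl blocked, and no approval on record.
    return "ask_operator"
```

For example, `choose_provider({"FIRECRAWL_API_KEY": "fc-..."}, firecrawl_reachable=False)` yields `ask_operator` rather than falling back to generic browsing.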
When Firecrawl is available, use the v2 API with the branding format:
curl -s -X POST "https://api.firecrawl.dev/v2/scrape" \
  -H "Authorization: Bearer $FIRECRAWL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "formats": ["branding", "markdown", "screenshot"],
    "onlyMainContent": false,
    "maxAge": 172800000
  }'
The branding format returns structured brand identity:
The markdown format provides page content for claim and proof extraction.
The screenshot format provides a visual reference (URL expires after 24h).
Use scripts/firecrawl-extract.sh to run this automatically and save all outputs.
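For callers working in Python rather than shell, the same request can be assembled with the standard library. This is a sketch that mirrors the curl call above and is not the contents of scripts/firecrawl-extract.sh; the function name and the `max_age_hours` parameter are hypothetical. It also makes the `maxAge` value explicit: 172800000 ms is 48 hours.

```python
import json
import urllib.request

FIRECRAWL_SCRAPE_URL = "https://api.firecrawl.dev/v2/scrape"


def build_scrape_request(
    url: str, api_key: str, max_age_hours: int = 48
) -> urllib.request.Request:
    """Build (but do not send) the scrape request shown in the curl example."""
    payload = {
        "url": url,
        "formats": ["branding", "markdown", "screenshot"],
        "onlyMainContent": False,
        # maxAge is given in milliseconds: 48 h * 3600 s * 1000 ms = 172800000.
        "maxAge": max_age_hours * 60 * 60 * 1000,
    }
    return urllib.request.Request(
        FIRECRAWL_SCRAPE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it is a matter of passing the result to `urllib.request.urlopen`. Because the screenshot URL expires after 24h, any image worth keeping should be downloaded in the same run.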
When --deep is used:
Default deep priority:
If Firecrawl or scraping fails:
If Firecrawl is blocked by Cowork or another sandbox proxy:
- api.firecrawl.dev must be allowed by the runtime

Continue if:
Downgrade if:
Stop and raise operator if:
After extraction, run /page-strategy.
Do not jump straight to copy unless the strategy artifact already exists.
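Before handing off, it can help to confirm that extract.json carries every key the contract lists. The check below is a minimal sketch under stated assumptions: the key names come from the extract.json key list, keys written with `[]` are treated as lists, and the function name `check_extract` is hypothetical. It validates shape only, not content quality.

```python
from typing import Any, Dict, List

# Keys the extract.json contract lists; those marked [] must be lists.
LIST_KEYS = (
    "claims", "proof_inventory", "cta_inventory", "mechanism_language",
    "trust_cues", "page_patterns", "visual_motifs", "audience_signals",
    "missing_flags",
)
SCALAR_KEYS = ("brand", "category", "offer", "hero_language", "confidence")
REQUIRED_KEYS = SCALAR_KEYS + LIST_KEYS


def check_extract(data: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the shape is acceptable."""
    problems: List[str] = []
    for key in REQUIRED_KEYS:
        if key not in data:
            problems.append(f"missing key: {key}")
        elif key in LIST_KEYS and not isinstance(data[key], list):
            problems.append(f"not a list: {key}")
    return problems
```

A typical use would be to load workspace/brand/extract.json with `json.load`, pass the result to `check_extract`, and hold off on /page-strategy while the returned list is non-empty.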