clean-content-fetch

Fetches clean, readable body content from web pages. Suited to modern web pages, blogs, news, announcements, and WeChat Official Account articles. It supports main-content extraction, content cleanup, noise reduction, and Markdown output, and is intended for cases where an ordinary fetch gives poor results, the page is noisy, or dynamic rendering interferes.

Overview

Install command

npx skills add https://github.com/LeoYeAI/openclaw-master-skills --skill clean-content-fetch

Copy and paste this command into Claude Code to install the skill

Source

LeoYeAI/openclaw-master-skills

Stars: 2,010

Forks: 306

Updated: March 9, 2026 at 02:05

SKILL.md

name

clean-content-fetch

description

Scrapling Web Fetch

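Prefer this skill when the user wants to fetch web page content, extract the main body, convert a web page to markdown/text, or capture the body of an article.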

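Default workflow

Run python3 scripts/scrapling_fetch.py <url> <max_chars>
Default body selector priority:
- article
- main
- .post-content
- [class*="body"]
Once a body selector matches, convert the content to Markdown with html2text
If none of them match, fall back to body
Finally, truncate the output to max_chars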
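
The selector priority and fallback described above can be sketched in a few lines. This is a minimal illustration, not the code in scrapling_fetch.py: `pick_content`, `truncate`, and the `select` callable are hypothetical names, the page is faked with a dictionary, and the html2text conversion step is left out so the sketch runs on the standard library alone.

```python
from typing import Callable, Optional, Sequence, Tuple

# Selector priority as listed in the default workflow; "body" is the fallback.
SELECTORS: Tuple[str, ...] = ("article", "main", ".post-content", '[class*="body"]')


def pick_content(
    select: Callable[[str], Optional[str]],
    selectors: Sequence[str] = SELECTORS,
    fallback: str = "body",
) -> Tuple[str, str]:
    """Return (matched_selector, html) for the first selector that yields content."""
    for selector in selectors:
        html = select(selector)
        if html:  # skip selectors that match nothing or only empty content
            return selector, html
    # Nothing matched: fall back to the whole body (empty string if absent).
    return fallback, select(fallback) or ""


def truncate(text: str, max_chars: int) -> str:
    """Cut the final output to at most max_chars characters."""
    return text[:max_chars]


# Hypothetical stand-in for a parsed page: selector -> matched HTML.
page = {"main": "<p>Hello, world</p>", "body": "<nav>menu</nav><p>Hello, world</p>"}
selector, html = pick_content(page.get)
print(selector, truncate(html, 10))
```

In a real run, `select` would wrap whatever CSS query the fetch library exposes, and the matched HTML would pass through html2text before truncation.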

Usage

python3 /Users/zzd/.openclaw/workspace/skills/scrapling-web-fetch/scripts/scrapling_fetch.py <url> 30000

Dependencies

Check these first:

scrapling
html2text
curl_cffi
playwright
browserforge
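
One way to check the packages listed above before running the script is to ask the interpreter whether each module can be located. The sketch below is a hypothetical helper (`missing_modules` is not part of the skill) and assumes each package's import name matches the name in the list.

```python
import importlib.util
from typing import Iterable, List

# Assumed import names for the packages in the dependency list.
REQUIRED = ["scrapling", "html2text", "curl_cffi", "playwright", "browserforge"]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the names that the current interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All dependencies found.")
```

Run it with the same Python interpreter that will run the fetch script, since each environment has its own set of installed packages.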

A dedicated virtual environment is recommended, to avoid the PEP 668 restrictions on the system Python:

python3 -m venv /Users/zzd/.openclaw/workspace/.venvs/clean-content-fetch
/Users/zzd/.openclaw/workspace/.venvs/clean-content-fetch/bin/pip install scrapling html2text curl_cffi playwright browserforge
/Users/zzd/.openclaw/workspace/.venvs/clean-content-fetch/bin/python -m playwright install chromium

When running the script directly, prefer the Python interpreter in that virtual environment:

/Users/zzd/.openclaw/workspace/.venvs/clean-content-fetch/bin/python /Users/zzd/.openclaw/workspace/skills/scrapling-web-fetch/scripts/scrapling_fetch.py <url> 30000
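
When the script is called from other Python code, the same command can be assembled as an argument list. This is a minimal sketch: `build_fetch_command` is a hypothetical helper, and the interpreter and script paths are placeholders to be replaced with the real locations.

```python
from typing import List


def build_fetch_command(python_bin: str, script: str, url: str,
                        max_chars: int = 30000, as_json: bool = False) -> List[str]:
    """Build the argument list for one scrapling_fetch.py call."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    cmd = [python_bin, script, url, str(max_chars)]
    if as_json:
        cmd.append("--json")  # the skill's flag for structured output
    return cmd


# Placeholder paths, for illustration only.
cmd = build_fetch_command("venv/bin/python", "scripts/scrapling_fetch.py",
                          "https://example.com/post?id=1&lang=en")
print(cmd)
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting problems with URLs that contain characters such as `&` or `?`.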

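Output conventions

By default the script outputs the body content as Markdown. For structured output, append --json. To see which selector the extraction matched, check the stderr output.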
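
Because the content goes to stdout and the selector diagnostics go to stderr, a caller can capture the two streams separately. The sketch below is a hypothetical wrapper (`run_and_split` is not part of the skill). It uses a stand-in child process so it runs without the real script, and it makes no assumption about which fields the --json output contains.

```python
import json
import subprocess
import sys
from typing import Any, List, Tuple


def run_and_split(cmd: List[str], as_json: bool = False) -> Tuple[Any, str]:
    """Run cmd and return (stdout content, stderr text), kept separate."""
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    content = json.loads(proc.stdout) if as_json else proc.stdout
    return content, proc.stderr


# Stand-in child process that prints JSON to stdout and a debug line to stderr.
fake = [sys.executable, "-c",
        "import sys, json; print(json.dumps({'ok': True})); "
        "print('debug line', file=sys.stderr)"]
data, debug = run_and_split(fake, as_json=True)
print(data, debug.strip())
```

For the real script, `cmd` would be the interpreter, the script path, the URL, and the character limit, with `--json` appended when `as_json` is true.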

Additional resources

Usage reference: /Users/zzd/.openclaw/workspace/skills/scrapling-web-fetch/references/usage.md
Selector strategy: /Users/zzd/.openclaw/workspace/skills/scrapling-web-fetch/references/selectors.md
Unified entry point: /Users/zzd/.openclaw/workspace/skills/scrapling-web-fetch/scripts/fetch-web-content

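When to use this skill

Fetching the body of an article
Capturing the body of blog posts, news, or announcements
Converting a web page to Markdown for later summarization
An ordinary fetch gives poor results and you want more reliable capture of modern web pages
Capturing the body of a Xiaohongshu share short link or note landing page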

Capturing Xiaohongshu pages

For xhslink.com short links or Xiaohongshu note pages, run the script directly with the virtual environment's Python:

/Users/zzd/.openclaw/workspace/.venvs/clean-content-fetch/bin/python /Users/zzd/.openclaw/workspace/skills/scrapling-web-fetch/scripts/scrapling_fetch.py 'http://xhslink.com/o/9745hugimlD' 30000

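Notes:

The script first resolves the short link, then fetches the body of the landing page
Suited to extracting a Xiaohongshu note's copy, title, and main content
If the page needs more complex interaction, switch to browser automation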

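When not to use it

When full browser interaction, clicking, login, or pagination is needed: use browser automation instead
When you only need JSON from an API: requesting the API directly is more appropriate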
