defuddle
Strip clutter from web pages before ingesting. Removes ads, navigation, and boilerplate, leaving clean markdown that saves 40-60% of tokens.
| name | defuddle |
| description | Strip ads, nav, boilerplate from web pages. Saves 40-60% tokens. Use before URL ingestion. |
| allowed-tools | Read Bash |
Extracts the meaningful content from web pages by dropping ads, navigation, cookie banners, footers, and related-article links. Optional but recommended: it saves 40-60% of tokens and produces cleaner wiki pages.
npm install -g defuddle-cli
Verify: defuddle --version
defuddle https://example.com/article
Outputs clean markdown to stdout.
defuddle https://example.com/article > .raw/articles/article-slug-$(date +%Y-%m-%d).md
To record the source URL and fetch date, write them as front matter ahead of defuddle's output:
SLUG="article-slug-$(date +%Y-%m-%d)"
{ echo "---"; echo "source_url: https://example.com/article"; echo "fetched: $(date +%Y-%m-%d)"; echo "---"; echo ""; defuddle https://example.com/article; } > ".raw/articles/$SLUG.md"
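The save-with-metadata step above can be wrapped in a small shell function so the URL is written only once. This is a minimal sketch, not part of defuddle itself: the function name `save_article` and its slug argument are hypothetical, and it assumes `defuddle <url>` prints markdown to stdout as described above.

```shell
# Hypothetical helper: save_article <url> <slug>
# Writes front matter (source_url, fetched) followed by defuddle's output
# to .raw/articles/<slug>-<YYYY-MM-DD>.md and prints the resulting path.
save_article() {
  url="$1"
  slug="$2"
  today="$(date +%Y-%m-%d)"
  out=".raw/articles/${slug}-${today}.md"

  # Create the target directory if it does not exist yet.
  mkdir -p .raw/articles || return 1

  # Run the extraction first so a failed fetch does not leave
  # behind a file that contains only front matter.
  body="$(defuddle "$url")" || return 1

  {
    printf '%s\n' '---'
    printf 'source_url: %s\n' "$url"
    printf 'fetched: %s\n' "$today"
    printf '%s\n' '---'
    printf '\n'
    printf '%s\n' "$body"
  } > "$out"

  printf '%s\n' "$out"
}

# Usage (assumes defuddle is installed):
#   save_article "https://example.com/article" "article-slug"
```

Capturing the output in a variable before writing means a failed extraction returns early without creating a file.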
defuddle page.html
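Use for: articles, blog posts, and documentation fetched from URLs that carry surrounding clutter, and long articles when the token budget is tight. Skip for: content that is already clean markdown or PDF; dashboards, web apps, and structured data; short articles when defuddle is not installed.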
If defuddle is not installed, use WebFetch directly. The content is less clean but workable.
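A minimal sketch of that fallback decision, assuming only that the CLI is on `PATH` under the name `defuddle`. The function name `fetch_clean` is hypothetical. WebFetch is an agent tool rather than a shell command, so the sketch signals the fallback with a message on stderr and a non-zero exit status instead of calling it.

```shell
# Hypothetical helper: fetch_clean <url>
# Prints cleaned markdown if defuddle is installed; otherwise reports that
# the caller should fall back to WebFetch and returns exit status 2.
fetch_clean() {
  if command -v defuddle >/dev/null 2>&1; then
    defuddle "$1"
  else
    printf '%s\n' "defuddle not installed; fall back to WebFetch for $1" >&2
    return 2
  fi
}

# Usage:
#   fetch_clean "https://example.com/article" || echo "use WebFetch instead"
```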
When given a URL, /ingest calls defuddle automatically if it is available. Manual path: run the save command above, then ingest .raw/articles/[slug].md.