| name | fetching-and-converting-urls |
| description | Use when fetching a live URL and converting it to Markdown. Covers --url, custom user agents, preprocessing for noisy pages, and the --json ConversionResult shape. |
Fetching and converting URLs
Use this when the user gives a URL instead of HTML and wants the page as
Markdown (or its metadata/tables). The CLI fetches the page over HTTP and
converts it in one step via --url. --url conflicts with a positional FILE.
For crawling many pages or following links, use crawlberg instead — this
skill is for a single URL.
Fetch and convert
html-to-markdown --url https://example.com
html-to-markdown --url https://example.com -o page.md
html-to-markdown --url https://example.com --user-agent "MyBot/1.0"
--user-agent requires --url.
Clean noisy pages
Real web pages carry navigation, ads, cookie banners, and forms. Preprocess
before converting:
html-to-markdown --url https://example.com/article --preprocess --preset aggressive
html-to-markdown --url https://example.com --preprocess --keep-navigation
Presets: minimal, standard (default), aggressive. --preset and the
--keep-* flags require --preprocess.
JSON output (ConversionResult)
Add --json to get the full structured result instead of plain Markdown:
html-to-markdown --url https://example.com --json
{
"content": "# Title\n\nContent\n",
"metadata": {
"document": { "title": "...", "language": "en" },
"headers": [],
"links": [],
"images": [],
"structured_data": []
},
"tables": [],
"images": [],
"warnings": []
}
Useful combinations (all require --json):
html-to-markdown --url https://example.com --json --no-content \
| jq '{title: .metadata.document.title, headings: [.metadata.headers[].text]}'
html-to-markdown --url https://example.com --json --include-structure | jq '.document'
html-to-markdown --url https://example.com --json --extract-inline-images | jq '.images | length'
Surface warnings
html-to-markdown --url https://example.com --show-warnings > page.md
Exit codes
| Code | Meaning |
|---|
| 0 | Success |
| 1 | Conversion or I/O error (including a failed fetch) |
| 2 | Invalid arguments |
See ../html-to-markdown/references/cli-reference.md for the full flag set and
JSON shape.