| name | extracting-tables |
| description | Use when extracting tabular data from HTML. Covers GFM Markdown tables, the structured tables array (grid cells plus pre-rendered markdown), and <br> handling in table cells. |
Extracting tables
Use this when the user wants tabular data out of HTML — pricing tables, data
grids, spec sheets. html-to-markdown parses <table> elements into two
surfaces at once: inline GFM Markdown tables in the content stream, and a
structured tables array in the JSON output.
Two surfaces
html-to-markdown input.html
html-to-markdown --json input.html | jq '.tables'
- Markdown tables in
content — | col | col | blocks, good for LLM
ingestion and human reading.
- Structured
tables array — each entry has a markdown field
(pre-rendered) and a grid of structured cells (rows × cols). Use this when
downstream code needs exact cell access.
Both are populated from the same parse; you do not need a flag to enable table
parsing.
Extraction-only
When you only care about tables (not the Markdown body):
html-to-markdown --json --no-content input.html | jq '.tables'
Inspecting tables
html-to-markdown --json input.html \
| jq '.tables | to_entries | map({index: .key, rows: (.value.grid | length)})'
html-to-markdown --json input.html | jq -r '.tables[0].markdown'
Line breaks in cells
By default <br> inside a cell is converted to a space. Keep hard breaks:
html-to-markdown input.html --br-in-tables
Programmatic access
from html_to_markdown import convert
result = convert(html)
for table in result.tables:
print(table.markdown)
print(table.grid[0][0])
import { convert } from "@kreuzberg/html-to-markdown";
const result = JSON.parse(convert(html));
for (const table of result.tables) {
console.log(table.markdown);
}
Notes and limits
- Nested tables are flattened — the inner table is rendered inline within
the parent cell.
- Merged / spanning cells are reconstructed positionally; the span itself
is not preserved as separate metadata.
- Malformed tables still parse on a best-effort basis; check
result.warnings
(or --show-warnings) for non-fatal issues.
See ../html-to-markdown/references/cli-reference.md (Tables section) for the
full flag set.