data-describe
// Generate AI-powered Data Dictionary, Description, and Tags for a CSV/TSV/Excel file
| Field | Value |
|---|---|
| name | data-describe |
| description | Generate AI-powered Data Dictionary, Description, and Tags for a CSV/TSV/Excel file |
| user-invocable | true |
| argument-hint | `<file> [--dictionary\|--description\|--tags\|--all]` |
| allowed-tools | `["mcp__qsv__qsv_sniff","mcp__qsv__qsv_count","mcp__qsv__qsv_headers","mcp__qsv__qsv_index","mcp__qsv__qsv_stats","mcp__qsv__qsv_describegpt","mcp__qsv__qsv_list_files","mcp__qsv__qsv_get_working_dir","mcp__qsv__qsv_set_working_dir"]` |
Generate AI-powered documentation for a tabular data file using describegpt. Produces a Data Dictionary (column labels, descriptions, types), a natural-language Description of the dataset, and semantic Tags — all via the connected LLM (no API key needed in MCP mode).
Cowork note: If relative paths don't resolve, call `mcp__qsv__qsv_get_working_dir` and `mcp__qsv__qsv_set_working_dir` to sync the working directory.
1. Index: Run `mcp__qsv__qsv_index` on the file for fast random access.
2. Profile: Run `mcp__qsv__qsv_stats` with `cardinality: true, stats_jsonl: true` to generate the stats cache. describegpt reads this cache for column metadata, so it must exist first.
3. Describe: Run `mcp__qsv__qsv_describegpt` with the requested options (`all: true` is recommended for comprehensive output). At least one inference option (`dictionary`, `description`, `tags`, or `all`) is required. Output defaults to `<filestem>.describegpt.md`.
4. Present: Display the generated Data Dictionary table, Description, and Tags to the user.
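
The Index, Profile, and Describe steps above can be sketched as an ordered list of tool calls. This is only an illustration of the ordering and of the "at least one inference option" rule. It does not talk to an MCP server, and the `input_file` parameter name and the `build_describe_plan` helper are assumptions made for this example, not part of the skill definition.

```python
from typing import Any

# Inference options named in the workflow; at least one must be set.
INFERENCE_OPTIONS = ("dictionary", "description", "tags", "all")


def build_describe_plan(input_file: str, **options: Any) -> list[dict[str, Any]]:
    """Return the ordered tool calls for the data-describe workflow (hypothetical helper)."""
    if not any(options.get(name) for name in INFERENCE_OPTIONS):
        raise ValueError("set at least one of: " + ", ".join(INFERENCE_OPTIONS))
    # NOTE: "input_file" is an assumed parameter name, not confirmed by the skill text.
    return [
        # 1. Index the file for fast random access.
        {"tool": "mcp__qsv__qsv_index", "args": {"input_file": input_file}},
        # 2. Build the stats cache that describegpt reads for column metadata.
        {
            "tool": "mcp__qsv__qsv_stats",
            "args": {"input_file": input_file, "cardinality": True, "stats_jsonl": True},
        },
        # 3. Generate the documentation with the requested options.
        {"tool": "mcp__qsv__qsv_describegpt", "args": {"input_file": input_file, **options}},
    ]


plan = build_describe_plan("sales.csv", all=True)
for step in plan:
    print(step["tool"], step["args"])
```

Keeping the plan as plain data makes the ordering constraint explicit: the stats call always precedes the describegpt call, so the cache exists before it is read.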
| Option | Effect |
|---|---|
| `--all` (recommended) | Generate Dictionary + Description + Tags in one pass |
| `--dictionary` | Data Dictionary only: column labels, descriptions, types |
| `--description` | Natural-language dataset Description only |
| `--tags` | Semantic Tags only |
| `--format` | Output format: Markdown (default), JSON, TSV, TOON |
| `--language` | Generate output in a non-English language (e.g. Spanish, French) |
| `--addl-cols-list` | Enrich the dictionary with extra columns (e.g. "everything", "moar!") |
| `--tag-vocab` | Constrain tags to a controlled vocabulary (comma-separated) |
| `--num-tags` | Number of tags to generate (default: 5) |
| `--num-examples` | Number of example values per column in the dictionary |
| `--enum-threshold` | Max cardinality to treat a column as an enum in the dictionary |
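
As a rough illustration of how the options in the table combine, the sketch below turns keyword options into an argument list. It assumes the flags are passed to a `qsv describegpt` command line exactly as spelled in the table. The `describegpt_args` helper is hypothetical, and the code only builds the list without running anything.

```python
# Flags from the options table, split by whether they take a value.
BOOLEAN_FLAGS = {"all", "dictionary", "description", "tags"}
VALUE_FLAGS = {
    "format", "language", "addl_cols_list", "tag_vocab",
    "num_tags", "num_examples", "enum_threshold",
}


def describegpt_args(input_file: str, **options) -> list[str]:
    """Build an argument list from keyword options (hypothetical helper)."""
    args = ["qsv", "describegpt"]
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")  # num_tags -> --num-tags
        if name in BOOLEAN_FLAGS:
            if value:
                args.append(flag)  # bare switch, no value
        elif name in VALUE_FLAGS:
            args += [flag, str(value)]
        else:
            raise ValueError(f"unknown option: {name}")
    args.append(input_file)
    return args


print(describegpt_args("sales.csv", all=True, format="JSON", num_tags=8))
```

Rejecting unknown option names early keeps a typo such as `num_tag` from being silently dropped.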
- Output defaults to `<filestem>.describegpt.md`.
- Use `--format JSON` when you need machine-readable output for downstream processing.
- Use `--language` to generate documentation in the user's preferred language.