read-file
// Read any data file (CSV, JSON, Parquet, Avro, Excel, spatial, SQLite) or remote URL (S3, HTTPS). Use when user references a data file, asks "what's in this file", or wants to preview/profile a dataset. Not for source code.
Explore and query data on S3, Cloudflare R2, GCS, MinIO, or any S3-compatible storage. Use when the user mentions an s3://, r2://, gs://, or gcs:// URL, asks "what's in this bucket", wants to list remote files, preview remote Parquet/CSV/JSON, or query data on object storage without downloading it. Also triggers when the user wants to know the size, schema, or row count of remote datasets.
Answer questions about spatial data using DuckDB. Use when the user mentions locations, coordinates, lat/lng, distances, maps, addresses, "near", "within", "closest", geographic names, or spatial file formats (GeoJSON, Shapefile, GeoPackage, GPX, GeoParquet). Also triggers when the user wants to find places, buildings, or roads — Overture Maps provides free global data on S3 with zero API keys. Handles spatial joins, distance calculations, containment checks, density analysis, and format conversions for geographic data.
Convert any data file to another format: CSV, Parquet, JSON, Excel, GeoJSON, and more. Use when the user says "convert to parquet", "save as xlsx", "export as JSON", "make this a CSV", "turn into parquet", or any variation of format-to-format conversion for data files. Also triggers when the user wants to write Parquet, Excel, or other binary formats that Claude cannot produce natively.
Search DuckDB and DuckLake documentation and blog posts. Returns relevant doc chunks for a question or keyword using full-text search against a locally cached index.
Search past Claude Code session logs to recall prior decisions, patterns, or unresolved work. Use when user says "do you remember", "what did we do", references past conversations, or you need context from prior sessions.
Install or update DuckDB extensions. Each argument is either a plain extension name (installs from core) or name@repo (e.g. magic@community). Pass --update to update extensions instead of installing.
| name | read-file |
| description | Read any data file (CSV, JSON, Parquet, Avro, Excel, spatial, SQLite) or remote URL (S3, HTTPS). Use when user references a data file, asks "what's in this file", or wants to preview/profile a dataset. Not for source code. |
| argument-hint | <filename or URL> [question about the data] |
| allowed-tools | Bash |
You are helping the user read and analyze a data file using DuckDB.
Filename given: $0
Question: ${1:-describe the data}
Set RESOLVED_PATH to $0. If the user gave a bare filename (one with no /), first resolve it to a full path with find.
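A minimal sketch of the resolution step, assuming a POSIX shell and that searching from the current directory is acceptable. The helper name `resolve_path` and the choice to take the first match are illustrative assumptions, not part of the skill.

```shell
# Hypothetical helper: resolve a bare filename to a full path.
# Anything containing "/" (a path or a remote URL) is passed through unchanged.
resolve_path() {
  case "$1" in
    */*) printf '%s\n' "$1" ;;
    *)   find "$PWD" -type f -name "$1" 2>/dev/null | head -n 1 ;;
  esac
}
```

If several files share the name, only the first one found is returned, so it may be worth listing all matches and asking the user which one they meant.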
Run a single DuckDB command that defines the read_any macro inline and reads the file.
For remote files, prepend the required LOAD and CREATE SECRET statements before the macro definition:
| Protocol | Prepend |
|---|---|
| https:// / http:// | LOAD httpfs; |
| s3:// | LOAD httpfs; CREATE SECRET (TYPE S3, PROVIDER credential_chain); |
| gs:// / gcs:// | LOAD httpfs; CREATE SECRET (TYPE GCS, PROVIDER credential_chain); |
| az:// / azure:// / abfss:// | LOAD httpfs; LOAD azure; CREATE SECRET (TYPE AZURE, PROVIDER credential_chain); |
For local files, no prefix is needed.
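The protocol table can be sketched as a small shell function that prints the statements to prepend. The function name `remote_prefix` is a hypothetical choice; the SQL strings are copied from the table, and local paths produce no output.

```shell
# Hypothetical helper: print the SQL to prepend for a given path or URL.
remote_prefix() {
  case "$1" in
    http://*|https://*)
      printf '%s' "LOAD httpfs;" ;;
    s3://*)
      printf '%s' "LOAD httpfs; CREATE SECRET (TYPE S3, PROVIDER credential_chain);" ;;
    gs://*|gcs://*)
      printf '%s' "LOAD httpfs; CREATE SECRET (TYPE GCS, PROVIDER credential_chain);" ;;
    az://*|azure://*|abfss://*)
      printf '%s' "LOAD httpfs; LOAD azure; CREATE SECRET (TYPE AZURE, PROVIDER credential_chain);" ;;
    *)
      : ;;  # local file: nothing to prepend
  esac
}
```

The output would then be placed at the start of the SQL string passed to `duckdb -c`, ahead of the `CREATE OR REPLACE MACRO` statement.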
duckdb -csv -c "
-- Table macro that picks a reader based on the file extension.
CREATE OR REPLACE MACRO read_any(file_name) AS TABLE
  WITH json_case AS (FROM read_json_auto(file_name))
     , csv_case AS (FROM read_csv(file_name))
     , parquet_case AS (FROM read_parquet(file_name))
     , avro_case AS (FROM read_avro(file_name))
     , blob_case AS (FROM read_blob(file_name))
     , spatial_case AS (FROM st_read(file_name))
     , excel_case AS (FROM read_xlsx(file_name))
     , sqlite_case AS (FROM sqlite_scan(file_name, (SELECT name FROM sqlite_master(file_name) LIMIT 1)))
     , ipynb_case AS (
         -- Notebook: one row per cell, in document order.
         WITH nb AS (FROM read_json_auto(file_name))
         SELECT cell_idx, cell.cell_type,
                array_to_string(cell.source, '') AS source,
                cell.execution_count
         FROM nb, UNNEST(cells) WITH ORDINALITY AS t(cell, cell_idx)
         ORDER BY cell_idx
       )
  FROM query_table(
    CASE
      WHEN file_name ILIKE '%.json' OR file_name ILIKE '%.jsonl' OR file_name ILIKE '%.ndjson' OR file_name ILIKE '%.geojson' OR file_name ILIKE '%.geojsonl' OR file_name ILIKE '%.har' THEN 'json_case'
      WHEN file_name ILIKE '%.csv' OR file_name ILIKE '%.tsv' OR file_name ILIKE '%.tab' OR file_name ILIKE '%.txt' THEN 'csv_case'
      WHEN file_name ILIKE '%.parquet' OR file_name ILIKE '%.pq' THEN 'parquet_case'
      WHEN file_name ILIKE '%.avro' THEN 'avro_case'
      WHEN file_name ILIKE '%.xlsx' OR file_name ILIKE '%.xls' THEN 'excel_case'
      WHEN file_name ILIKE '%.shp' OR file_name ILIKE '%.gpkg' OR file_name ILIKE '%.fgb' OR file_name ILIKE '%.kml' THEN 'spatial_case'
      WHEN file_name ILIKE '%.ipynb' THEN 'ipynb_case'
      WHEN file_name ILIKE '%.db' OR file_name ILIKE '%.sqlite' OR file_name ILIKE '%.sqlite3' THEN 'sqlite_case'
      ELSE 'blob_case'  -- unknown extension: fall back to raw bytes
    END
  );
-- Schema, row count, and a 20-row sample.
DESCRIBE FROM read_any('RESOLVED_PATH');
SELECT count(*) AS row_count FROM read_any('RESOLVED_PATH');
FROM read_any('RESOLVED_PATH') LIMIT 20;
"
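The extension dispatch in the macro's CASE expression can be mirrored in shell, which may help when picking a reader by hand. This is a sketch only: the function name `reader_for` is hypothetical, and the mapping is transcribed from the macro rather than taken from DuckDB itself.

```shell
# Hypothetical helper: print the DuckDB reader the macro would use for a file.
# Matching is case-insensitive, like ILIKE in the macro.
reader_for() {
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    *.json|*.jsonl|*.ndjson|*.geojson|*.geojsonl|*.har) echo read_json_auto ;;
    *.csv|*.tsv|*.tab|*.txt)                            echo read_csv ;;
    *.parquet|*.pq)                                     echo read_parquet ;;
    *.avro)                                             echo read_avro ;;
    *.xlsx|*.xls)                                       echo read_xlsx ;;
    *.shp|*.gpkg|*.fgb|*.kml)                           echo st_read ;;
    *.ipynb)                                            echo read_json_auto ;;  # the macro then unnests cells
    *.db|*.sqlite|*.sqlite3)                            echo sqlite_scan ;;
    *)                                                  echo read_blob ;;       # fallback for unknown extensions
  esac
}
```

For example, `reader_for sales.PQ` prints `read_parquet`, and a file with an unrecognized extension falls through to `read_blob`.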
If this fails:
- duckdb: command not found → invoke /duckdb-skills:install-duckdb and retry.
- Missing extension (spatial or sqlite_scanner) → retry with INSTALL spatial; LOAD spatial; or INSTALL sqlite_scanner; LOAD sqlite_scanner; prepended before the macro.
- Wrong reader chosen or parse error → call the appropriate read_* function directly instead of read_any.

Using the schema, row count, and sample rows, answer:
${1:-describe the data: summarize column types, row count, and any notable patterns.}