一键在 Manus 中运行任何 Skill

$pwd:

chdb-sql

Name: Chdb Sql
Author: chdb-io

// Use when the user wants to run SQL — especially analytical SQL — on local files (parquet/csv/json), URLs, S3 paths, or remote databases (Postgres, MySQL, MongoDB, ClickHouse Cloud, Iceberg, Delta Lake) without setting up a server. Provides chDB — embedded ClickHouse SQL in Python with 1000+ functions, Session for stateful multi-step pipelines, parametrized queries, and cross-source joins via `s3()`, `mysql()`, `postgresql()`, `iceberg()`, `deltaLake()`, `remoteSecure()` table functions. TRIGGER when: user wants SQL on parquet/csv/files or across remote analytical sources; uses ClickHouse SQL features (window functions, windowFunnel, geoToH3, JSON path ops, Session, parametrized queries); imports `chdb` or calls `chdb.query()`. SKIP this skill for pandas-style DataFrame method-chaining (use chdb-datastore instead) or ClickHouse server administration.

在 Manus 中运行

$ git log --oneline --stat

stars:2,690

forks:113

updated:2026年5月19日 01:38

文件资源管理器

6 个文件

SKILL.md

readonly

name	chdb-sql
description	Use when the user wants to run SQL — especially analytical SQL — on local files (parquet/csv/json), URLs, S3 paths, or remote databases (Postgres, MySQL, MongoDB, ClickHouse Cloud, Iceberg, Delta Lake) without setting up a server. Provides chDB — embedded ClickHouse SQL in Python with 1000+ functions, Session for stateful multi-step pipelines, parametrized queries, and cross-source joins via `s3()`, `mysql()`, `postgresql()`, `iceberg()`, `deltaLake()`, `remoteSecure()` table functions. TRIGGER when: user wants SQL on parquet/csv/files or across remote analytical sources; uses ClickHouse SQL features (window functions, windowFunnel, geoToH3, JSON path ops, Session, parametrized queries); imports `chdb` or calls `chdb.query()`. SKIP this skill for pandas-style DataFrame method-chaining (use chdb-datastore instead) or ClickHouse server administration.
license	Apache-2.0
compatibility	Requires Python 3.9+, macOS or Linux. pip install chdb.
metadata	{"author":"chdb-io","version":"4.1","homepage":"https://clickhouse.com/docs/chdb"}

chdb SQL — ClickHouse in Your Python Process

Run ClickHouse SQL directly in Python — no server needed. Query local files, remote databases, and cloud storage with full ClickHouse SQL power.

pip install chdb

Decision Tree: Pick the Right API

1. One-off query on files or databases → chdb.query()
2. Multi-step analysis with tables      → Session
3. DB-API 2.0 connection                → chdb.connect()
4. Pandas-style DataFrame operations    → Use chdb-datastore skill instead

chdb.query() — One Line, Any Data

import chdb

chdb.query("SELECT * FROM file('data.parquet', Parquet) WHERE price > 100 LIMIT 10")       # local files
chdb.query("SELECT * FROM mysql('db:3306', 'shop', 'orders', 'root', 'pass')")              # databases
chdb.query("SELECT * FROM s3('s3://bucket/data.parquet', NOSIGN) LIMIT 10")                 # cloud storage
chdb.query("SELECT * FROM deltaLake('s3://bucket/delta/table', NOSIGN) LIMIT 10")           # data lakes

# Cross-source join
chdb.query("""
    SELECT u.name, o.amount FROM mysql('db:3306', 'crm', 'users', 'root', 'pass') AS u
    JOIN file('orders.parquet', Parquet) AS o ON u.id = o.user_id ORDER BY o.amount DESC
""")

data = {"name": ["Alice", "Bob"], "score": [95, 87]}
chdb.query("SELECT * FROM Python(data) ORDER BY score DESC")                                # Python data
df = chdb.query("SELECT * FROM numbers(10)", "DataFrame")                                   # output formats
chdb.query("SELECT toDate({d:String}) + number FROM numbers({n:UInt64})",
    "DataFrame", params={"d": "2025-01-01", "n": 30})                                      # parametrized

Table functions → table-functions.md | SQL functions → sql-functions.md | Full API → api-reference.md

Session — Stateful Analysis Pipelines

from chdb import session as chs
sess = chs.Session("./analytics_db")   # persistent; Session() for in-memory

sess.query("CREATE TABLE users ENGINE=MergeTree() ORDER BY id AS SELECT * FROM mysql('db:3306','crm','users','root','pass')")
sess.query("CREATE TABLE events ENGINE=MergeTree() ORDER BY (ts,user_id) AS SELECT * FROM s3('s3://logs/events/*.parquet',NOSIGN)")
sess.query("""
    SELECT u.country, count() AS cnt, uniqExact(e.user_id) AS users
    FROM events e JOIN users u ON e.user_id = u.id
    WHERE e.ts >= today() - 7 GROUP BY u.country ORDER BY cnt DESC
""", "Pretty").show()
sess.close()

Connection API (DB-API 2.0)

from chdb import dbapi
conn = dbapi.connect()
cur = conn.cursor()
cur.execute("SELECT * FROM file('data.parquet', Parquet) WHERE value > 100")
print(cur.fetchall())
cur.close()
conn.close()

Troubleshooting

Problem	Fix
`ImportError: No module named 'chdb'`	`pip install chdb`
`DB::Exception: FILE_NOT_FOUND`	Check file path; use absolute path or verify cwd
`DB::Exception: Unknown table function`	Check function name spelling (e.g., `deltaLake` not `deltalake`)
Connection refused to remote DB	Check host:port format; ensure remote DB allows connections
Environment check	Run `python agent/skills/chdb-sql/scripts/verify_install.py`

References

API Reference — query/Session/connect signatures
Table Functions — All ClickHouse table functions
SQL Functions — Commonly used SQL functions
Examples — 9 runnable examples with expected output
Official Docs

Note: This skill teaches how to use chdb SQL. For pandas-style operations, use the chdb-datastore skill. For contributing to chdb source code, see CLAUDE.md in the project root.

related-skills.json

同仓库

chdb-datastore.md

from "chdb-io/chdb"

Use when the user has tabular data (pandas DataFrame, parquet, csv, Arrow, json) and wants to filter, group, aggregate, join, or speed up slow pandas. Provides chDB DataStore — same pandas API, ClickHouse engine underneath. Also handles reading from S3, MySQL, PostgreSQL, MongoDB, ClickHouse Cloud, Iceberg, Delta Lake as DataFrames and joining across sources. TRIGGER when: user mentions DataFrame, parquet, csv, "fast pandas", "speed up pandas", or cross-source DataFrame joins; user imports `chdb.datastore` or `from datastore import DataStore`. SKIP this skill for raw SQL syntax (use chdb-sql instead), ClickHouse server administration, or non-Python DataStore API work.

2026-05-192.7k

package.json

"author": "chdb-io"

"repository": "chdb-io/chdb"

打开 GitHub 仓库查看创作者相关仓库

$ install --global

$ download --local

在 Manus 中运行

$ useful --forSOC

数据科学家计算机与数学类职业15-2051L4

name	chdb-sql
description	Use when the user wants to run SQL — especially analytical SQL — on local files (parquet/csv/json), URLs, S3 paths, or remote databases (Postgres, MySQL, MongoDB, ClickHouse Cloud, Iceberg, Delta Lake) without setting up a server. Provides chDB — embedded ClickHouse SQL in Python with 1000+ functions, Session for stateful multi-step pipelines, parametrized queries, and cross-source joins via `s3()`, `mysql()`, `postgresql()`, `iceberg()`, `deltaLake()`, `remoteSecure()` table functions. TRIGGER when: user wants SQL on parquet/csv/files or across remote analytical sources; uses ClickHouse SQL features (window functions, windowFunnel, geoToH3, JSON path ops, Session, parametrized queries); imports `chdb` or calls `chdb.query()`. SKIP this skill for pandas-style DataFrame method-chaining (use chdb-datastore instead) or ClickHouse server administration.
license	Apache-2.0
compatibility	Requires Python 3.9+, macOS or Linux. pip install chdb.
metadata	{"author":"chdb-io","version":"4.1","homepage":"https://clickhouse.com/docs/chdb"}

chdb SQL — ClickHouse in Your Python Process

Run ClickHouse SQL directly in Python — no server needed. Query local files, remote databases, and cloud storage with full ClickHouse SQL power.

pip install chdb

Decision Tree: Pick the Right API

1. One-off query on files or databases → chdb.query()
2. Multi-step analysis with tables      → Session
3. DB-API 2.0 connection                → chdb.connect()
4. Pandas-style DataFrame operations    → Use chdb-datastore skill instead

chdb.query() — One Line, Any Data

import chdb

chdb.query("SELECT * FROM file('data.parquet', Parquet) WHERE price > 100 LIMIT 10")       # local files
chdb.query("SELECT * FROM mysql('db:3306', 'shop', 'orders', 'root', 'pass')")              # databases
chdb.query("SELECT * FROM s3('s3://bucket/data.parquet', NOSIGN) LIMIT 10")                 # cloud storage
chdb.query("SELECT * FROM deltaLake('s3://bucket/delta/table', NOSIGN) LIMIT 10")           # data lakes

# Cross-source join
chdb.query("""
    SELECT u.name, o.amount FROM mysql('db:3306', 'crm', 'users', 'root', 'pass') AS u
    JOIN file('orders.parquet', Parquet) AS o ON u.id = o.user_id ORDER BY o.amount DESC
""")

data = {"name": ["Alice", "Bob"], "score": [95, 87]}
chdb.query("SELECT * FROM Python(data) ORDER BY score DESC")                                # Python data
df = chdb.query("SELECT * FROM numbers(10)", "DataFrame")                                   # output formats
chdb.query("SELECT toDate({d:String}) + number FROM numbers({n:UInt64})",
    "DataFrame", params={"d": "2025-01-01", "n": 30})                                      # parametrized

Table functions → table-functions.md | SQL functions → sql-functions.md | Full API → api-reference.md

Session — Stateful Analysis Pipelines

from chdb import session as chs
sess = chs.Session("./analytics_db")   # persistent; Session() for in-memory

sess.query("CREATE TABLE users ENGINE=MergeTree() ORDER BY id AS SELECT * FROM mysql('db:3306','crm','users','root','pass')")
sess.query("CREATE TABLE events ENGINE=MergeTree() ORDER BY (ts,user_id) AS SELECT * FROM s3('s3://logs/events/*.parquet',NOSIGN)")
sess.query("""
    SELECT u.country, count() AS cnt, uniqExact(e.user_id) AS users
    FROM events e JOIN users u ON e.user_id = u.id
    WHERE e.ts >= today() - 7 GROUP BY u.country ORDER BY cnt DESC
""", "Pretty").show()
sess.close()

Connection API (DB-API 2.0)

from chdb import dbapi
conn = dbapi.connect()
cur = conn.cursor()
cur.execute("SELECT * FROM file('data.parquet', Parquet) WHERE value > 100")
print(cur.fetchall())
cur.close()
conn.close()

Troubleshooting

Problem	Fix
`ImportError: No module named 'chdb'`	`pip install chdb`
`DB::Exception: FILE_NOT_FOUND`	Check file path; use absolute path or verify cwd
`DB::Exception: Unknown table function`	Check function name spelling (e.g., `deltaLake` not `deltalake`)
Connection refused to remote DB	Check host:port format; ensure remote DB allows connections
Environment check	Run `python agent/skills/chdb-sql/scripts/verify_install.py`

References

API Reference — query/Session/connect signatures
Table Functions — All ClickHouse table functions
SQL Functions — Commonly used SQL functions
Examples — 9 runnable examples with expected output
Official Docs

Note: This skill teaches how to use chdb SQL. For pandas-style operations, use the chdb-datastore skill. For contributing to chdb source code, see CLAUDE.md in the project root.

chdb-sql

chdb SQL — ClickHouse in Your Python Process

Decision Tree: Pick the Right API

chdb.query() — One Line, Any Data

Session — Stateful Analysis Pipelines

Connection API (DB-API 2.0)

Troubleshooting

References

同仓库更多 Skills

同仓库更多 Skills

chdb SQL — ClickHouse in Your Python Process

Decision Tree: Pick the Right API

chdb.query() — One Line, Any Data

Session — Stateful Analysis Pipelines

Connection API (DB-API 2.0)

Troubleshooting

References