| name | local-clickhouse |
| description | Install, configure, and validate local ClickHouse for gapless-crypto-clickhouse development and backtesting. Use when setting up local development environment, enabling offline mode, improving query performance for backtesting, or running E2E validation. Includes mise/Homebrew/apt installation, mode detection, connection validation, and E2E workflow scripts. |
Local ClickHouse
Install, configure, and validate local ClickHouse as an alternative to ClickHouse Cloud for development and backtesting (ADR-0044, ADR-0045).
Purpose
Enable local ClickHouse deployment for:
- Backtesting: 50-100x faster queries (no network round-trip)
- Development: Offline work without Cloud credentials
- Evaluation: Try package before Cloud commitment
- Cost savings: Avoid Cloud costs during experimentation
When to Use
Use this skill when:
- Local development: Setting up development environment without Cloud
- Backtesting optimization: Need faster query performance
- Offline mode: Working without network access
- Package evaluation: Trying gapless-crypto-clickhouse before Cloud setup
Triggers: User mentions "local ClickHouse", "install clickhouse", "backtesting setup", "offline mode", "GCCH_MODE=local"
Prerequisites
System Requirements:
- macOS (Homebrew) or Linux (apt/installer)
- 2-4GB RAM minimum
- 10GB+ disk space for data
Network: Internet required for installation only (offline after setup)
Workflow
Step 1: Install ClickHouse
macOS (Homebrew):
brew install clickhouse
clickhouse --version
Linux (Ubuntu/Debian):
curl https://clickhouse.com/ | sh
./clickhouse --version
sudo apt-get install -y apt-transport-https ca-certificates
curl -fsSL https://packages.clickhouse.com/deb/lts/clickhouse.gpg | sudo gpg --dearmor -o /usr/share/keyrings/clickhouse-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/clickhouse-keyring.gpg] https://packages.clickhouse.com/deb stable main" | sudo tee /etc/apt/sources.list.d/clickhouse.list
sudo apt-get update
sudo apt-get install -y clickhouse-server clickhouse-client
Step 2: Start ClickHouse Server
macOS (Homebrew):
clickhouse server --daemon
clickhouse client --query "SELECT 1"
Linux (systemd):
sudo service clickhouse-server start
sudo systemctl enable clickhouse-server
clickhouse-client --query "SELECT 1"
Step 3: Configure Environment
Set Local Mode:
export GCCH_MODE=local
export CLICKHOUSE_HOST=localhost
Environment Variables:
export GCCH_MODE=local
export CLICKHOUSE_HOST=localhost
export CLICKHOUSE_HTTP_PORT=8123
export CLICKHOUSE_DATABASE=default
export CLICKHOUSE_USER=default
export CLICKHOUSE_PASSWORD=
Step 4: Verify Connection
Using Python:
from gapless_crypto_clickhouse import probe
status = probe.check_local_clickhouse()
print(f"Installed: {status['installed']}")
print(f"Running: {status['running']}")
print(f"Version: {status['version']}")
mode = probe.get_current_mode()
print(f"Mode: {mode}")
Using CLI:
GCCH_MODE=local python -c "
from gapless_crypto_clickhouse.clickhouse.config import ClickHouseConfig
config = ClickHouseConfig.from_env()
print(f'Host: {config.host}')
print(f'Port: {config.http_port}')
print(f'Secure: {config.secure}')
"
Step 5: Test with Real Data
Query with Auto-Ingestion:
import os
os.environ["GCCH_MODE"] = "local"
from gapless_crypto_clickhouse import query_ohlcv
df = query_ohlcv(
"BTCUSDT",
"1h",
"2024-01-01",
"2024-01-07"
)
print(f"Rows: {len(df)}")
print(df.head())
Mode Detection Logic
The package auto-detects deployment mode based on environment:
GCCH_MODE=local → Local mode (explicit)
GCCH_MODE=cloud → Cloud mode (requires CLICKHOUSE_HOST)
GCCH_MODE=auto → Auto-detect:
CLICKHOUSE_HOST="" → Local mode
CLICKHOUSE_HOST="localhost" → Local mode
CLICKHOUSE_HOST="127.0.0.1" → Local mode
CLICKHOUSE_HOST="*.cloud" → Cloud mode
Introspection:
from gapless_crypto_clickhouse import probe
modes = probe.get_deployment_modes()
print(modes["available_modes"])
current = probe.get_current_mode()
guide = probe.get_local_installation_guide()
print(guide["macos"]["commands"])
Success Criteria
- ClickHouse server running on localhost:8123
probe.check_local_clickhouse() returns {"installed": True, "running": True}
probe.get_current_mode() returns "local"
query_ohlcv() executes without Cloud credentials
Troubleshooting
Issue: "Connection refused" on port 8123
-
Check: Is ClickHouse server running?
ps aux | grep clickhouse
sudo service clickhouse-server status
-
Action: Start server with clickhouse server --daemon
Issue: "Command not found: clickhouse"
- Check: Was ClickHouse installed correctly?
- macOS: Run
brew install clickhouse
- Linux: Run
curl https://clickhouse.com/ | sh
Issue: Mode detected as "cloud" instead of "local"
- Check: Is
CLICKHOUSE_HOST set to a remote hostname?
- Action: Set
export GCCH_MODE=local explicitly
Issue: Server crashes with memory error
- Check: System has 2-4GB RAM available
- Action: Increase available memory or reduce concurrent queries
Issue: Permission denied errors
- Check: ClickHouse data directory permissions
- Action: Ensure user has write access to
/var/lib/clickhouse (Linux)
Port Reference
| Mode | HTTP Port | Native Port | Secure |
|---|
| Local | 8123 | 9000 | False |
| Cloud | 8443 | 9440 | True |
Performance Comparison
| Metric | Local | Cloud |
|---|
| Query latency | 50ms | 100-500ms |
| First query (cold) | 50ms | 5-10s (idle resume) |
| Network dependency | None | Required |
| Credentials | None | Required |
| Best for | Backtesting | Production |
Server Management
Start Server:
clickhouse server --daemon
sudo service clickhouse-server start
Stop Server:
pkill -f clickhouse-server
sudo service clickhouse-server stop
Check Logs:
tail -f /opt/homebrew/var/log/clickhouse-server/clickhouse-server.log
sudo tail -f /var/log/clickhouse-server/clickhouse-server.log
E2E Validation Workflow (ADR-0045)
This skill includes executable scripts for end-to-end validation of local ClickHouse.
Scripts Directory
| Script | Purpose |
|---|
scripts/start-clickhouse.sh | Start mise-installed ClickHouse server |
scripts/deploy-schema.sh | Deploy production schema (calls existing script) |
scripts/ingest-sample-data.py | Ingest real Binance data via query_ohlcv() |
scripts/take-screenshot.py | Capture Play UI screenshot via Playwright |
scripts/validate-data.py | Validate data integrity, output JSON evidence |
Running E2E Validation
Full workflow (manual):
./skills/local-clickhouse/scripts/start-clickhouse.sh
./skills/local-clickhouse/scripts/deploy-schema.sh
uv run python skills/local-clickhouse/scripts/ingest-sample-data.py
uv run python skills/local-clickhouse/scripts/take-screenshot.py
uv run python skills/local-clickhouse/scripts/validate-data.py
Via pytest (automated):
uv run pytest tests/test_local_clickhouse_e2e.py -v
Evidence Output
Scripts output evidence to tests/screenshots/ (gitignored):
play-ui-{timestamp}.png - Playwright screenshots
validation-{timestamp}.json - Structured validation results
References
Next Steps
After successful local setup:
- Schema Creation: Tables created automatically on first query
- Data Ingestion: Use
query_ohlcv() with auto_ingest=True
- E2E Validation: Run
uv run pytest tests/test_local_clickhouse_e2e.py -v
- Backtesting: Run analysis with 50-100x faster queries
- Production Migration: Switch to Cloud mode with
export GCCH_MODE=cloud