| name | ktx-ai-data-agents-mcp-context-skills |
| description | Executable context layer for data and analytics agents - enables Claude Code, Codex, and AI agents to query data accurately through MCP with skills, memory and a semantic layer |
| triggers | ["set up ktx for data agent queries","configure ktx semantic layer","integrate ktx with my warehouse","use ktx to query analytics data","build ktx context from dbt and wiki","enable agent access to warehouse with ktx","install ktx for AI data analysis","connect ktx to my database"] |
ktx AI Data Agents MCP Context Skills
Skill by ara.so — AI Agent Skills collection.
ktx is a self-improving context layer that teaches AI agents how to query your data warehouse accurately. It combines approved metric definitions, joinable columns, wiki knowledge, and semantic layers into a single searchable surface exposed through MCP and CLI tools.
What ktx Does
- Learns from company knowledge: Ingests wiki content, dbt docs, Looker/Metabase metadata
- Maps the data stack: Samples tables, detects joinable columns, annotates sources
- Builds a semantic layer: Combines raw tables and high-level metrics with automatic join resolution
- Serves agents: Exposes CLI and MCP tools with full-text and semantic search
Works with PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, SQLite. Integrates with dbt, MetricFlow, LookML, Looker, Metabase, and Notion.
Installation
npm install -g @kaelio/ktx
npx @kaelio/ktx --version
For local development:
git clone https://github.com/kaelio/ktx.git
cd ktx
pnpm install
uv sync --all-groups
pnpm run build
Initial Setup
ktx setup
ktx status
The setup wizard will:
- Create
ktx.yaml configuration
- Configure LLM provider (Anthropic, Google Vertex, AI Gateway, or Claude Code)
- Configure embedding provider
- Set up database connections
- Add context sources (dbt, Looker, Metabase, Notion, wikis)
- Build initial context
- Install agent integration
Project Structure
my-project/
├── ktx.yaml # Project configuration
├── semantic-layer/<connection-id>/ # YAML semantic sources
├── wiki/global/ # Shared business context
├── wiki/user/<user-id>/ # User-scoped notes
├── raw-sources/<connection-id>/ # Ingest artifacts
└── .ktx/ # Local state (git-ignored)
Commit ktx.yaml, semantic-layer/, and wiki/. Keep .ktx/ local.
Configuration
ktx.yaml Structure
version: "1.0"
llm:
provider: anthropic
model: claude-sonnet-4-6
api_key_env: ANTHROPIC_API_KEY
embeddings:
provider: openai
model: text-embedding-3-small
api_key_env: OPENAI_API_KEY
database_connections:
warehouse:
type: postgres
host: localhost
port: 5432
database: analytics
username: readonly_user
password_env: WAREHOUSE_PASSWORD
context_sources:
dbt_main:
type: dbt
manifest_path: ./dbt/target/manifest.json
catalog_path: ./dbt/target/catalog.json
company_wiki:
type: local_wiki
path: ./docs
Database Connection Types
{
type: "postgres",
host: "localhost",
port: 5432,
database: "analytics",
username: "readonly",
password_env: "PG_PASSWORD"
}
{
type: "snowflake",
account: "xy12345.us-east-1",
warehouse: "ANALYTICS_WH",
database: "ANALYTICS",
schema: "PUBLIC",
username: "readonly",
password_env: "SNOWFLAKE_PASSWORD"
}
{
type: "bigquery",
project_id: "my-project",
dataset: "analytics",
credentials_path_env: "GOOGLE_APPLICATION_CREDENTIALS"
}
Context Source Types
dbt_main:
type: dbt
manifest_path: ./dbt/target/manifest.json
catalog_path: ./dbt/target/catalog.json
lookml_main:
type: lookml
project_path: ./lookml
looker_api:
type: looker
base_url: https://company.looker.com
client_id_env: LOOKER_CLIENT_ID
client_secret_env: LOOKER_CLIENT_SECRET
metabase_api:
type: metabase
base_url: https://metabase.company.com
username_env: METABASE_USER
password_env: METABASE_PASSWORD
notion_wiki:
type: notion
api_key_env: NOTION_API_KEY
page_ids:
- "a1b2c3d4e5f6"
local_docs:
type: local_wiki
path: ./docs
Key Commands
Context Building
ktx ingest
ktx ingest --connection warehouse
ktx ingest --force
Search Commands
ktx sl "revenue"
ktx sl "monthly active users"
ktx wiki "refund policy"
ktx wiki "customer segmentation"
ktx search "revenue metrics"
MCP Server
ktx mcp start
ktx mcp start --project-dir /path/to/project
ktx mcp status
Project Management
ktx status
ktx validate
ktx update
Usage Patterns
Setting Up for a New Project
Using with Claude Code
ktx mcp start --project-dir .
Creating Semantic Layer Definitions
metrics:
- name: revenue
label: Total Revenue
type: measure
sql: SUM(${orders.amount})
description: Total order revenue excluding refunds
- name: monthly_active_users
label: Monthly Active Users
type: metric
sql: COUNT(DISTINCT ${users.id})
filters:
- field: ${users.last_active_at}
operator: ">="
value: "CURRENT_DATE - INTERVAL '30 days'"
description: Users active in the last 30 days
dimensions:
- name: product_category
label: Product Category
type: string
sql: ${products.category}
- name: order_date
label: Order Date
type: time
sql: ${orders.created_at}
time_intervals: [day, week, month, quarter, year]
Adding Wiki Content
<!-- wiki/global/revenue-definitions.md -->
# Revenue Metrics
## Total Revenue
Sum of all order amounts excluding:
- Refunded orders
- Test orders (order_id < 1000)
- Internal employee orders
Formula: `SUM(orders.amount) WHERE status = 'completed' AND is_refunded = false`
## ARR (Annual Recurring Revenue)
Monthly recurring revenue × 12
Only includes subscription orders, not one-time purchases.
Querying from Agents
{
name: "ktx_sl_search",
input: { query: "revenue metrics" }
}
{
name: "ktx_wiki_search",
input: { query: "revenue definition" }
}
{
name: "ktx_get_metric",
input: { metric_name: "revenue" }
}
{
name: "ktx_list_metrics",
input: { connection_id: "warehouse" }
}
Programmatic Access
import { ContextEngine } from '@kaelio/ktx/context';
const engine = new ContextEngine({
projectDir: '/path/to/project',
llmConfig: {
provider: 'anthropic',
model: 'claude-sonnet-4-6',
apiKey: process.env.ANTHROPIC_API_KEY
}
});
const slResults = await engine.searchSemanticLayer('revenue');
console.log(slResults);
const wikiResults = await engine.searchWiki('refund policy');
console.log(wikiResults);
const metric = await engine.getMetric('warehouse', 'revenue');
console.log(metric);
Advanced Configuration
Multi-Warehouse Setup
database_connections:
production:
type: snowflake
account: prod.us-east-1
warehouse: ANALYTICS_WH
database: PROD
staging:
type: snowflake
account: staging.us-east-1
warehouse: ANALYTICS_WH
database: STAGING
context_sources:
dbt_prod:
type: dbt
connection: production
manifest_path: ./dbt/target/prod/manifest.json
dbt_staging:
type: dbt
connection: staging
manifest_path: ./dbt/target/staging/manifest.json
Custom Embedding Provider
embeddings:
provider: custom
endpoint: https://embeddings.company.com/v1/embed
api_key_env: CUSTOM_EMBEDDINGS_KEY
model: company-embeddings-v2
dimensions: 1536
LLM Gateway Configuration
llm:
provider: ai_gateway
endpoint: https://gateway.company.com/v1
api_key_env: GATEWAY_API_KEY
model: claude-sonnet-4-6
Common Workflows
Initial Project Setup
npm install -g @kaelio/ktx
cd ~/projects/analytics
ktx setup
export ANTHROPIC_API_KEY="your-key"
export WAREHOUSE_PASSWORD="your-password"
ktx ingest
ktx status
Adding a New Metric
cat > semantic-layer/warehouse/new-metric.yaml << EOF
metrics:
- name: customer_lifetime_value
label: Customer Lifetime Value
type: measure
sql: AVG(${customers.total_spent})
description: Average total spend per customer
EOF
ktx ingest --connection warehouse
ktx sl "lifetime value"
Updating Context from dbt
cd dbt
dbt run
dbt docs generate
cd ..
ktx ingest --source dbt_main
ktx sl "new_model_name"
Integrating with CI/CD
#!/bin/bash
npm install -g @kaelio/ktx
export ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY"
export WAREHOUSE_PASSWORD="$WAREHOUSE_PASSWORD"
ktx ingest --project-dir .
ktx validate
git add semantic-layer/ wiki/
git commit -m "Update ktx context [skip ci]"
git push
Troubleshooting
ktx status shows "LLM ready: no"
cat ktx.yaml | grep -A 5 "llm:"
echo $ANTHROPIC_API_KEY
ktx validate --check-llm
Database connection fails
ktx validate --check-connections
echo $WAREHOUSE_PASSWORD
psql -h localhost -U readonly -d analytics -c "SELECT 1"
cat ktx.yaml | grep -A 10 "database_connections:"
Context ingestion errors
ktx ingest --source dbt_main --verbose
ls -la dbt/target/manifest.json
ls -la dbt/target/catalog.json
ktx ingest --force --source dbt_main
MCP server won't start
ktx mcp status
pkill -f "ktx mcp"
ktx mcp start --log-level debug
ktx mcp start --project-dir $(pwd)
Semantic search returns no results
ktx ingest --rebuild-embeddings
cat ktx.yaml | grep -A 5 "embeddings:"
echo $OPENAI_API_KEY
ktx sl "exact_metric_name"
Wiki content not appearing
cat ktx.yaml | grep -A 5 "local_wiki"
ls -la wiki/global/
ktx ingest --source company_wiki --force
file wiki/global/*.md
Agent integration not working
ktx mcp status
cat ~/.claude/config.json
ktx mcp start --project-dir .
tail -f .ktx/logs/mcp-server.log
Environment Variables
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/credentials.json"
export OPENAI_API_KEY="sk-..."
export WAREHOUSE_PASSWORD="..."
export SNOWFLAKE_PASSWORD="..."
export BIGQUERY_CREDENTIALS="/path/to/bigquery-key.json"
export LOOKER_CLIENT_ID="..."
export LOOKER_CLIENT_SECRET="..."
export METABASE_USER="..."
export METABASE_PASSWORD="..."
export NOTION_API_KEY="secret_..."
export GATEWAY_API_KEY="..."
export CUSTOM_EMBEDDINGS_KEY="..."
export KTX_PROJECT_DIR="/path/to/project"
Development Commands
git clone https://github.com/kaelio/ktx.git
cd ktx
pnpm install
uv sync --all-groups
pnpm run build
pnpm run type-check
pnpm run test
uv run pytest -q
pnpm run link:dev
ktx-dev --help
pnpm run dead-code
Resources