| name | ktx-ai-data-context-layer |
| description | Install and configure ktx, an executable context layer for data and analytics agents that enables accurate warehouse queries through MCP with semantic layer and skills |
| triggers | ["set up ktx for data agent queries","configure ktx semantic layer for warehouse","install ktx context layer with MCP","help me query my data warehouse with ktx","set up ktx for Claude Code data analysis","configure ktx ingestion from dbt and warehouse","troubleshoot ktx mcp server connection","build ktx context from database metadata"] |
ktx AI Data Context Layer
Skill by ara.so — AI Agent Skills collection.
ktx is a self-improving context layer that teaches AI agents how to query your data warehouse accurately. It ingests metadata from databases, dbt, Looker, Metabase, and wikis to build a semantic layer with approved metric definitions, joinable columns, and business knowledge. Agents access this through MCP (Model Context Protocol) tools and CLI commands.
What ktx Does
- Learns from company knowledge — ingests wiki content, removes duplicates, flags contradictions
- Maps the data stack — samples tables, detects joinable columns, captures metadata and usage patterns
- Builds a semantic layer — combines raw tables and high-level metrics with automatic join resolution
- Serves agents at execution — exposes CLI and MCP tools with semantic search across wiki and semantic-layer entities
Works with PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, and SQLite. Integrates with dbt, MetricFlow, LookML, Looker, Metabase, and Notion.
Installation
Global CLI Installation
npm install -g @kaelio/ktx
Project-Specific Installation
cd /path/to/your/project
npm install @kaelio/ktx
Verify Installation
ktx --version
ktx --help
Quick Setup
ktx setup
ktx status
The ktx setup command will:
- Create or resume a local ktx project
- Configure LLM and embedding providers
- Set up database connections
- Configure context sources (dbt, Looker, etc.)
- Build initial context
- Install agent integration (MCP)
Project Structure
my-project/
├── ktx.yaml # Project configuration
├── semantic-layer/<connection-id>/ # YAML semantic sources
├── wiki/global/ # Shared business context
├── wiki/user/<user-id>/ # User-scoped notes
├── raw-sources/<connection-id>/ # Ingest artifacts and reports
└── .ktx/ # Local state and secrets (git-ignored)
Important: Commit ktx.yaml, semantic-layer/, and wiki/. Keep .ktx/ local.
Configuration
ktx.yaml Structure
version: 1
name: my-analytics-project
llm:
provider: anthropic
model: claude-sonnet-4-6
embeddings:
provider: openai
model: text-embedding-3-small
databases:
warehouse:
type: postgres
host: localhost
port: 5432
database: analytics
context-sources:
dbt_main:
type: dbt
path: ./dbt-project
target: prod
looker_main:
type: looker
base_url: https://company.looker.com
agent-integration:
type: mcp
clients:
- codex:project
- claude-code
LLM Configuration
ktx supports multiple LLM providers:
ktx setup
Set API keys via environment variables:
export ANTHROPIC_API_KEY=your-key-here
export OPENAI_API_KEY=your-key-here
Database Connection
Example PostgreSQL connection in ktx.yaml:
databases:
warehouse:
type: postgres
host: localhost
port: 5432
database: analytics
schema: public
ssl: true
Credentials in .ktx/secrets.yaml (auto-generated, git-ignored):
databases:
warehouse:
user: readonly_user
password: ${DB_PASSWORD}
Context Sources
dbt Integration
context-sources:
dbt_main:
type: dbt
path: ./dbt-project
target: prod
manifest_path: ./dbt-project/target/manifest.json
Looker Integration
context-sources:
looker_main:
type: looker
base_url: https://company.looker.com
Notion Integration
context-sources:
company_wiki:
type: notion
database_ids:
- abc123def456
Key Commands
Setup and Status
ktx setup
ktx status
Building Context
ktx ingest
ktx ingest --connection warehouse
ktx ingest --source dbt_main
ktx ingest --force
Searching Context
ktx sl "revenue"
ktx sl "monthly active users"
ktx wiki "refund policy"
ktx wiki "data retention"
ktx search "customer churn"
MCP Server
ktx mcp start
ktx mcp start --project-dir /path/to/project
ktx mcp status
ktx mcp stop
Semantic Layer Management
ktx sl list
ktx sl validate
ktx sl show users
ktx sl show revenue_metrics
Project Management
ktx init
ktx config set llm.provider anthropic
ktx config set llm.model claude-sonnet-4-6
ktx config show
ktx clean --cache
ktx clean --all
Real Usage Examples
Example 1: Set Up ktx for PostgreSQL Warehouse
npm install -g @kaelio/ktx
cd ~/my-analytics-project
export ANTHROPIC_API_KEY=sk-ant-...
ktx setup
ktx status
Example 2: Ingest dbt and Build Context
cat >> ktx.yaml <<EOF
context-sources:
dbt_main:
type: dbt
path: ./dbt-project
target: prod
EOF
ktx ingest
ktx sl "dim_customers"
ktx sl "fct_orders"
Example 3: Query Semantic Layer from TypeScript
import { execSync } from 'child_process';
function searchSemanticLayer(query: string): string {
const result = execSync(`ktx sl "${query}"`, { encoding: 'utf-8' });
return result;
}
const revenueMetrics = searchSemanticLayer('revenue');
console.log(revenueMetrics);
function searchWiki(query: string): string {
const result = execSync(`ktx wiki "${query}"`, { encoding: 'utf-8' });
return result;
}
const refundPolicy = searchWiki('refund policy');
console.log(refundPolicy);
Example 4: Set Up MCP for Claude Code
cd ~/my-analytics-project
ktx setup
ktx mcp start --project-dir ~/my-analytics-project
Example 5: Create Semantic Layer Definition
Create semantic-layer/warehouse/customers.yaml:
kind: SemanticSource
name: customers
type: entity
description: Customer dimension with lifetime metrics
columns:
- name: customer_id
type: dimension
data_type: integer
primary_key: true
- name: email
type: dimension
data_type: string
- name: created_at
type: dimension
data_type: timestamp
- name: lifetime_value
type: metric
data_type: numeric
sql: SUM(order_total)
aggregation: sum
- name: total_orders
type: metric
data_type: integer
sql: COUNT(DISTINCT order_id)
aggregation: count
sql: |
SELECT
c.id as customer_id,
c.email,
c.created_at,
COALESCE(SUM(o.total), 0) as lifetime_value,
COUNT(o.id) as total_orders
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.email, c.created_at
joins:
- to: orders
type: one_to_many
on: customers.customer_id = orders.customer_id
Validate and ingest:
ktx sl validate
ktx ingest --force
ktx sl show customers
Example 6: Add Wiki Context
Create wiki/global/data-definitions.md:
# Data Definitions
## Revenue Recognition
Revenue is recognized at the point of sale, not when payment is received.
**Key Metrics:**
- `gross_revenue`: Total sales before discounts
- `net_revenue`: Sales minus refunds and discounts
- `arr`: Annual Recurring Revenue (subscription only)
## Customer Classification
- **Active**: Made a purchase in last 90 days
- **Churned**: No purchase in 90+ days
- **New**: First purchase within last 30 days
## Refund Policy
Refunds are processed within 7 days and deducted from revenue in the same period.
Ingest wiki:
ktx ingest
ktx wiki "revenue recognition"
Common Patterns
Pattern 1: Daily Context Refresh
#!/bin/bash
cd /path/to/ktx/project
ktx ingest --force
ktx sl validate
Run via cron:
0 2 * * * /path/to/refresh-context.sh
Pattern 2: Multi-Environment Setup
databases:
warehouse_prod:
type: postgres
host: prod.example.com
database: analytics
warehouse_staging:
type: postgres
host: staging.example.com
database: analytics
ktx ingest --connection warehouse_prod
ktx ingest --connection warehouse_staging
Pattern 3: CI/CD Validation
name: Validate ktx Semantic Layer
on: [push, pull_request]
jobs:
validate:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Install ktx
run: npm install -g @kaelio/ktx
- name: Validate semantic layer
run: ktx sl validate
Pattern 4: Environment-Specific Secrets
ANTHROPIC_API_KEY=sk-ant-prod-...
DB_PASSWORD=prod_password
ANTHROPIC_API_KEY=sk-ant-staging-...
DB_PASSWORD=staging_password
Load before running:
source .env.production
ktx ingest
Troubleshooting
Issue: ktx command not found
Solution: Ensure global installation or use npx:
npm install -g @kaelio/ktx
npx @kaelio/ktx setup
Issue: MCP server not connecting
Solution: Check server is running and project directory is correct:
ktx mcp status
ktx mcp start --project-dir /absolute/path/to/project
Issue: Database connection fails
Solution: Verify credentials and network access:
psql -h localhost -p 5432 -U readonly_user -d analytics
ktx config show
ktx setup
Issue: Ingestion fails with "LLM error"
Solution: Verify API key and quota:
echo $ANTHROPIC_API_KEY
ktx config show
ktx config set llm.model claude-sonnet-3-5-20240620
ktx ingest
Issue: Semantic layer validation errors
Solution: Check YAML syntax and required fields:
ktx sl validate --verbose
Issue: Wiki search returns no results
Solution: Verify wiki ingestion completed:
ktx ingest --force
ls -la wiki/global/
ktx config show | grep embeddings
Issue: Slow ingestion performance
Solution: Use incremental ingestion and tune sampling:
ktx ingest
databases:
warehouse:
sampling:
max_rows: 1000
Issue: Agent can't find ktx MCP tools
Solution: Ensure MCP server is running and configured:
ktx mcp start --project-dir $(pwd) &
Advanced Usage
Custom LLM Provider (AI Gateway)
llm:
provider: ai-gateway
endpoint: https://gateway.example.com/v1/chat/completions
model: custom-model
api_key_env: CUSTOM_API_KEY
Multiple Semantic Layer Sources
context-sources:
dbt_core:
type: dbt
path: ./dbt-core
dbt_marketing:
type: dbt
path: ./dbt-marketing
metabase:
type: metabase
url: https://metabase.example.com
Programmatic Access (Node.js)
import { spawn } from 'child_process';
async function queryKtx(query: string): Promise<string> {
return new Promise((resolve, reject) => {
const proc = spawn('ktx', ['search', query]);
let output = '';
proc.stdout.on('data', (data) => {
output += data.toString();
});
proc.on('close', (code) => {
if (code === 0) {
resolve(output);
} else {
reject(new Error(`ktx exited with code ${code}`));
}
});
});
}
const results = await queryKtx('customer metrics');
console.log(results);
Project Resolution
ktx resolves the project directory in this order:
--project-dir flag
KTX_PROJECT_DIR environment variable
- Nearest
ktx.yaml (walk up from current directory)
- Current directory
ktx status --project-dir /path/to/project
export KTX_PROJECT_DIR=/path/to/project
ktx status
cd /path/to/project
ktx status
Resources