| name | ktx-ai-data-agents-context-layer |
| description | Context layer for data agents that teaches AI how to query warehouses accurately with approved metrics, semantic layers, and business knowledge |
| triggers | ["set up ktx for data agent queries","configure ktx context layer for warehouse","build semantic layer with ktx","ingest warehouse metadata with ktx","connect claude to data warehouse with ktx","create ktx project for analytics agents","query warehouse metrics through ktx mcp","configure ktx semantic layer and wiki"] |
ktx AI Data Agents Context Layer
Skill by ara.so — MCP Skills collection.
ktx is an executable context layer for data and analytics agents. It teaches AI agents (Claude Code, Codex, Cursor, etc.) how to query your data warehouse accurately by combining approved metric definitions, joinable columns, and business knowledge from your entire data stack.
What ktx Does
ktx solves the problem of general-purpose agents struggling with data tasks. Instead of re-exploring your warehouse on every question and inventing metric logic, ktx:
- Learns from company knowledge — ingests wiki content, organizes it, removes duplicates, flags contradictions
- Maps the data stack — samples tables, captures metadata, detects joinable columns, annotates sources
- Builds a semantic layer — combines raw tables and high-level metrics through a join graph that resolves chasm and fan traps
- Serves agents at execution — exposes CLI and MCP tools with semantic search across wiki and semantic-layer entities
Supported databases: PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, SQLite
Integrations: dbt, MetricFlow, LookML, Looker, Metabase, Notion
Installation
Global CLI Installation
npm install -g @kaelio/ktx
Project-Specific Installation
npm install @kaelio/ktx
Quick Setup
ktx setup
This interactive command:
- Creates or resumes a local ktx project
- Configures LLM and embedding providers
- Sets up database connections
- Configures context sources (dbt, Looker, Metabase, Notion)
- Builds initial context
- Installs agent integration (MCP server)
Project Structure
my-project/
├── ktx.yaml # Project configuration
├── semantic-layer/<connection-id>/ # YAML semantic sources
├── wiki/global/ # Shared business context
├── wiki/user/<user-id>/ # User-scoped notes
├── raw-sources/<connection-id>/ # Ingest artifacts and reports
└── .ktx/ # Local state and secrets (git-ignored)
Commit: ktx.yaml, semantic-layer/, wiki/
Ignore: .ktx/ (contains secrets and local state)
Key Commands
Status and Health
ktx status
ktx validate
Building Context
ktx ingest
ktx ingest --connection-id warehouse
ktx ingest --force
ktx ingest --dry-run
Searching Context
ktx sl "monthly recurring revenue"
ktx sl "customer churn rate"
ktx wiki "refund policy"
ktx wiki "data retention rules"
ktx search "revenue recognition rules"
MCP Server (Agent Integration)
ktx mcp start
ktx mcp start --project-dir /path/to/project
ktx mcp status
Semantic Layer Management
ktx sl list
ktx sl show --entity-id mrr_metric
ktx sl validate
Configuration
ktx.yaml Example
version: 1
project:
name: analytics
description: Company analytics warehouse
llm:
provider: anthropic
model: claude-sonnet-4-6
api_key_env: ANTHROPIC_API_KEY
embeddings:
provider: openai
model: text-embedding-3-small
api_key_env: OPENAI_API_KEY
connections:
warehouse:
type: postgres
host: localhost
port: 5432
database: analytics
schema: public
username_env: DB_USERNAME
password_env: DB_PASSWORD
read_only: true
context_sources:
dbt_main:
type: dbt
connection_id: warehouse
manifest_path: ./target/manifest.json
run_results_path: ./target/run_results.json
notion_docs:
type: notion
api_key_env: NOTION_API_KEY
database_id_env: NOTION_DATABASE_ID
Environment Variables
export ANTHROPIC_API_KEY=your-key-here
export OPENAI_API_KEY=your-key-here
export GOOGLE_CLOUD_PROJECT=your-project
export DB_USERNAME=readonly_user
export DB_PASSWORD=secure-password
export NOTION_API_KEY=secret_xxx
export NOTION_DATABASE_ID=xxx
export KTX_PROJECT_DIR=/path/to/project
Real-World Usage Patterns
Pattern 1: Setting Up for Analytics Team
import { spawn } from 'child_process';
const setupKtx = () => {
const ktx = spawn('ktx', ['setup'], { stdio: 'inherit' });
ktx.on('close', (code) => {
if (code === 0) {
console.log('ktx setup complete');
spawn('ktx', ['mcp', 'start'], { stdio: 'inherit' });
}
});
};
Pattern 2: Automated Context Ingestion
import { execSync } from 'child_process';
const refreshContext = () => {
try {
execSync('ktx validate', { stdio: 'inherit' });
execSync('ktx ingest', { stdio: 'inherit' });
const status = execSync('ktx status', { encoding: 'utf-8' });
console.log('Context refresh complete:', status);
} catch (error) {
console.error('Context refresh failed:', error);
process.exit(1);
}
};
refreshContext();
Pattern 3: Programmatic Semantic Layer Query
import { execSync } from 'child_process';
const findMetrics = (query: string) => {
const result = execSync(`ktx sl "${query}" --json`, { encoding: 'utf-8' });
return JSON.parse(result);
};
const revenueMetrics = findMetrics('revenue');
console.log('Found metrics:', revenueMetrics.map(m => m.name));
Pattern 4: Agent Integration Check
import { execSync } from 'child_process';
import * as fs from 'fs';
const verifyKtxReady = (projectDir: string) => {
try {
const status = execSync('ktx status', {
cwd: projectDir,
encoding: 'utf-8'
});
const checks = {
projectReady: status.includes('Project ready: yes'),
llmReady: status.includes('LLM ready: yes'),
contextBuilt: status.includes('ktx context built: yes'),
agentReady: status.includes('Agent integration ready: yes')
};
return Object.values(checks).every(v => v);
} catch (error) {
return false;
}
};
if (!verifyKtxReady('./')) {
console.error('ktx not ready - run: ktx setup');
process.exit(1);
}
Pattern 5: Custom Wiki Page Creation
import * as fs from 'fs';
import * as path from 'path';
const addWikiPage = (
projectDir: string,
title: string,
content: string,
global: boolean = true
) => {
const wikiDir = global
? path.join(projectDir, 'wiki', 'global')
: path.join(projectDir, 'wiki', 'user', process.env.USER || 'default');
fs.mkdirSync(wikiDir, { recursive: true });
const filename = title.toLowerCase().replace(/\s+/g, '-') + '.md';
const filepath = path.join(wikiDir, filename);
const markdown = `---
title: ${title}
created: ${new Date().toISOString()}
---
# ${title}
${content}
`;
fs.writeFileSync(filepath, markdown);
console.log(`Wiki page created: ${filepath}`);
};
addWikiPage('./', 'Revenue Recognition Policy', `
## Policy
Revenue is recognized when service is delivered, not when payment is received.
## Implementation
- Use the \`revenue_recognized_at\` timestamp
- Join with \`subscription_events\` table
- Filter by \`event_type = 'service_delivered'\`
`);
Semantic Layer YAML
ktx builds semantic layers automatically, but you can also define custom metrics:
version: 1
metrics:
- id: mrr
name: Monthly Recurring Revenue
description: Total monthly recurring revenue from active subscriptions
type: metric
sql: |
SELECT
DATE_TRUNC('month', subscription_start) as month,
SUM(monthly_value) as mrr
FROM subscriptions
WHERE status = 'active'
GROUP BY 1
dimensions:
- month
measures:
- mrr
joins:
- entity: customers
on: subscriptions.customer_id = customers.id
tags:
- revenue
- subscription
Common Troubleshooting
"ktx context not built"
ktx ingest
ktx validate
"LLM ready: no"
export ANTHROPIC_API_KEY=your-key
ktx setup
"Agent integration not ready"
ktx mcp start --project-dir .
ktx mcp status
Database Connection Fails
ktx ingest --connection-id warehouse --dry-run
Semantic Layer Not Found
ktx sl list
ktx ingest --connection-id warehouse --force
Integration with AI Agents
Claude Code / Codex / Cursor
From your project directory, tell the agent:
Run npx skills add Kaelio/ktx --skill ktx and use the ktx skill to install
and configure ktx in this project.
Or manually:
- Run
ktx setup in your project
- Ensure
ktx status shows all ready
- Start the agent - it will detect the MCP server automatically
- Ask: "What metrics are available?" or "Query revenue by month"
MCP Tools Available to Agents
When ktx MCP server is running, agents get these tools:
ktx_search_semantic_layer — search metrics, dimensions, entities
ktx_search_wiki — search business context and documentation
ktx_get_metric — get detailed metric definition
ktx_list_connections — list available database connections
ktx_query_warehouse — execute read-only SQL queries
Advanced Configuration
Multiple Connections
connections:
prod_warehouse:
type: snowflake
account: company.us-east-1
warehouse: ANALYTICS_WH
database: PROD
schema: PUBLIC
username_env: SNOWFLAKE_USER
password_env: SNOWFLAKE_PASSWORD
staging_warehouse:
type: postgres
host: staging-db.internal
port: 5432
database: staging
username_env: STAGING_DB_USER
password_env: STAGING_DB_PASSWORD
Custom LLM Backends
llm:
provider: anthropic
model: claude-sonnet-4-6
api_key_env: ANTHROPIC_API_KEY
Telemetry Opt-Out
export KTX_TELEMETRY_DISABLED=1
telemetry:
enabled: false
Best Practices
- Keep .ktx/ out of version control — it contains secrets and local state
- Commit semantic-layer/ and wiki/ — share context across team
- Run ktx ingest regularly — daily or after schema changes
- Use read-only database users — ktx never writes, enforce at DB level
- Store API keys in environment variables — never commit to ktx.yaml
- Start MCP server before opening agent — check
ktx status output
- Review contradiction flags — when ktx finds conflicting definitions
- Tag metrics consistently — helps agents find relevant context
Resources