| name | ktx-context-layer-data-agents |
| description | Build and query a context layer for AI data agents with ktx - auto-learning semantic layer, wiki, and MCP integration |
| triggers | ["set up ktx for data agent queries","configure ktx semantic layer","ingest database context with ktx","connect AI agent to warehouse using ktx","search ktx wiki and metrics","configure ktx MCP server","build ktx context from dbt","troubleshoot ktx agent integration"] |
ktx Context Layer for Data Agents
Skill by ara.so ā MCP Skills collection.
ktx is an executable context layer that teaches AI agents how to query data warehouses accurately. It automatically builds and maintains:
- Semantic layer with approved metric definitions, join graphs, and automatic fan/chasm trap resolution
- Wiki from company knowledge (dbt, Looker, Notion) with deduplication and contradiction detection
- MCP server exposing tools for agent execution via CLI or Model Context Protocol
Agents get one searchable surface instead of reinventing SQL on every prompt.
Installation
npm install -g @kaelio/ktx
npx @kaelio/ktx --help
Requirements:
- Node.js 18+
- SQL warehouse (PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, or SQLite)
- LLM provider API key (Anthropic, Google Vertex AI, or AI Gateway)
Quick Setup
ktx setup
ktx status
ktx ingest
ktx mcp start
Example ktx status output:
ktx project: /home/user/analytics
Project ready: yes
LLM ready: yes (claude-sonnet-4-6)
Embeddings ready: yes (text-embedding-3-small)
Databases configured: yes (warehouse)
Context sources configured: yes (dbt_main)
ktx context built: yes
Agent integration ready: yes (codex:project)
Project Structure
my-project/
āāā ktx.yaml # Main configuration
āāā semantic-layer/
ā āāā warehouse/ # Per-connection YAML semantic sources
ā āāā customers.yaml
ā āāā revenue.yaml
āāā wiki/
ā āāā global/ # Shared business context
ā ā āāā refund-policy.md
ā āāā user/<user-id>/ # User-scoped notes
āāā raw-sources/
ā āāā warehouse/ # Ingest artifacts and reports
āāā .ktx/ # Local state, git-ignored
Commit: ktx.yaml, semantic-layer/, wiki/
Ignore: .ktx/
Configuration (ktx.yaml)
Minimal Configuration
version: 1
project:
id: analytics-warehouse
name: Analytics Project
llm:
provider: anthropic
model: claude-sonnet-4-6
apiKeyEnv: ANTHROPIC_API_KEY
embeddings:
provider: openai
model: text-embedding-3-small
apiKeyEnv: OPENAI_API_KEY
databases:
warehouse:
type: postgres
host: db.example.com
port: 5432
database: analytics
user: ktx_reader
passwordEnv: WAREHOUSE_PASSWORD
ssl: true
contextSources:
- id: dbt_main
type: dbt
path: ./dbt/target/manifest.json
database: warehouse
Full Configuration with Multiple Sources
version: 1
project:
id: multi-source-analytics
name: Multi-Source Analytics
llm:
provider: vertex
model: claude-sonnet-4-6
region: us-central1
projectId: my-gcp-project
credentialsEnv: GOOGLE_APPLICATION_CREDENTIALS
embeddings:
provider: openai
model: text-embedding-3-small
apiKeyEnv: OPENAI_API_KEY
databases:
warehouse:
type: snowflake
account: xy12345.us-east-1
warehouse: COMPUTE_WH
database: ANALYTICS
schema: PUBLIC
user: KTX_USER
passwordEnv: SNOWFLAKE_PASSWORD
role: ANALYST
clickhouse:
type: clickhouse
host: clickhouse.example.com
port: 8123
database: events
user: readonly
passwordEnv: CLICKHOUSE_PASSWORD
contextSources:
- id: dbt_main
type: dbt
path: ./dbt/target/manifest.json
database: warehouse
enabled: true
- id: looker_lookml
type: lookml
path: ./looker-models
database: warehouse
enabled: true
- id: metabase_exports
type: metabase
path: ./metabase-export.json
database: warehouse
enabled: true
- id: company_wiki
type: notion
apiKeyEnv: NOTION_API_KEY
databaseId: a1b2c3d4e5f6
enabled: true
ingestion:
tableSampleSize: 1000
enableJoinDiscovery: true
enableColumnProfiling: true
Core Commands
Setup and Status
ktx setup
ktx status
ktx config validate
ktx config show
Context Building
ktx ingest
ktx ingest --connection warehouse
ktx ingest --verbose
ktx ingest --force
Searching Context
ktx sl "revenue"
ktx sl "active users" --limit 10
ktx wiki "refund policy"
ktx wiki "data retention" --connection warehouse
ktx describe semantic customers
ktx describe semantic revenue --connection warehouse
MCP Server
ktx mcp start
ktx mcp start --project-dir /path/to/project
ktx status | grep "Agent integration"
Agent Integration
Claude Code / Codex
From your project directory, tell the agent:
Run npx skills add Kaelio/ktx --skill ktx and use the ktx skill to install
and configure ktx in this project.
Or manually add to claude_desktop_config.json (macOS) or %APPDATA%\Claude\config.json (Windows):
{
"mcpServers": {
"ktx": {
"command": "ktx",
"args": ["mcp", "start", "--project-dir", "/absolute/path/to/project"]
}
}
}
Cursor / OpenCode
Add to MCP settings:
{
"mcpServers": {
"ktx": {
"command": "npx",
"args": ["@kaelio/ktx", "mcp", "start", "--project-dir", "/absolute/path/to/project"]
}
}
}
TypeScript API Usage
import { KtxClient } from '@kaelio/ktx';
const ktx = new KtxClient({
projectDir: '/path/to/project',
});
const metrics = await ktx.searchSemanticLayer({
query: 'revenue',
limit: 5,
});
console.log(metrics);
const wikiPages = await ktx.searchWiki({
query: 'refund policy',
limit: 3,
});
const customerMetric = await ktx.getSemanticSource({
connection: 'warehouse',
sourceId: 'customers',
});
const result = await ktx.query({
connection: 'warehouse',
sql: 'SELECT * FROM semantic.revenue WHERE date >= CURRENT_DATE - 7',
useSemanticLayer: true,
});
Semantic Layer Definition
Creating a Metric (YAML)
semantic-layer/warehouse/revenue.yaml:
type: metric
id: total_revenue
name: Total Revenue
description: Sum of all order amounts excluding refunds
sql: |
SUM(CASE
WHEN status != 'refunded'
THEN amount
ELSE 0
END)
baseTable: orders
tags:
- finance
- core
filters:
- dimension: status
operator: not_in
values: ['cancelled', 'fraud']
aggregation: sum
Creating a Dimension
semantic-layer/warehouse/customers.yaml:
type: dimension
id: customer_segment
name: Customer Segment
description: Customer lifecycle segment based on LTV
sql: |
CASE
WHEN lifetime_value > 10000 THEN 'enterprise'
WHEN lifetime_value > 1000 THEN 'mid-market'
ELSE 'smb'
END
baseTable: customers
dataType: string
tags:
- segmentation
Join Definition
type: join
id: orders_to_customers
from: orders
to: customers
relationship: many_to_one
sql: orders.customer_id = customers.id
requiredFilters: []
Wiki Content
Adding Business Context
wiki/global/refund-policy.md:
# Refund Policy
## Definition
A refund is issued when:
- Customer requests within 30 days
- Product is defective
- Service was not delivered
## Metrics Impact
- `refunded_revenue`: Revenue from orders with status = 'refunded'
- `net_revenue`: Total revenue excluding refunds
- `refund_rate`: refunded_revenue / total_revenue
## Related Tables
- `orders.status`: Use 'refunded' for refund detection
- `refunds`: Detailed refund records with reason codes
User-Scoped Notes
wiki/user/alice@example.com/weekly-metrics.md:
# Weekly Metrics Review Notes
## Active Users Definition
Per discussion with Product (2024-05-15):
- Use `events.user_id` not `users.id`
- Filter to `event_type IN ('page_view', 'feature_used')`
- 7-day rolling window
Query stored in: `semantic-layer/warehouse/active_users.yaml`
Common Patterns
Setting Up a New Project
mkdir analytics-project && cd analytics-project
ktx setup
cat >> ktx.yaml <<EOF
contextSources:
- id: dbt_main
type: dbt
path: ../dbt/target/manifest.json
database: warehouse
EOF
ktx ingest
ktx sl "customers"
ktx wiki "refund"
ktx mcp start
Connecting to Snowflake
databases:
warehouse:
type: snowflake
account: xy12345.us-east-1
warehouse: COMPUTE_WH
database: ANALYTICS
schema: PUBLIC
user: KTX_USER
passwordEnv: SNOWFLAKE_PASSWORD
role: ANALYST
authenticator: snowflake
export SNOWFLAKE_PASSWORD='your-password'
ktx ingest --connection warehouse
Connecting to BigQuery
databases:
warehouse:
type: bigquery
projectId: my-gcp-project
dataset: analytics
credentialsEnv: GOOGLE_APPLICATION_CREDENTIALS
location: US
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
ktx ingest --connection warehouse
Using Local LLM via AI Gateway
llm:
provider: ai-gateway
baseUrl: http://localhost:8080/v1
model: llama-3-70b
apiKeyEnv: AI_GATEWAY_API_KEY
Ingesting Notion as Wiki Source
contextSources:
- id: company_wiki
type: notion
apiKeyEnv: NOTION_API_KEY
databaseId: a1b2c3d4e5f6
enabled: true
export NOTION_API_KEY='secret_...'
ktx ingest
Troubleshooting
"Project ready: no"
ktx status
export ANTHROPIC_API_KEY='sk-ant-...'
export WAREHOUSE_PASSWORD='...'
ktx config validate
"Agent integration ready: no"
ktx status
ktx mcp start --project-dir /home/user/analytics
Database Connection Errors
ktx ingest --connection warehouse --verbose
Ingestion Fails with "No tables found"
cat ktx.yaml | grep -A 10 databases
Semantic Layer Not Resolving Joins
ls semantic-layer/warehouse/
ktx ingest --verbose
MCP Tools Not Available in Agent
ktx mcp start --project-dir /absolute/path
Slow Ingestion
ingestion:
tableSampleSize: 100
enableJoinDiscovery: false
enableColumnProfiling: false
ktx ingest --connection warehouse
Contradictions Flagged During Ingest
cat raw-sources/warehouse/reports/contradictions.json
Type Errors with TypeScript API
import type { SemanticSource, WikiPage } from '@kaelio/ktx';
const metric: SemanticSource = await ktx.getSemanticSource({
connection: 'warehouse',
sourceId: 'revenue',
});
Advanced Usage
Programmatic Project Setup
import { KtxProject } from '@kaelio/ktx';
const project = await KtxProject.create({
projectDir: '/path/to/project',
config: {
version: 1,
project: {
id: 'analytics',
name: 'Analytics Project',
},
llm: {
provider: 'anthropic',
model: 'claude-sonnet-4-6',
apiKeyEnv: 'ANTHROPIC_API_KEY',
},
embeddings: {
provider: 'openai',
model: 'text-embedding-3-small',
apiKeyEnv: 'OPENAI_API_KEY',
},
databases: {
warehouse: {
type: 'postgres',
host: 'localhost',
port: 5432,
database: 'analytics',
user: 'ktx',
passwordEnv: 'DB_PASSWORD',
},
},
contextSources: [],
},
});
await project.ingest();
Custom Embedding Search
const results = await ktx.searchSemanticLayer({
query: 'monthly recurring revenue',
limit: 5,
threshold: 0.7,
filters: {
tags: ['finance'],
connection: 'warehouse',
},
});
Exporting Context for External Use
ktx export semantic --connection warehouse > semantic-layer.json
ktx export wiki > wiki-export.tar.gz
Resources