| name | wren-generate-mdl |
| description | Generate a Wren MDL project by exploring a database with available tools (SQLAlchemy, database drivers, MCP connectors, or raw SQL). Guides agents through schema discovery, type normalization, and MDL YAML generation using the wren CLI. Use when: user wants to create or set up a new MDL, onboard a new data source, or scaffold a project from an existing database. |
| license | Apache-2.0 |
| metadata | {"author":"wren-engine","version":"2.2"} |
Generate Wren MDL ā CLI Agent Workflow
Version check
Silently fetch https://raw.githubusercontent.com/Canner/wren-engine/main/skills/versions.json.
Compare the wren-generate-mdl key with this skill's version (from the frontmatter above).
If the remote version is newer, notify the user before proceeding:
A newer version of the wren-generate-mdl skill is available.
Update with:
npx skills add Canner/wren-engine --skill wren-generate-mdl
The CLI auto-detects your installed agent. To target a specific one, add --agent <name> (e.g., claude-code, cursor, windsurf, cline).
Then continue with the workflow below regardless of update status.
Builds an MDL project by discovering database schema and converting it
into Wren's YAML project format. The agent uses whatever database tools
are available in its environment for introspection; the wren CLI handles
type normalization, validation, and build.
For memory and query workflows after setup, see the wren-usage skill.
Prerequisites
wren CLI installed (pip install wren-engine[<datasource>])
- A working database connection (credentials available to the agent)
- A wren profile configured (
wren profile add) or connection info ready
Phase 0 ā Detect existing project
Goal: If the current directory is already inside a wren project, let the user decide how to proceed.
Check whether wren_project.yml exists in the current working directory
(or any parent up to the repository root). If found:
- Tell the user that an existing wren project was detected and show its path.
- Ask:
- Reset ā wipe the existing project (
models/, views/,
relationships.yml, instructions.md, and rebuild wren_project.yml)
and regenerate from scratch in the same directory.
- New path ā keep the existing project untouched and choose a
different directory for the new project. Ask the user for the new path,
then
wren context init --path <new_path> and continue from Phase 1
using that path.
If no existing project is detected, proceed directly to Phase 1.
Phase 1 ā Establish connection and scope
Goal: Confirm the agent can reach the database and agree on scope with the user.
-
Verify connectivity using whichever tool is available:
- If SQLAlchemy:
engine.connect() test
- If database driver: simple query like
SELECT 1
- If wren profile exists:
wren profile debug to check config
- If raw SQL via wren:
wren --sql "SELECT 1" (requires profile or connection file)
-
Ask the user:
- Which schema(s) or dataset(s) to include (skip if only one exists)
- Whether to include all tables or a subset
- The datasource type for wren (e.g.,
postgres, bigquery, snowflake) ā needed for type normalization dialect
Phase 2 ā Discover schema
Goal: Collect table names, column names, column types, and constraints.
Use whatever introspection method is available. Here are common approaches
ranked by convenience:
Option A: SQLAlchemy (recommended if available)
from sqlalchemy import create_engine, inspect
engine = create_engine(connection_url)
inspector = inspect(engine)
tables = inspector.get_table_names(schema="public")
for table in tables:
columns = inspector.get_columns(table, schema="public")
pk = inspector.get_pk_constraint(table, schema="public")
fks = inspector.get_foreign_keys(table, schema="public")
Option B: Database-specific driver
- psycopg / asyncpg (Postgres): Query
information_schema.columns and information_schema.table_constraints
- google-cloud-bigquery:
client.list_tables(), client.get_table() ā table.schema
- snowflake-connector-python:
SHOW COLUMNS IN TABLE, SHOW PRIMARY KEYS IN TABLE
- clickhouse-driver:
DESCRIBE TABLE, system.tables
Option C: Raw SQL via wren
If no driver is available but a wren profile is configured, query
information_schema through wren itself:
wren --sql "SELECT table_name FROM information_schema.tables WHERE table_schema = 'public'" -o json
wren --sql "SELECT column_name, data_type FROM information_schema.columns WHERE table_name = 'orders'" -o json
Note: this goes through the MDL layer, so it only works if you already
have a minimal MDL or if the database supports information_schema as
regular tables. For bootstrapping from zero, Option A or B is preferred.
Phase 3 ā Normalize types
Goal: Convert raw database types to wren-core-compatible types.
Python import (recommended for batch processing)
from wren.type_mapping import parse_type, parse_types
normalized = parse_type("character varying(255)", "postgres")
columns = [
{"column": "id", "raw_type": "int8"},
{"column": "name", "raw_type": "character varying"},
{"column": "total", "raw_type": "numeric(10,2)"},
]
normalized_cols = parse_types(columns, dialect="postgres")
CLI (if Python import not available)
Single type:
wren utils parse-type --type "character varying(255)" --dialect postgres
Batch (stdin JSON):
echo '[{"column":"id","raw_type":"int8"},{"column":"name","raw_type":"character varying"}]' \
| wren utils parse-types --dialect postgres
Phase 4 ā Scaffold and write MDL project
Goal: Create the YAML project structure.
Step 1 ā Initialize project
wren context init --path /path/to/project
This creates:
project/
āāā wren_project.yml
āāā models/
āāā views/
āāā relationships.yml
āāā instructions.md
IMPORTANT: catalog and schema in wren_project.yml
These are Wren Engine's internal namespace ā they are NOT the database's
native catalog or schema. Keep the defaults (catalog: wren, schema: public)
unless you are intentionally configuring a multi-project namespace.
Your database's actual catalog/schema is specified per-model in table_reference
(see Step 2). Do not copy database catalog/schema values into wren_project.yml.
Step 2 ā Write model files
For each table, create a YAML file under models/. Use snake_case
naming (the build step converts to camelCase automatically).
name: orders
table_reference:
catalog: ""
schema: public
table: orders
primary_key: order_id
columns:
- name: order_id
type: INTEGER
not_null: true
- name: customer_id
type: INTEGER
- name: total
type: "DECIMAL(10, 2)"
- name: status
type: VARCHAR
properties:
description: "Order status: pending, shipped, delivered, cancelled"
Step 3 ā Write relationships
From foreign key constraints discovered in Phase 2:
- name: orders_customers
models:
- orders
- customers
join_type: many_to_one
condition: "orders.customer_id = customers.customer_id"
Join type mapping:
- FK table ā PK table:
many_to_one
- PK table ā FK table:
one_to_many
- Unique FK:
one_to_one
- Junction table:
many_to_many
If no foreign keys were found, infer from naming conventions:
- Column
<table>_id or <table_singular>_id ā likely FK to <table>
- Ask the user to confirm inferred relationships
Step 4 ā Add descriptions (optional but valuable)
Ask the user to describe:
- Each model (1-2 sentences about what the table represents)
- Key columns (especially calculated fields or non-obvious names)
These descriptions are indexed by wren memory index and significantly
improve LLM query accuracy.
Phase 5 ā Validate and build
wren context validate --path /path/to/project
wren context validate --path /path/to/project --strict
wren context build --path /path/to/project
wren --sql "SELECT * FROM <model_name> LIMIT 1"
If validation fails, fix the reported issues and re-run. Common errors:
- Duplicate model/column names
- Missing primary key
- Relationship referencing non-existent model
- Invalid column type (try re-running through
parse_type)
Phase 6 ā Initialize memory
wren memory index
wren memory status
After this step, wren memory fetch and wren memory recall are
operational. See the wren-usage skill for query workflows.
Phase 7 ā Iterate with the user
The initial MDL is a starting point. Improve it by:
- Adding calculated columns based on business logic
- Adding views for common query patterns
- Refining descriptions based on actual query usage
- Adding access control (RLAC/CLAC) if needed
Each change follows: edit YAML ā wren context validate ā
wren context build ā wren memory index.
Quick reference
| Task | Command / Method |
|---|
| Discover tables | Agent's own tools (SQLAlchemy, driver, raw SQL) |
| Discover columns + types | Agent's own tools |
| Discover constraints | Agent's own tools |
| Normalize types (Python) | from wren.type_mapping import parse_type |
| Normalize types (CLI) | wren utils parse-type --type T --dialect D |
| Normalize types (batch) | wren utils parse-types --dialect D < columns.json |
| Scaffold project | wren context init |
| Write models | Create models/<name>/metadata.yml |
| Write relationships | Edit relationships.yml |
| Validate | wren context validate |
| Build manifest | wren context build |
| Test query | wren --sql "SELECT * FROM <model> LIMIT 1" |
| Index memory | wren memory index |
Things to avoid
- Do not hardcode database-specific type strings in MDL ā always normalize via
parse_type
- Do not skip validation before build ā invalid YAML produces broken manifests silently
- Do not guess column types ā introspect from the actual database
- Do not write relationships without confirming join conditions ā wrong conditions cause silent query errors
- Do not skip
wren memory index after build ā stale indexes degrade recall quality