redshift-lineage-from-stl

Reconstruct ACTUAL table-to-table data flow from Redshift query history — catches ad-hoc, BI-tool (Tableau / Looker), and manual-fix usage dbt manifest cannot see. Read-only. Use when auditing who reads / writes a table, or before deprecating one. Do NOT use for dbt-internal lineage (use dbt manifest) or real-time monitoring. Triggers: /redshift-lineage-from-stl / actual lineage / who reads / Tableau usage / 實際 lineage / 誰在讀 / クエリ履歴.

Summary

Install command

npx skills add https://github.com/kouko/redshift-comment-mcp --skill redshift-lineage-from-stl

Copy and paste this command into Claude Code to install the skill.

Source

kouko/redshift-comment-mcp

Stars: 1

Forks: 0

Updated: May 10, 2026, 06:29

File explorer

6 files

SKILL.md

readonly

name

redshift-lineage-from-stl

description

Redshift Lineage from STL_QUERY

Reconstructs actual lineage by mining query history + sqlglot parsing. Ships a helper script — the only one in this plugin that does, justified because LLM-driven SQL parsing is unreliable.

STL retention warning (mention upfront in chat):

Provisioned: ~2-5 days; older queries gone.
Serverless: longer; uses SYS_QUERY_HISTORY.
Permissions: STL needs SYSLOG ACCESS UNRESTRICTED or admin.

When to use / NOT

Use to find ad-hoc / BI-tool consumers, audit deprecation candidates, diagnose freshness gaps.
NOT for dbt-internal lineage (manifest is authoritative); NOT for declared FKs (use erd); NOT real-time.

Inputs

| Form | Behavior |
|---|---|
| `--since <date>` or `--since 7d` | window (clamped to STL retention) |
| `--table <s>.<t>` | scope to queries touching this table |
| `--user <name>` | scope to one user |
| `--limit N` | cap pulled queries (default 5000) |
| `--include-system` | keep pg_catalog/STL/etc (default: filter out) |
| `--output mermaid` | also emit Mermaid graph |
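
As an illustration of how the `--since` forms above could be resolved, here is a minimal sketch. The function name `resolve_since`, the `earliest_available` parameter (standing in for the earliest row still present in the history table), and the use of naive timestamps are all assumptions made for this example; the skill itself does not specify them.

```python
import re
from datetime import datetime, timedelta
from typing import Optional, Tuple


def resolve_since(
    value: str,
    now: datetime,
    earliest_available: Optional[datetime] = None,
) -> Tuple[datetime, bool]:
    """Turn '7d' or an ISO date into a window start.

    Returns (window_start, clamped). `clamped` is True when the requested
    start predates the earliest row still available (hypothetical input).
    """
    text = value.strip()
    relative = re.fullmatch(r"(\d+)d", text)
    if relative:
        start = now - timedelta(days=int(relative.group(1)))
    else:
        # Accepts '2026-05-01' or '2026-05-01 04:00:00'.
        start = datetime.fromisoformat(text)
    if earliest_available is not None and start < earliest_available:
        return earliest_available, True
    return start, False
```

When `clamped` comes back True, the caller would surface the retention warning rather than silently narrowing the window.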

Flow

Detect cluster type: execute_sql("SELECT version()"). Look for "Redshift Serverless" → use SYS_QUERY_HISTORY; else STL.
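
The detection step can be sketched as a plain substring check on the text returned by `SELECT version()`. The function name `pick_history_source` and the case-insensitive match are choices made for this example, and the exact banner wording on a given cluster is not confirmed here.

```python
def pick_history_source(version_banner: str) -> str:
    """Choose the query-history source from the version() banner text."""
    # Assumption: serverless workgroups include this phrase in the banner.
    if "redshift serverless" in version_banner.lower():
        return "SYS_QUERY_HISTORY"
    return "STL_QUERY"
```

Whichever value is returned should be used for the whole report, since the two sources have different schemas.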

Pull queries — provisioned (STL_QUERY + STL_QUERYTEXT):

WITH recent AS (
    SELECT q.query, q.userid, q.starttime,
           u.usename AS user_name
    FROM STL_QUERY q LEFT JOIN PG_USER u ON u.usesysid = q.userid
    WHERE q.starttime >= '<since>'::timestamp AND q.aborted = 0
      /* if --user */ AND u.usename = '<user>'
    ORDER BY q.starttime DESC LIMIT <limit>
)
SELECT r.query, r.user_name, r.starttime,
       LISTAGG(qt.text, '') WITHIN GROUP (ORDER BY qt.sequence) AS sql_text
FROM recent r JOIN STL_QUERYTEXT qt ON qt.query = r.query
GROUP BY r.query, r.user_name, r.starttime;

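Caveat: LISTAGG output is limited to VARCHAR(65535), so queries longer than that cannot be reassembled in full (Redshift may reject the aggregation with a LISTAGG size-limit error rather than truncating silently). Tell the user when this happens; incomplete SQL parses worse.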
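
One possible way around the LISTAGG limit, not something the skill prescribes, is to select the raw `STL_QUERYTEXT` rows (`query`, `sequence`, `text`) and stitch them together on the client. The sketch below assumes the rows arrive as plain tuples in that column order; the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def reassemble_query_text(rows: Iterable[Tuple[int, int, str]]) -> Dict[int, str]:
    """Join STL_QUERYTEXT chunks per query id, ordered by sequence."""
    chunks: Dict[int, List[Tuple[int, str]]] = defaultdict(list)
    for query_id, sequence, text in rows:
        chunks[query_id].append((sequence, text))
    return {
        query_id: "".join(text for _, text in sorted(parts, key=lambda p: p[0]))
        for query_id, parts in chunks.items()
    }
```

Because the concatenation happens in Python, the 65535-character ceiling no longer applies, at the cost of transferring more rows.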

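Serverless (SYS_QUERY_HISTORY; the SQL is already a single column, query_text, though very long statements may be cut off there):

SELECT h.query_id AS query, u.usename AS user_name,
       h.start_time AS starttime, h.query_text AS sql_text
FROM SYS_QUERY_HISTORY h
-- user_id is numeric, so resolve the name via PG_USER as in the STL query
LEFT JOIN PG_USER u ON u.usesysid = h.user_id
WHERE h.start_time >= '<since>'::timestamp
  AND h.status NOT IN ('failed', 'canceled', 'aborted')
  /* if --user */ AND u.usename = '<user>'
ORDER BY h.start_time DESC LIMIT <limit>;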

Save NDJSON + run helper: write each query as a JSON line {query, user, starttime, sql} to $TMPDIR/queries.ndjson, then:
```
cp "<SKILL_DIR>/assets/parse_stl_lineage.py" "$TMPDIR/"
"$TMPDIR/parse_stl_lineage.py" --input "$TMPDIR/queries.ndjson" \
    --output "$TMPDIR/lineage.json" \
    [--filter-table <s>.<t>] [--include-system]
```
Script handles multi-statement SQL (sqlglot.parse), CTE-name filtering, system-table filtering, per-edge aggregation. PEP 723 + uv run --script resolves sqlglot automatically.
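
The NDJSON step might look like the sketch below. It assumes result rows are dict-like and keyed by the column aliases from the queries above (`query`, `user_name`, `starttime`, `sql_text`), and that they map onto the `{query, user, starttime, sql}` keys in that order; the mapping and the function name `to_ndjson` are assumptions, not the helper's documented contract.

```python
import json
from typing import Iterable, Mapping


def to_ndjson(rows: Iterable[Mapping[str, object]]) -> str:
    """Serialize query rows as newline-delimited JSON, one object per line."""
    lines = []
    for row in rows:
        record = {
            "query": row["query"],
            "user": row["user_name"],
            "starttime": str(row["starttime"]),  # timestamps become strings
            "sql": row["sql_text"],
        }
        # json.dumps escapes embedded newlines, so each record stays on one line.
        lines.append(json.dumps(record))
    return "".join(line + "\n" for line in lines)
```

The returned string can then be written to `$TMPDIR/queries.ndjson` before the helper is invoked.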

Note on sql_text escaping. Some MCP server / transport layers serialize sql_text with literal \n / \r / \t (two-char escape sequences) instead of real control characters. The helper decodes these defensively (idempotent), so writing the field verbatim is fine. If you inspect the NDJSON manually and see literal \n in the SQL, that's the cause — not a bug in your write.
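
To make the idempotence claim in the note concrete, here is a minimal sketch of what such decoding could look like. It is not the helper's actual code, and the function name is hypothetical. Replacing the two-character sequences with control characters cannot create new backslash sequences, so a second pass leaves the text unchanged.

```python
def decode_literal_escapes(sql: str) -> str:
    """Replace literal backslash-n / backslash-r / backslash-t with control chars.

    Caveat of this sketch: a SQL string literal that intentionally contains
    a backslash followed by n, r, or t would be altered as well.
    """
    return sql.replace("\\r", "\r").replace("\\n", "\n").replace("\\t", "\t")
```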
Read lineage.json and render. Surface parse_errors verbatim.

Output

Adjacency table (chat, primary):

| from | → | to | queries | distinct users | last seen |
|---|---|---|---|---|---|
| dbt_marts.fct_orders | → | (read by ad-hoc) | 243 | 12 | 2026-05-03 11:42:08 |
| dbt_staging.stg_orders | → | dbt_marts.fct_orders | 31 | 1 (dbt) | 2026-05-03 04:00:12 |

The `to` value (read by ad-hoc) marks SELECTs that read a table without writing to any target.
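
A rendering pass for the adjacency table could be sketched as follows. The edge dictionary keys (`from`, `to`, `queries`, `distinct_users`, `last_seen`) are hypothetical, since the layout of `lineage.json` is not documented here; a `to` of `None` stands for a read with no write target.

```python
from typing import Iterable, Mapping

HEADER = "| from | → | to | queries | distinct users | last seen |"
SEPARATOR = "|---|---|---|---|---|---|"


def render_adjacency(edges: Iterable[Mapping[str, object]]) -> str:
    """Render edges as a Markdown table, busiest edges first."""
    lines = [HEADER, SEPARATOR]
    for edge in sorted(edges, key=lambda e: e["queries"], reverse=True):
        target = edge["to"] if edge["to"] is not None else "(read by ad-hoc)"
        lines.append(
            f"| {edge['from']} | → | {target} | {edge['queries']} "
            f"| {edge['distinct_users']} | {edge['last_seen']} |"
        )
    return "\n".join(lines)
```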

Mermaid (only with --output mermaid):

flowchart LR
    raw_orders.events --> dbt_staging.stg_orders
    dbt_staging.stg_orders --> dbt_marts.fct_orders
    dbt_marts.fct_orders -. ad-hoc 243× .-> ((readers))
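
The Mermaid output could be produced from the same edges with a sketch like this one. The edge keys are again hypothetical, and the line shapes simply mirror the sample above rather than being checked against a Mermaid renderer.

```python
from typing import Iterable, Mapping


def render_mermaid(edges: Iterable[Mapping[str, object]]) -> str:
    """Emit a left-to-right flowchart in the same shape as the sample."""
    lines = ["flowchart LR"]
    for edge in edges:
        if edge["to"] is None:
            # Reads without a write target become a dotted edge to a shared node.
            lines.append(f"    {edge['from']} -. ad-hoc {edge['queries']}× .-> ((readers))")
        else:
            lines.append(f"    {edge['from']} --> {edge['to']}")
    return "\n".join(lines)
```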

Footer: "Mined N queries from <start> to <end>. Found M edges across K tables. P parse errors. Earliest STL row: <timestamp> (warn if --since exceeds retention)."

Anti-patterns

NEVER parse SQL with regex — multi-statement SQL, CTEs, comments, and quoted identifiers all break naive matching. Use the bundled sqlglot helper script.
NEVER trust results past STL retention (~2-5 days provisioned) without warning — older rows silently disappear, biasing edge counts and "last seen" timestamps.
NEVER mix STL_QUERY (provisioned) and SYS_QUERY_HISTORY (serverless) in one report — different schemas; pick one based on version() detection.
NEVER omit the truncated-SQL warning when LISTAGG hits VARCHAR(65535) — partial parses produce phantom edges. Surface the truncation visibly.
NEVER hide parse_errors — list them verbatim so the user can audit what was missed and decide whether the result is trustworthy.
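
The first anti-pattern is easy to demonstrate with the standard library alone. The deliberately naive pattern below is written for this example (it is not taken from the skill) and reports both a CTE name and a table that only appears inside a comment.

```python
import re

# Deliberately naive: grab whatever identifier follows FROM or JOIN.
NAIVE_TABLE_RE = re.compile(r"\b(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)

SAMPLE_SQL = """
WITH recent AS (SELECT * FROM dbt_staging.stg_orders)
-- legacy: SELECT * FROM old_schema.orders
SELECT * FROM recent
"""


def naive_tables(sql: str) -> list:
    """Return every identifier the naive regex believes is a table."""
    return NAIVE_TABLE_RE.findall(sql)


print(naive_tables(SAMPLE_SQL))
# ['dbt_staging.stg_orders', 'old_schema.orders', 'recent']
```

Only the first entry is a real source table: `old_schema.orders` lives in a comment and `recent` is a CTE, which is the kind of phantom edge a real parser avoids.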

Errors

| Condition | Behavior |
|---|---|
| STL access denied | `_error: stl_access_denied: <verbatim>` |
| `--since` earlier than earliest STL row | warn, proceed |
| Helper script missing | `_error: helper_script_missing` |
| Helper script crashed | `_error: parser_failed: <stderr>` |
| `uv` not available | `_error: uv_unavailable` |
| Zero queries in window | not an error; render empty result and hint to widen the window |
| More than 50% of queries failed to parse | warn loudly, proceed with what parsed |
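
The last row of the table amounts to a simple ratio check, sketched below. The function name, its inputs (a total query count plus the list of parse errors), and the warning wording are assumptions about how the helper's output might be consumed.

```python
from typing import List, Optional


def parse_failure_warning(total_queries: int, parse_errors: List[str]) -> Optional[str]:
    """Return a loud warning when more than half the queries failed to parse."""
    if total_queries == 0:
        return None  # an empty window is not an error
    ratio = len(parse_errors) / total_queries
    if ratio > 0.5:
        return (
            f"WARNING: {len(parse_errors)} of {total_queries} queries "
            f"({ratio:.0%}) failed to parse; the lineage is partial."
        )
    return None
```

The individual `parse_errors` entries would still be listed verbatim alongside any warning.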

More from this repository

redshift-explore

kouko/redshift-comment-mcp

Interactive walkthrough for an unfamiliar Redshift cluster (schema → table → column), picked by reading comments. Hands off to /redshift-profile. Use when user doesn't know where to start in a cluster. Do NOT use when user already knows the table.column (use /redshift-profile directly) or in non-interactive contexts. Triggers: /redshift-explore / browse Redshift / where do I look / 找 cluster / 從哪開始 / 探検 / ガイド付き探索.

2026-05-10 · 1

redshift-grep-columns

kouko/redshift-comment-mcp

Cross-table column search — finds every column whose name or comment matches a keyword across all tables in one (or all) schemas via one schema-wide MCP call per schema. Use when user wants to find FK / shared-key columns across many tables (e.g. before composing a JOIN), or to audit column-naming consistency. Do NOT use for a single known table (use search_columns directly), single-column lookup (use get_column_comment), or table-name search (use /redshift-grep-tables). Triggers: /redshift-grep-columns / find column / search columns across tables / where is foo column / 跨表找欄位 / 哪些表有 foo / カラム横断検索 / カラム名検索.

2026-05-10 · 1

redshift-grep-tables

kouko/redshift-comment-mcp

Cross-schema table search — finds every table whose name or comment matches a keyword across all schemas via one cluster-wide MCP call. Use when user is looking for a table by topic but unsure which schema (e.g. "where is the orders fact table"), or auditing table-naming consistency. Do NOT use when schema is already known (use search_tables directly), for column-level search (use /redshift-grep-columns), or for general schema browsing (use /redshift-explore). Triggers: /redshift-grep-tables / find table / search tables / which schema has / 哪個 schema 有 / 跨 schema 找表 / テーブル横断検索 / テーブル名検索.

2026-05-10 · 1

redshift-profile

kouko/redshift-comment-mcp

Profile a Redshift column — cardinality, top-N values, null rate, min/max, plus comment. Read-only. Use when about to write CREATE TABLE or dbt schema.yml based on column assumptions, or to check whether a column is an enum. Do NOT use for full row counts (use execute_sql), schema/table search (use search_columns), or free-text columns where top-100 is noise. Triggers: /redshift-profile / profile column / distinct values / enum / 欄位分布 / カラムプロファイル / 値の分布.

2026-05-10 · 1

redshift-setup

kouko/redshift-comment-mcp

Configure a Redshift connection profile for the redshift-comment-mcp plugin via chat — password stored in OS keychain via system dialog or terminal handoff (never in chat history), connection verified. Named form `/redshift-setup <name>` adds a non-default profile and offers to activate it. Use when setting up first Redshift cluster or adding another. Do NOT use for switching between already-configured profiles (use /redshift-switch-profile), password-only changes (use set-password CLI), or profile deletion (use delete-profile). Triggers: /redshift-setup / set up Redshift / add cluster / 設定 Redshift / 新增 cluster / 接続を設定 / プロファイル.

2026-05-08 · 1

redshift-switch-profile

kouko/redshift-comment-mcp

Switch the active Redshift profile by flipping the active-profile pointer file at ~/.config/redshift-comment-mcp/active-profile (no host / user / password re-entry). Verifies connection before declaring success. Single-profile users get a friendly bow-out pointing to /redshift-setup. Use when switching between 2+ already-configured Redshift clusters. Do NOT use for setting up a new profile (use /redshift-setup), changing passwords (use set-password CLI), or single-profile installs (skill bows out). Triggers: /redshift-switch-profile / switch cluster / change Redshift / 切換 cluster / 切換 profile / クラスタ切替 / 接続を切替.

2026-05-08 · 1

Useful for (SOC): Database Architects, Computer and Mathematical Occupations, 15-1243, L4
