تشغيل أي مهارة في Manus بنقرة واحدة

$pwd:

sql-workflow

Name: Sql Workflow
Author: SignalPilot-Labs

// Use this skill before writing any SQL query. Covers: output shape inference (cardinality clues from the question), efficient schema exploration, iterative CTE-based query building, structured verification loop (row count, NULL audit, fan-out check, sample inspection), error recovery protocol, saving output to result.sql and result.csv, turn budget management, and common benchmark traps.

تشغيل في Manus

$ git log --oneline --stat

stars:٤٦٢

forks:٢٢

updated:١٤ أبريل ٢٠٢٦ في ٠٧:٣٨

SKILL.md

readonly

related-skills.json

نفس المستودع

notion-context.md

from "SignalPilot-Labs/SignalPilot"

Gather business context from Notion before dbt builds. Searches pages, extracts definitions/decisions/constraints, writes structured context for the build agent and notion-verify subagent.

2026-05-19462

dbt-workflow.md

from "SignalPilot-Labs/SignalPilot"

Load at Step 1 before exploring the project. Covers Notion context gathering, output shape inference, incremental model handling, and what to trust in YML.

2026-05-16462

dbt-debugging.md

from "SignalPilot-Labs/SignalPilot"

Load when dbt run or dbt parse fails. Covers YML duplicate patches, ref errors, passthrough model warnings, current_date fixes, DuckDB error messages, and zero-row diagnosis.

2026-04-20462

dbt-write.md

from "SignalPilot-Labs/SignalPilot"

Load at Step 4 when writing SQL models. Covers column naming, type preservation, JOIN defaults, lookup joins, sibling models, materialization, packages, and filtering rules.

2026-04-20462

duckdb-sql.md

from "SignalPilot-Labs/SignalPilot"

Load when hitting DuckDB syntax errors or writing DuckDB-specific SQL. Covers gotchas that differ from PostgreSQL/MySQL.

2026-04-20462

bigquery-sql.md

from "SignalPilot-Labs/SignalPilot"

BigQuery-specific SQL patterns: UNNEST for array expansion, STRUCT, ARRAY_AGG, DATE_DIFF/DATE_ADD, backtick-quoted table references, EXCEPT/REPLACE in SELECT, approximate aggregation, partitioned and wildcard tables.

2026-04-14462

package.json

"author": "SignalPilot-Labs"

"repository": "SignalPilot-Labs/SignalPilot"

فتح مستودع GitHub عرض مستودعات المنشئ

$ install --global

$ download --local

تشغيل في Manus

$ useful --forSOC

مطوّرو البرمجياتمهن الحاسوب والرياضيات15-1252L4

name	sql-workflow
description	Use this skill before writing any SQL query. Covers: output shape inference (cardinality clues from the question), efficient schema exploration, iterative CTE-based query building, structured verification loop (row count, NULL audit, fan-out check, sample inspection), error recovery protocol, saving output to result.sql and result.csv, turn budget management, and common benchmark traps.
type	skill

SQL Workflow Skill

1. Schema Exploration — Do This First

Before writing any SQL, understand the data:

Read local schema files first (if schema/ directory exists in workdir):
- schema/DDL.csv — all CREATE TABLE statements (if it exists)
- schema/{table_name}.json — column names, types, descriptions, sample values Reading these files costs zero tool calls and gives you table structure + sample data. Only call MCP tools for information not in the local files (e.g., row counts, live data exploration).
Call list_tables to get all schemas and tables — only if no local schema files exist or you need row counts.
Call describe_table on the tables that seem relevant to the question (only if JSON files lack detail)
Call explore_column on categorical columns to see distinct values (for filtering/grouping)
Call find_join_path if you need to join tables and the relationship is unclear

Stop exploring after 3-5 tool calls. Write SQL based on what you've found.

2. Output Shape Inference — Before Writing SQL

Read the task question carefully for cardinality clues:

"for each X" → GROUP BY X, one output row per X
"top N" / "top 5" → LIMIT N or QUALIFY RANK() <= N
"total / sum / average" → single row aggregate
"list all" → detail rows, no aggregation
"how many" → COUNT, result is 1 row 1 column

Write a comment at the top of your SQL:

-- EXPECTED: <row count estimate> rows because <reason from question>

Critical checks:

If the question asks for a single number, the result MUST be 1 row × 1 column
If the question says "how many", verify the CSV has exactly 1 row with a COUNT value
If "top N" appears in the question, verify the CSV has at most N rows

3. Iterative Query Building — Build Bottom-Up

Do NOT write a 50-line query and run it all at once:

Write the innermost subquery or first CTE first
Run it standalone with query_database — verify row count and sample values
Add the next CTE, verify again
Continue until the full query is built

Example incremental pattern:

-- Step 1: verify source
SELECT COUNT(*) FROM orders WHERE status = 'completed';

-- Step 2: verify join partner cardinality
SELECT COUNT(*), COUNT(DISTINCT customer_id) FROM orders;

-- Step 3: build first CTE, verify
WITH order_totals AS (
  SELECT customer_id, SUM(amount) AS total
  FROM orders
  GROUP BY customer_id
)
SELECT COUNT(*), COUNT(DISTINCT customer_id) FROM order_totals;

-- Step 4: add final aggregation

4. Execution and Structured Verification

mcp__signalpilot__query_database
  connection_name="<task_connection_name>"
  sql="SELECT ..."

After executing, run these checks IN ORDER before saving:

Row count sanity: Does 0 rows make sense? Does 1M rows make sense for a "top 10" question?
Column count: Does the result have the right number of columns for the question?
NULL audit: For each key column — unexpected NULLs indicate wrong JOINs:
```
SELECT COUNT(*) - COUNT(col) AS nulls FROM (your_query) t
```
Sample inspection: Look at 5 rows — are values in expected ranges? Do string columns have meaningful values (not join keys)?
Fan-out check: If JOINing, compare COUNT(*) vs COUNT(DISTINCT primary_key):
```
SELECT COUNT(*) AS total_rows, COUNT(DISTINCT <pk>) AS unique_keys FROM (your_query) t;
```
If they differ, you have duplicate rows from a fan-out JOIN.
Re-read the question: Does your output actually answer what was asked?

5. Error Recovery Protocol

Syntax error: Use validate_sql before query_database to catch errors without burning a query turn
Wrong results: Do NOT just re-run the same query. Diagnose: which JOIN is wrong? Which filter is too aggressive?

Zero rows: Binary-search your WHERE conditions — remove them one at a time to find the culprit:

SELECT COUNT(*) FROM table WHERE cond_1;             -- still same? keep it
SELECT COUNT(*) FROM table WHERE cond_1 AND cond_2;  -- drops? cond_2 is culprit

Too many rows: Check for fan-out (duplicate join keys) or missing GROUP BY
CTE debugging: Use debug_cte_query to run each CTE independently and find which step breaks

6. Saving Output

Once you have the correct result:

Write final SQL to result.sql:

Write tool: path="result.sql", content="<your SQL query>"

Write the result as CSV to result.csv:
```
Write tool: path="result.csv", content="col1,col2,...\nval1,val2,..."
```
- Always include a header row with column names
- Use comma as delimiter
- Quote string values that contain commas or newlines

7. Turn Budget Management

First 3 turns: Schema exploration only (schema_overview, describe_table on 2-3 tables, explore_column on key categorical columns). STOP exploring.
Turns 4 through (N-3): Write query iteratively — execute and verify each step.
Last 3 turns: Finalize result.sql and result.csv. If you have a working query, SAVE IT NOW — do not keep iterating.

If your query works and passes all verification checks, SAVE IMMEDIATELY — do not continue exploring "just in case".

8. Common Benchmark Traps

Rounding: Do NOT round unless the question explicitly asks for rounded values. The evaluator uses tolerance-based comparison — full precision is always safer.
Column naming: Match the question's phrasing exactly. If the question says "total revenue", name the column total_revenue, not sum_revenue or revenue_total.
CSV format: No trailing newline, no BOM, comma delimiter, double-quote strings containing commas.
Empty result: If the correct answer is 0 or empty, write a CSV with just the header row (or header + "0").
Date/time format in CSV: Use ISO 8601 (YYYY-MM-DD) unless the question specifies otherwise.
String case in CSV: Preserve the case from the database — do not uppercase/lowercase unless the question explicitly asks.
Fan-out from JOINs: Always check COUNT(*) vs COUNT(DISTINCT key) after every JOIN
Wrong NULL handling: Use IS NULL / IS NOT NULL, not = NULL
Date format mismatch: Check the actual format stored in the column with explore_column
Case sensitivity: Use the correct case-insensitive function for your backend
Interpretation errors: Before saving, re-read the original question. Verify:
- Filter conditions match domain values (check with explore_column if unsure)
- "Excluding X" means the right thing (NOT IN vs EXCEPT vs WHERE NOT)
- Metrics match domain definitions (e.g., "scored points" in F1 = points > 0, not just participated)

sql-workflow

المزيد من هذا المستودع

المزيد من هذا المستودع

SQL Workflow Skill

1. Schema Exploration — Do This First

2. Output Shape Inference — Before Writing SQL

3. Iterative Query Building — Build Bottom-Up

4. Execution and Structured Verification

5. Error Recovery Protocol

6. Saving Output

7. Turn Budget Management

8. Common Benchmark Traps

SQL Workflow Skill

1. Schema Exploration — Do This First

2. Output Shape Inference — Before Writing SQL

3. Iterative Query Building — Build Bottom-Up

4. Execution and Structured Verification

5. Error Recovery Protocol

6. Saving Output

7. Turn Budget Management

8. Common Benchmark Traps