bigquery-sql

Name: Bigquery Sql
Author: SignalPilot-Labs

stars:462

forks:22

updated:April 14, 2026 at 07:38

SKILL.md


name	bigquery-sql
description	BigQuery-specific SQL patterns: UNNEST for array expansion, STRUCT, ARRAY_AGG, DATE_DIFF/DATE_ADD, backtick-quoted table references, EXCEPT/REPLACE in SELECT, approximate aggregation, partitioned and wildcard tables.
type	skill

BigQuery SQL Skill

1. Table References — Always Backtick-Quote

-- Full table reference
SELECT * FROM `project.dataset.table`;

-- Can omit project if using the default project
SELECT * FROM `dataset.table`;

2. Array Expansion — Use UNNEST

-- Explode an array column to rows
SELECT id, item
FROM `project.dataset.table`,
UNNEST(array_col) AS item;

-- UNNEST with offset (position)
SELECT id, item, pos
FROM `project.dataset.table`,
UNNEST(array_col) AS item WITH OFFSET AS pos;

-- UNNEST a literal array
SELECT * FROM UNNEST([1, 2, 3]) AS num;
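The comma join shown above behaves like a CROSS JOIN, so rows whose array is empty or NULL drop out of the result. A minimal sketch of keeping those rows with LEFT JOIN UNNEST follows; the `orders` CTE and its columns are hypothetical inline data, not part of the original skill.

```sql
-- Hypothetical inline data; replace with a real table reference.
WITH orders AS (
  SELECT 1 AS id, ['apple', 'pear'] AS items
  UNION ALL
  SELECT 2 AS id, ARRAY<STRING>[] AS items
)
SELECT o.id, item
FROM orders AS o
LEFT JOIN UNNEST(o.items) AS item;
-- id 2 is kept with item = NULL; the comma join would drop it.
```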

3. Date Functions

-- Add/subtract time
DATE_ADD(order_date, INTERVAL 7 DAY)
DATE_ADD(CURRENT_DATE(), INTERVAL -1 MONTH)

-- Difference between dates
DATE_DIFF(end_date, start_date, DAY)
DATE_DIFF(end_date, start_date, MONTH)

-- Truncate to period
DATE_TRUNC(event_date, MONTH)
TIMESTAMP_TRUNC(event_ts, HOUR)

-- Current date/time
CURRENT_DATE()
CURRENT_TIMESTAMP()
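The fragments above can be combined into one self-contained query. This sketch uses literal dates (chosen only for illustration) to show two behaviors that are easy to miss: DATE_DIFF counts the unit boundaries crossed rather than elapsed whole units, and month arithmetic clamps to the last day of the target month.

```sql
SELECT
  DATE_DIFF(DATE '2024-02-01', DATE '2024-01-31', DAY)   AS days_apart,    -- 1
  DATE_DIFF(DATE '2024-02-01', DATE '2024-01-31', MONTH) AS months_apart,  -- 1 (one month boundary crossed)
  DATE_SUB(DATE '2024-03-31', INTERVAL 1 MONTH)          AS month_earlier, -- 2024-02-29
  DATE_TRUNC(DATE '2024-03-31', MONTH)                   AS month_start;   -- 2024-03-01
```

DATE_SUB is shown here as an alternative to DATE_ADD with a negative interval; both forms are accepted.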

4. SELECT EXCEPT and REPLACE

-- All columns except one
SELECT * EXCEPT (col_to_remove) FROM `dataset.table`;

-- Replace a column value inline
SELECT * REPLACE (UPPER(name) AS name) FROM `dataset.table`;
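EXCEPT and REPLACE can also be combined in one SELECT list, with EXCEPT written first. A small sketch, assuming a hypothetical `users` CTE:

```sql
-- Hypothetical inline data for illustration.
WITH users AS (
  SELECT 1 AS id, 'ada' AS name, 'secret' AS password_hash
)
SELECT * EXCEPT (password_hash) REPLACE (UPPER(name) AS name)
FROM users;
-- Returns columns id and name, with name = 'ADA'.
```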

5. STRUCT and ARRAY_AGG

-- Create a STRUCT
SELECT STRUCT(id, name) AS person FROM `dataset.table`;

-- Aggregate rows into an array
SELECT department, ARRAY_AGG(employee_name) AS employees
FROM `dataset.employees`
GROUP BY department;

-- Aggregate into array of structs
SELECT ARRAY_AGG(STRUCT(id, name)) AS records FROM `dataset.table`;
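ARRAY_AGG raises an error when an array in the final query result contains a NULL element, and its element order is undefined unless you ask for one. The sketch below, using hypothetical inline data, shows the IGNORE NULLS and ORDER BY modifiers that address both points.

```sql
-- Hypothetical inline data for illustration.
WITH employees AS (
  SELECT 'eng' AS department, 'Bo' AS employee_name UNION ALL
  SELECT 'eng', 'Al' UNION ALL
  SELECT 'eng', CAST(NULL AS STRING)
)
SELECT
  department,
  ARRAY_AGG(employee_name IGNORE NULLS ORDER BY employee_name) AS employees
FROM employees
GROUP BY department;
-- eng -> ['Al', 'Bo']
```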

6. Approximate Aggregation (for large tables)

-- Approximate distinct count (faster for large tables)
APPROX_COUNT_DISTINCT(user_id)

-- Approximate quantiles
APPROX_QUANTILES(value, 100)[OFFSET(50)]  -- median
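To see how the approximate functions compare with exact ones, the following sketch runs both over a generated series, so it needs no table. APPROX_QUANTILES(x, n) returns n + 1 boundaries, which is why OFFSET(50) of a 100-quantile array is the median.

```sql
SELECT
  COUNT(DISTINCT x)                    AS exact_distinct,
  APPROX_COUNT_DISTINCT(x)             AS approx_distinct,
  APPROX_QUANTILES(x, 100)[OFFSET(50)] AS approx_median,
  APPROX_QUANTILES(x, 4)               AS quartile_boundaries  -- 5 elements: min, Q1, median, Q3, max
FROM UNNEST(GENERATE_ARRAY(1, 1000)) AS x;
```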

7. Partitioned Tables

When querying partitioned tables, always filter on the partition column to avoid full-table scans:

-- Partition on _PARTITIONDATE (pseudo-column, available only on ingestion-time partitioned tables)
WHERE _PARTITIONDATE >= '2024-01-01'

-- Partition on a date column
WHERE event_date BETWEEN '2024-01-01' AND '2024-12-31'
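Placed in a full query, the column-based filter might look like the sketch below. The table `my_project.my_dataset.events` and its `event_date` partition column are hypothetical names used only for illustration.

```sql
-- Hypothetical table, assumed to be partitioned on event_date.
SELECT event_date, COUNT(*) AS events
FROM `my_project.my_dataset.events`
WHERE event_date BETWEEN DATE '2024-01-01' AND DATE '2024-01-31'
GROUP BY event_date
ORDER BY event_date;
```

Comparing the partition column against constant values, as here, is the simplest form of filter for BigQuery to prune on.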

8. Wildcard Tables (date-sharded)

-- Query all date-sharded tables matching a prefix
SELECT * FROM `project.dataset.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20241231';
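_TABLE_SUFFIX is a STRING, so it is often convenient to convert it to a DATE for grouping. A sketch, assuming hypothetical shards named `events_YYYYMMDD`:

```sql
-- Hypothetical date-sharded tables: events_20240101, events_20240102, ...
SELECT
  PARSE_DATE('%Y%m%d', _TABLE_SUFFIX) AS shard_date,
  COUNT(*) AS row_count
FROM `my_project.my_dataset.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240131'
GROUP BY shard_date
ORDER BY shard_date;
```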

9. String Functions

REGEXP_EXTRACT(col, r'pattern')          -- extract first match
REGEXP_REPLACE(col, r'pattern', 'repl')  -- replace matches
SPLIT(col, ',')[SAFE_OFFSET(0)]          -- split, access by index
TRIM(col) / LTRIM(col) / RTRIM(col)
FORMAT('%s-%d', str_col, int_col)        -- printf-style formatting
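The functions above can be tried on literals without any table. The input strings in this sketch are made up for illustration.

```sql
SELECT
  REGEXP_EXTRACT('order-1234', r'(\d+)')  AS digits,        -- '1234'
  REGEXP_REPLACE('a  b   c', r'\s+', ' ') AS squeezed,      -- 'a b c'
  SPLIT('a,b,c', ',')[SAFE_OFFSET(0)]     AS first_part,    -- 'a'
  SPLIT('a,b,c', ',')[SAFE_OFFSET(5)]     AS out_of_range,  -- NULL instead of an error
  TRIM('  hi  ')                          AS trimmed,       -- 'hi'
  FORMAT('%s-%d', 'x', 7)                 AS formatted;     -- 'x-7'
```

SAFE_OFFSET returns NULL for an out-of-range index, whereas OFFSET would raise an error.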

10. Common Anti-Patterns to Avoid

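Do NOT use = NULL; the comparison never evaluates to TRUE. Use IS NULL.
Do NOT forget to filter partitioned tables; an unfiltered query scans every partition, and on-demand pricing bills by bytes scanned.
Do NOT use COUNT(DISTINCT ...) on huge tables when an approximate answer is acceptable; use APPROX_COUNT_DISTINCT.
Always backtick-quote fully qualified table names; quoting is required for wildcard tables and is the safest form when a project ID contains hyphens.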
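The first anti-pattern is easy to verify on a literal array. In this sketch, `x = NULL` evaluates to NULL for every row, so nothing is counted.

```sql
SELECT
  COUNTIF(x = NULL)  AS equals_null,  -- 0: the comparison evaluates to NULL, never TRUE
  COUNTIF(x IS NULL) AS is_null       -- 1
FROM UNNEST([1, NULL, 3]) AS x;
```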

11. Benchmark Patterns

STRING_AGG: Use STRING_AGG(col, ',' ORDER BY col) for string aggregation (not GROUP_CONCAT).
SAFE_DIVIDE / SAFE_CAST: Use to avoid division-by-zero errors and cast failures; both return NULL instead of raising an error.
IF (not IIF): BigQuery supports IF(condition, true_val, false_val), which is often cleaner than CASE WHEN for simple conditions. IIF is not a BigQuery function.
GENERATE_DATE_ARRAY / GENERATE_TIMESTAMP_ARRAY: For date spine generation.
Numeric precision: BigQuery's FLOAT64 can lose precision. Cast to NUMERIC or apply ROUND() only when the question asks for it.
INFORMATION_SCHEMA: SELECT * FROM dataset.INFORMATION_SCHEMA.COLUMNS for metadata queries; useful when schema_overview is insufficient.
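Several of these patterns tend to appear together. The sketch below builds a date spine with GENERATE_DATE_ARRAY, left-joins sparse data onto it, and uses SAFE_CAST, SAFE_DIVIDE, and IF so that bad or missing values become NULL rather than errors. The `raw` CTE and its columns are hypothetical.

```sql
-- Hypothetical sparse input with string-typed numbers.
WITH raw AS (
  SELECT DATE '2024-01-01' AS d, '10' AS clicks, '100' AS views UNION ALL
  SELECT DATE '2024-01-03', 'n/a', '0'
),
spine AS (
  -- One row per calendar day, including days with no data.
  SELECT d
  FROM UNNEST(GENERATE_DATE_ARRAY(DATE '2024-01-01', DATE '2024-01-03')) AS d
)
SELECT
  s.d,
  SAFE_CAST(r.clicks AS INT64) AS clicks,  -- 'n/a' becomes NULL
  SAFE_DIVIDE(SAFE_CAST(r.clicks AS INT64), SAFE_CAST(r.views AS INT64)) AS ctr,
  IF(r.d IS NULL, 'missing', 'present') AS status
FROM spine AS s
LEFT JOIN raw AS r ON r.d = s.d
ORDER BY s.d;
```

2024-01-02 appears with status 'missing' even though `raw` has no row for it, which is the point of the spine.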

12. Spider2 BigQuery Patterns

Default project: spider2-public-data. Table references: `spider2-public-data.{dataset}.{table}` (backtick-quoted, since the project ID contains hyphens).
StackOverflow tags: Stored as pipe-delimited strings in tags column (e.g., |python|python-2.7|). To filter for Python 2 specific questions (excluding Python 3):
```
WHERE REGEXP_CONTAINS(tags, r'python-2') AND NOT REGEXP_CONTAINS(tags, r'python-3')
```
Date columns: Many BQ tables store dates as TIMESTAMP or DATE. Always check the actual type with describe_table.
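Large tables: Use partition filters during exploration and select only the columns you need. LIMIT alone does not reduce the bytes scanned on a non-clustered table, so avoid SELECT * on tables with >1M rows.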
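The substring regex above would also match any tag that merely contains python-2. Where exact tag boundaries matter, one possible alternative is to split on the pipe and test each tag. This is a sketch on hypothetical inline rows (the `ironpython-2` tag is invented to show the difference), not the skill's own method.

```sql
-- Hypothetical sample rows in the pipe-delimited format described above.
WITH questions AS (
  SELECT 1 AS id, '|python|python-2.7|' AS tags UNION ALL
  SELECT 2, '|python|python-3.x|' UNION ALL
  SELECT 3, '|ironpython-2|'
)
SELECT id
FROM questions
WHERE EXISTS (
  SELECT 1 FROM UNNEST(SPLIT(tags, '|')) AS tag
  WHERE STARTS_WITH(tag, 'python-2')
)
AND NOT EXISTS (
  SELECT 1 FROM UNNEST(SPLIT(tags, '|')) AS tag
  WHERE STARTS_WITH(tag, 'python-3')
);
-- Returns id 1 only; the substring regex would also return id 3.
```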

related-skills.json

same repository

notion-context.md

from "SignalPilot-Labs/SignalPilot"

Gather business context from Notion before dbt builds. Searches pages, extracts definitions/decisions/constraints, writes structured context for the build agent and notion-verify subagent.

2026-05-19, stars: 462

dbt-workflow.md

from "SignalPilot-Labs/SignalPilot"

Load at Step 1 before exploring the project. Covers Notion context gathering, output shape inference, incremental model handling, and what to trust in YML.

2026-05-16, stars: 462

dbt-debugging.md

from "SignalPilot-Labs/SignalPilot"

Load when dbt run or dbt parse fails. Covers YML duplicate patches, ref errors, passthrough model warnings, current_date fixes, DuckDB error messages, and zero-row diagnosis.

2026-04-20, stars: 462

dbt-write.md

from "SignalPilot-Labs/SignalPilot"

Load at Step 4 when writing SQL models. Covers column naming, type preservation, JOIN defaults, lookup joins, sibling models, materialization, packages, and filtering rules.

2026-04-20, stars: 462

duckdb-sql.md

from "SignalPilot-Labs/SignalPilot"

Load when hitting DuckDB syntax errors or writing DuckDB-specific SQL. Covers gotchas that differ from PostgreSQL/MySQL.

2026-04-20, stars: 462

sql-workflow.md

from "SignalPilot-Labs/SignalPilot"

Use this skill before writing any SQL query. Covers: output shape inference (cardinality clues from the question), efficient schema exploration, iterative CTE-based query building, structured verification loop (row count, NULL audit, fan-out check, sample inspection), error recovery protocol, saving output to result.sql and result.csv, turn budget management, and common benchmark traps.

2026-04-14, stars: 462

package.json

"author": "SignalPilot-Labs"

"repository": "SignalPilot-Labs/SignalPilot"
