Run any Skill in Manus with one click

bigquery

Stars1,696

Forks172

UpdatedJune 15, 2026 at 20:40

Query the configured BigQuery warehouse for analytics data. Use when the user asks for warehouse SQL, BigQuery tables, Amplitude-in-BigQuery events, or a metric/table that the data dictionary says lives in BigQuery.

Installation

Install with Codex or Claude Copy this prompt, paste it into Codex, Claude, or another assistant, and let it review the skill page and install it for you.

Run Skill in Manus

Source

BuilderIO

BuilderIO/agent-native

View GitHub Repository View Creator Repositories

Download

Run Skill in Manus

Related occupationsSOC

Based on SOC occupation classification

Software DevelopersComputer and Mathematical Occupations·SOC 15-1252

SKILL.md

readonly

BigQuery

CRITICAL: BigQuery is a Native Agent Tool

bigquery is available directly in the agent's tool list as a native callable tool.

If you see bigquery in your available tools — call it directly with your SQL. Do not use HTTP workarounds, web-request hacks, or scripts as a substitute.
The server/lib/bigquery.ts description below is the underlying implementation. It does not mean BigQuery is only accessible via terminal commands or scripts.
When uncertain if the tool works, call it — don't reason your way to "it won't work". Empirically test by calling the tool.

Data-Dictionary-First Discipline

Before writing any SQL, verify the metric definition and exact table/column names.

Check the injected <data-dictionary> block and call list-data-dictionary first.
Use search-bigquery-schema to confirm exact dataset, table, and column names.
Only write SQL after you know the correct table and columns. Do not guess.

Why this matters: column names commonly differ from spec (see "Column Name Differences" below). Querying the wrong column returns wrong numbers silently. Confirming with search-bigquery-schema or INFORMATION_SCHEMA.COLUMNS before writing the query is always faster than debugging a wrong result.

Source Of Truth Order

Before writing SQL, use the highest-confidence source available:

The injected <data-dictionary> block and list-data-dictionary.
Existing dashboard SQL or saved analyses that already answer the same metric.
search-bigquery-schema metadata for exact datasets, tables, and columns.
A concise user clarification when the business meaning cannot be inferred.

Do not invent dataset, table, or column names. BigQuery is a warehouse, not one table; BIGQUERY_PROJECT_ID is only the default project.

Table Priority Order (CRITICAL)

Always prefer dbt tables before raw source tables:

dbt_mart.* — first choice for business-level data (deals, contracts, subscriptions, customers, companies).
dbt_analytics.* — first choice for reporting views and aggregated metrics.
dbt_staging_bigquery.* — for raw staged data (pageviews, signups) when dbt_mart/analytics don't have it.
dbt_intermediate.* — for joins/transforms when no mart or analytics table exists.
analytics.* / amplitude.* / raw tables — last resort only, when no dbt equivalent exists.

Never query raw source tables if a dbt model covers the same data. dbt models are deduplicated, tested, and have canonical column names. Raw tables may have duplicates, schema drift, and inconsistent naming.

Avoid dbt_dev.* — development schema, excluded globally.

Always Bound Queries by Date

Always include a date or timestamp filter in any query that touches large event or pageview tables. Unbounded queries scan full table history, hit byte limits, and return unusable result sets. Use the most restrictive date range that still answers the question.

-- Always add: WHERE event_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
-- Or for TIMESTAMP columns: WHERE event_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)

Actions

Action	Use
`data-source-status --key bigquery`	Check whether BigQuery credentials and project are configured.
`list-data-dictionary --search <topic>`	Find canonical metric/table definitions before SQL.
`search-bigquery-schema`	List datasets, list tables in a dataset, or describe table columns.
`bigquery --sql "<sql>"`	Run a real warehouse query after table/column names are known.

For direct Amplitude/PostHog/Mixpanel product-event data (outside the warehouse copy), there is no first-class action — call provider-api-catalog / provider-api-docs then provider-api-request (provider amplitude, posthog, or mixpanel), and aggregate in run-code for corpus-wide event analysis.

Schema Discovery

Use search-bigquery-schema instead of asking the user to manually print field lists:

pnpm action search-bigquery-schema
pnpm action search-bigquery-schema --dataset=product_events
pnpm action search-bigquery-schema --table=product_events.events
pnpm action search-bigquery-schema --dataset=product_events --search=rootOrganizationId

When describing a table, copy exact field names from the action result. If a field has a nested path such as event_properties.rootOrganizationId, use the proper BigQuery JSON or STRUCT access pattern for that field type.

Dictionary Trust

Approved dictionary entries are canonical. Human-authored but unreviewed entries are usable with light verification. AI-generated unapproved entries are suggestions only: verify the table and columns with search-bigquery-schema and prefer saving a reviewed dictionary update once the meaning is clear.

Common Patterns

Use @project.dataset.table when you want the configured project placeholder.
Use @app_events only for the optional app-events table; it is not the whole warehouse and it is not a connection-status signal.
For metrics or dashboard panels, run at least one real data query before presenting numbers.
An unknown table or column error is a normal, recoverable signal — not a stopping point. Use search-bigquery-schema (or INFORMATION_SCHEMA) to get the exact datasets, tables, and columns, correct the query based on the error, and run it again. Iterate until it succeeds or you have made a few corrective attempts; only surface to the user if it still fails or the error is non-recoverable (missing credentials, permission, quota).

Column Name Differences (Common Bug Sources)

Spec Column	Actual Column	Table
`first_pageview_date`	`created_date` (TIMESTAMP)	first_pageviews
`channel` (pageviews)	`first_touch_channel`	all_pageviews
`referrer`	`c_referrer`	all_pageviews
`referrer_channel`	`session_channel`	all_pageviews
`user_create_date`	`user_create_d`	product_signups
`deal_stage`	`stage_name`	dim_deals
`deal_amount`	`amount`	dim_deals

Always verify with search-bigquery-schema before writing SQL against an unfamiliar table.

Join Paths — Identity Stitching Rule

Always match on BOTH a stable id AND email when joining contacts/users across tables:

-- CORRECT: both user_id and email
ON signups.user_id = contacts.user_id
AND LOWER(signups.email) = LOWER(contacts.email)

-- WRONG: id alone — IDs can be reassigned or recycled
ON signups.user_id = contacts.user_id

IDs can be reassigned after deletes/merges. Email alone over-matches shared addresses. Require both for exact matches; flag email-only or id-only joins as low-confidence caveats.

SQL Patterns

Timestamps vs Dates

TIMESTAMP columns need TIMESTAMP('2025-11-01') comparisons, not date strings.
DATE columns need DATE('2025-11-01').
Use DATE(timestamp_col) before DATE_TRUNC to avoid type mismatch errors.
Use QUALIFY ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_at DESC) = 1 for deduplication — cleaner than a subquery.

Type Casting

Some warehouse tables store booleans as STRING 'true'/'false' — always CAST(col AS STRING) = 'true' rather than treating as BOOL.
Amount/ARR fields may be stored as STRING — CAST(amount AS FLOAT64) before arithmetic.

Avoid Double WHERE

-- WRONG: syntax error
WHERE date BETWEEN '...' AND '...'
WHERE col IS NOT NULL

-- CORRECT: single WHERE with AND
WHERE col IS NOT NULL AND date BETWEEN '...' AND '...'

name	bigquery
description	Query the configured BigQuery warehouse for analytics data. Use when the user asks for warehouse SQL, BigQuery tables, Amplitude-in-BigQuery events, or a metric/table that the data dictionary says lives in BigQuery.

bigquery

More from this repository

BigQuery

CRITICAL: BigQuery is a Native Agent Tool

Data-Dictionary-First Discipline

Source Of Truth Order

Table Priority Order (CRITICAL)

Always Bound Queries by Date

Actions

Schema Discovery

Dictionary Trust

Common Patterns

Column Name Differences (Common Bug Sources)

Join Paths — Identity Stitching Rule

SQL Patterns

Timestamps vs Dates

Type Casting

Avoid Double WHERE

BigQuery

CRITICAL: BigQuery is a Native Agent Tool

Data-Dictionary-First Discipline

Source Of Truth Order

Table Priority Order (CRITICAL)

Always Bound Queries by Date

Actions

Schema Discovery

Dictionary Trust

Common Patterns

Column Name Differences (Common Bug Sources)

Join Paths — Identity Stitching Rule

SQL Patterns

Timestamps vs Dates

Type Casting

Avoid Double WHERE

More from this repository