Exécutez n'importe quel Skill dans Manus
en un clic

Exécutez n'importe quel Skill dans Manus en un clic

$pwd:

discovering-gcp-data-assets

Name: Discovering Gcp Data Assets
Author: gemini-cli-extensions

// Finds and inspects data assets within Google Cloud. Relevant when any of the following conditions are true: 1. The user request involves finding, exploring, or inspecting data assets in Google Cloud, such as: - BigQuery datasets, tables, or views - BigLake catalog or tables - Spanner instances, databases or tables - etc. 2. You need to retrieve the schema, metadata, or governance policies for a GCP data asset. 3. You have a keyword or topic (e.g., "sales data") but lack the specific table or resource ID. 4. You are attempting to find data using `bq ls`, as this skill offers a superior approach. Don't use when: - Assets are outside Google Cloud

Exécuter dans Manus

$ git log --oneline --stat

stars:73

forks:10

updated:14 mai 2026 à 23:41

SKILL.md

readonly

related-skills.json

même dépôt

bigquery-data-transfer-service.md

from "gemini-cli-extensions/data-agent-kit-starter-pack"

Discovers and inspects BigQuery Data Transfer Service (DTS) configurations. Use this to identify existing ingestion pipelines and extract datasource or transfer config metadata for data pipelines. Use when a user asks for ingestion scenarios while building or managing data pipelines or when a user asks to "ingest" or "add" data that may already be managed by a DTS transfer.

2026-05-1473

dataform-bigquery.md

from "gemini-cli-extensions/data-agent-kit-starter-pack"

Expertise in generating clean, correct, and efficient Dataform pipeline code for BigQuery ELT. Use this when creating or modifying Dataform pipelines, actions, or source declarations, when Dataform, SQLX, or BigQuery are mentioned in a transformation, when data needs to be ingested from GCS into BigQuery via Dataform, or when setting up a new Dataform project or configuring workflow_settings.yaml.

2026-05-1473

dbt-bigquery.md

from "gemini-cli-extensions/data-agent-kit-starter-pack"

Expert guidance for creating, modifying, and optimizing dbt pipelines for BigQuery. Use this skill whenever user asks for generating or modifying a dbt model or project. Activate this skill when the user - Creates, modifies, or troubleshoots **dbt models or pipelines** - Needs to **optimize SQL** within a dbt project - Is **setting up a new dbt project** or configuring existing one

2026-05-1473

gcp-dataflow.md

from "gemini-cli-extensions/data-agent-kit-starter-pack"

Provides guidance for writing, packaging and executing Apache Beam pipelines on GCP using Cloud Dataflow. Use when: - Creating an Apache Beam Dataflow pipeline. - Creating a Google Flex Template.

2026-05-1473

gcp-pipeline-orchestration.md

from "gemini-cli-extensions/data-agent-kit-starter-pack"

This skill helps the agent generate or update orchestration pipeline definitions for Google Cloud Composer to initialize orchestration pipeline or update the orchestration definition for orchestration of various data pipelines, like dbt pipelines, notebooks, Spark jobs, Dataform, Python scripts or inline BigQuery SQL queries. This skill also helps deploy and trigger orchestration pipelines.

2026-05-1473

gcp-pipeline-resource-provisioning.md

from "gemini-cli-extensions/data-agent-kit-starter-pack"

Automates declarative resource creation and provisioning for data pipelines, supporting BigQuery, Dataform, Dataproc, BigQuery Data Transfer Service (DTS), and other resources. It manages environment-specific configurations (dev, staging, prod) through a deployment.yaml file. Use when: - Modifying or creating deployment.yaml for deployment settings. - Resolving environment-specific variables (e.g., Project IDs, Regions) for deployment. - Provisioning supported infrastructure like BigQuery datasets/tables, Dataform resources, or DTS resources via deployment.yaml. Do not use when: - Resources already exist. - Managing resources not supported by `gcloud beta orchestration-pipelines resource-types list`. - Managing general cloud infrastructure (VMs, networks, Kubernetes, IAM policies), which are better suited for Terraform. - Infrastructure spans multiple cloud providers (AWS, Azure, etc.). - Already uses Terraform for the target resources.

2026-05-1473

package.json

"author": "gemini-cli-extensions"

"repository": "gemini-cli-extensions/data-agent-kit-starter-pack"

Ouvrir le dépôt GitHub Voir les dépôts du créateur

$ install --global

$ download --local

Exécuter dans Manus

$ useful --forSOC

Scientifiques des donnéesProfessions informatiques et mathématiques15-2051L4

name	discovering-gcp-data-assets
description	Finds and inspects data assets within Google Cloud. Relevant when any of the following conditions are true: 1. The user request involves finding, exploring, or inspecting data assets in Google Cloud, such as: - BigQuery datasets, tables, or views - BigLake catalog or tables - Spanner instances, databases or tables - etc. 2. You need to retrieve the schema, metadata, or governance policies for a GCP data asset. 3. You have a keyword or topic (e.g., "sales data") but lack the specific table or resource ID. 4. You are attempting to find data using `bq ls`, as this skill offers a superior approach. Don't use when: - Assets are outside Google Cloud
license	Apache-2.0
metadata	{"version":"v4","publisher":"google"}

Instructions

Step 1: Prioritize Assets from the Conversation

If the asset was created or mentioned earlier in the same conversation, then proceed with that asset instead of searching. Skip steps 2, 3, and 4.

Step 2: Handle Public Datasets or Proceed to Search

Dataplex Lookup Context provides the richest metadata for data assets. You MUST prioritize using it for all Google Cloud assets, even if you already know their IDs.

Public Datasets (Direct Inspection): If the requested asset belongs to the bigquery-public-data project, Dataplex Lookup Context will fail. You MUST skip Steps 3 and 4 and inspect the table directly using the bq CLI or BigQuery MCP tools instead.
All Other Assets (Proceed to Step 3): For all other BigQuery, Cloud Storage, Spanner, BigLake Iceberg or general GCP data assets (whether their IDs are known or missing), you MUST proceed to Step 3 to search the Dataplex catalog and obtain their full Entry Name.

Step 3: Execute Discovery Search

You MUST use the Dataplex search command to discover assets and retrieve their full projects/... entry names. This step is required even if you already know the asset's short ID (e.g., my_dataset.my_table), because Step 4 strictly requires the full entry name.

[!IMPORTANT] The --project parameter MUST ALWAYS be provided. This project_id is used to attribute the search only and does NOT restrict the search scope. The project must have the dataplex API enabled and user must have the dataplex.entries.get permissions.

A. Semantic Search (Natural Language Intent)

Use this when the user describes the meaning or intent of the data (e.g., "Find Q4 product sales data").

Use the search_entries MCP tool

gcloud dataplex entries search "<NATURAL_LANGUAGE_QUERY>" \
  --project="<PROJECT_ID>" \
  --semantic-search \
  --limit=50

B. Keyword Search (Technical Strings)

Use this for exact keyword matches or technical strings (e.g., name:order_v2).

Search Query Rules (MANDATORY)

Mode-Specific Syntax:
- Semantic Search: Logical operators (AND, OR) MUST be UPPERCASE. Use plural labels. for label filters (e.g., labels.env=prod).
- Keyword Search: Operators are case-insensitive. Use singular label. for label filters (e.g., label.env=prod).
Abbreviated Logic: Use | for OR and , for AND within parentheses to shorten queries (e.g., projectid:(prod|staging) or column:(id,name)).
Exact vs. Token Match:
- Use : for token/substring matches (e.g., name:sales).
- Use = for exact matches. REQUIRED for system, type, and location.
Singular Keywords: When performing keyword search, ALWAYS convert plurals to singular (e.g., "product" NOT "products"). Semantic search handles singular/plural variations and synonyms automatically.
Scope Restriction: You SHOULD restrict the search scope using a parent filter if the project or dataset is known (e.g., parent:projects/<PROJECT_ID>).

Dataplex Search Syntax Reference

name:x: Substring/token match on resource ID.
displayname:x: Substring/token match on display name.
projectid:x: Substring/token match on GCP project ID.
parent:x: Substring match on hierarchical path (e.g., projects/my-proj).
location=x: Exact match on location (e.g., us-central1, us).
column:x: Substring/token match on column names in the schema.
system=x: Exact match on source system. Common values: bigquery, storage, biglake, cloud_sql, cloud_spanner, cloud_bigtable, pubsub.
type=x: Exact match on entry type (e.g., bigquery-table, storage-bucket, storage-folder).
labels.key=value: (Semantic Mode ONLY) Exact match on a label.
label.key=value: (Keyword Mode ONLY) Exact match on a label.
createtime[>|<|=]x: Match assets created after/before date YYYY-MM-DD.
fully_qualified_name=x: Exact match on the FQN (e.g., bigquery:project.dataset.table).

[!TIP] Dataplex search results rely on metadata being ingested into the Universal Catalog (often via Discovery Scans). If an asset is missing from search, it may not be indexed. - Fallback 1: Try searching by the fully_qualified_name qualifier. - Fallback 2: Use native tools (e.g., bq show, gcloud storage) or specific skills for that asset type if you already know the ID.

gcloud dataplex entries search "<KEYWORD_SEARCH_QUERY>" \
  --project="<PROJECT_ID>" \
  --limit=50

[!IMPORTANT] Handling Search Results and Avoiding Loops:

No Results: If the search returns no entries:

Variation Rule: You may try AT MOST 3 variations of the search query (e.g., switching AND/OR clauses, adding/removing parent:, removing projectid: or location:, trying fully_qualified_name=).

Stop Rule: If after 3 attempts no results are found, STOP and inform the user. Ask for clarification, specifically the Dataplex full entry name if known, or identifiers such as project ID, dataset ID, or instance ID to help narrow the search. Example: "I couldn't find any tables matching that description after several attempts. If you know the Dataplex full entry name (projects/...), please provide it. Otherwise, please provide any identifiers you know, such as project, dataset, or instance name, to help locate the asset."

Multiple Results:

If more than 10 results are returned, state that many matches were found. Show the names of the first 5 entries and ask for clarification.

If 2-10 results are returned and you cannot definitively choose, list them and ask the user.

Single Result: Proceed to Step 3 with the full entry name.

Avoid Infinite Loops: MUST NOT re-run identical or near-identical queries. If Dataplex fails to return the expected asset, prioritize asking the user for the exact resource ID or using Fallback 2 (Native Tools).

Criteria: Once candidate assets are returned, proceed to Step 4 using the full entry names from the search results.

Step 4: Lookup Context

You MUST use the Lookup Context command to fetch schema and deep metadata for the relevant results obtained from Step 3.

[!IMPORTANT] The --resources parameter MUST be the full name (starting with projects/) returned by the search result. Passing short table IDs, GCS URIs, or fully qualified bigquery: prefixes is PROHIBITED and will fail.

Command Execution

Use the lookup_context MCP tool

gcloud dataplex context lookup --resources="<FULL_ENTRY_NAME>"

Completion Criteria: The command returns the detailed schema and business context.

Troubleshooting

Context Lookup Fails or "Resource not found"

Cause: Short table names were used improperly.
Fix: Ensure you use the correct entry name format from the search results (starting with projects/).

Search Returns No Results

Cause: Plural terms in keyword search or lack of scoping.
Fix: Switch to singular keywords. For semantic search, try more descriptive natural language.

Context Lookup Fails with "NOT_FOUND" (despite correct format)

Cause: The table belongs to a project (e.g., bigquery-public-data) that has not fully synchronized its metadata with the Dataplex Universal Catalog. While the entry appears in search, context lookup is unavailable.
Fix: Fall back to direct inspection using native tools (e.g., bq CLI).
Stop Rule: If the native tool (e.g., bq show) also returns "Not Found", STOP. Do not restart the Dataplex discovery loop. Specifically ask the user to verify the project ID and table ID.

Breaking the Research Loop

If you find yourself repeatedly searching for the same asset:

STOP.
State what you have tried (e.g., "I tried searching Dataplex with X and Y, and checked bq show").
Ask the user for the exact project, dataset, and table ID.

Search Fails with "--project: Must be specified."

Cause: --project <PROJECT_ID> arguments were not provided
Fix: Provide a project which will be used to authorize and attribute the search request.

Search Fails with "PERMISSION_DENIED"

Cause: The project_id provided in the --project <PROJECT_ID> arguments does not have the Dataplex API enabled or the user is missing necessary IAM permissions.
Fix: Ask the user if they have a project which has the Dataplex API enabled with the dataplex.entries.get permission

discovering-gcp-data-assets

Plus depuis ce dépôt

Plus depuis ce dépôt

Instructions

Step 1: Prioritize Assets from the Conversation

Step 2: Handle Public Datasets or Proceed to Search

Step 3: Execute Discovery Search

A. Semantic Search (Natural Language Intent)

B. Keyword Search (Technical Strings)

Search Query Rules (MANDATORY)

Dataplex Search Syntax Reference

Step 4: Lookup Context

Command Execution

Troubleshooting

Context Lookup Fails or "Resource not found"

Search Returns No Results

Context Lookup Fails with "NOT_FOUND" (despite correct format)

Breaking the Research Loop

Search Fails with "--project: Must be specified."

Search Fails with "PERMISSION_DENIED"

Instructions

Step 1: Prioritize Assets from the Conversation

Step 2: Handle Public Datasets or Proceed to Search

Step 3: Execute Discovery Search

A. Semantic Search (Natural Language Intent)

B. Keyword Search (Technical Strings)

Search Query Rules (MANDATORY)

Dataplex Search Syntax Reference

Step 4: Lookup Context

Command Execution

Troubleshooting

Context Lookup Fails or "Resource not found"

Search Returns No Results

Context Lookup Fails with "NOT_FOUND" (despite correct format)

Breaking the Research Loop

Search Fails with "--project: Must be specified."

Search Fails with "PERMISSION_DENIED"