data-pipeline
Build robust data pipelines for ETL, streaming, and batch processing. Orchestrate data movement between sources and destinations.
| name | data-pipeline |
| description | Build robust data pipelines for ETL, streaming, and batch processing. Orchestrate data movement between sources and destinations. |
| triggers | ["/data pipeline","/etl pipeline"] |
This skill guides you through building robust data pipelines for extracting, transforming, and loading data across various sources and destinations.
Use this skill when you need to:
Components
Data Source  →   Extract   →   Transform   →    Load    →  Data Destination
                    ↓              ↓              ↓
               Validation     Enrichment     Monitoring
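The flow in the diagram can be sketched in plain Python as a chain of small functions. This is only an illustration: the record shape, the function names, and the in-memory list standing in for a destination are all assumptions, not part of the skill.

```python
Record = dict  # hypothetical record shape: one row of data


def extract() -> list:
    # Stand-in for reading from a real source (API, file, database).
    return [
        {"id": 1, "name": " Ada "},
        {"id": 2, "name": ""},
        {"id": 3, "name": "Linus"},
    ]


def validate(records) -> list:
    # Validation after extract: drop rows whose name is empty.
    return [r for r in records if r["name"].strip()]


def transform(records) -> list:
    # Transform plus enrichment: normalize the name, add a derived field.
    out = []
    for r in records:
        name = r["name"].strip()
        out.append({**r, "name": name, "name_length": len(name)})
    return out


def load(records, destination: list) -> int:
    # Load into the destination and return a row count for monitoring.
    rows = list(records)
    destination.extend(rows)
    return len(rows)


def run_pipeline(destination: list) -> int:
    return load(transform(validate(extract())), destination)
```

Keeping each stage a separate function makes it straightforward to test stages in isolation and to map them onto orchestrator tasks later.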
Pipeline Types
Idempotency
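One common way to make a load step idempotent is to key each row on a natural identifier and replace on conflict, so that rerunning the same batch leaves the destination unchanged. The sketch below assumes SQLite as the destination; the `daily_sales` table and its columns are hypothetical.

```python
import sqlite3


def load_idempotent(conn: sqlite3.Connection, rows) -> None:
    # The primary key on `day` is what makes reruns safe: a second load of the
    # same day replaces the existing row instead of adding a duplicate.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_sales (day TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO daily_sales (day, total) VALUES (?, ?)",
        rows,
    )
    conn.commit()
```

Running `load_idempotent` twice with the same rows leaves one row per day, and a corrected value for a day overwrites the earlier one.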
Fault Tolerance
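A typical building block for fault tolerance is retrying transient failures with exponential backoff. The helper below is a minimal sketch, not a library API: the name `retry` and its parameters are assumptions, and the `sleep` argument is injectable only so the behaviour can be checked without waiting.

```python
import time


def retry(func, attempts=3, base_delay=1.0, retry_on=(Exception,), sleep=time.sleep):
    """Call func(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: surface the last error to the caller.
                raise
            # Delay doubles each time: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Restricting `retry_on` to errors that are plausibly transient (for example `ConnectionError`) avoids retrying failures that will never succeed, such as malformed input.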
Scalability
Apache Airflow Example
from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime, timedelta

# source_api, clean_and_normalize, and destination_db are not defined in this
# example; substitute your own source client, transformation function, and
# destination client.

def extract_data(**context):
    # Extract from source for the run's logical date (context['ds'] is 'YYYY-MM-DD')
    data = source_api.fetch(date=context['ds'])
    return data  # The return value is pushed to XCom for downstream tasks

def transform_data(**context):
    # Transform the data returned by the 'extract' task
    ti = context['ti']
    data = ti.xcom_pull(task_ids='extract')
    transformed = clean_and_normalize(data)
    return transformed

def load_data(**context):
    # Load the transformed data to the destination
    ti = context['ti']
    data = ti.xcom_pull(task_ids='transform')
    destination_db.insert(data)

# Note: Airflow 2.4+ prefers schedule= over the older schedule_interval=
with DAG('data_pipeline', start_date=datetime(2024, 1, 1), schedule_interval='@daily') as dag:
    extract = PythonOperator(task_id='extract', python_callable=extract_data)
    transform = PythonOperator(task_id='transform', python_callable=transform_data)
    load = PythonOperator(task_id='load', python_callable=load_data)

    extract >> transform >> load
Validation Checks
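As a rough illustration of what such checks can look like, the sketch below tests for missing required fields and duplicate keys in plain Python. The function name, its parameters, and the error message format are assumptions made for this example; a dedicated data quality library could replace it.

```python
def validate_records(records, required_fields, key_field):
    """Return a list of human-readable problems; an empty list means the batch passed."""
    errors = []
    seen = set()
    for index, record in enumerate(records):
        # Completeness check: every required field must be present and not None.
        for field in required_fields:
            if record.get(field) is None:
                errors.append(f"row {index}: missing {field}")
        # Uniqueness check: the key field must not repeat within the batch.
        key = record.get(key_field)
        if key is not None:
            if key in seen:
                errors.append(f"row {index}: duplicate {key_field}={key}")
            seen.add(key)
    return errors
```

Returning all problems rather than raising on the first one lets the pipeline decide whether to fail the run, quarantine bad rows, or only log a warning.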
Monitoring
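One lightweight way to monitor tasks is to wrap them in a decorator that records status, row count, and duration. The sketch below is an assumption-laden example: the `monitored` decorator and the metric fields are hypothetical, and the plain list stands in for whatever metrics backend you use.

```python
import logging
import time
from functools import wraps

logger = logging.getLogger("pipeline")


def monitored(task_name, metrics):
    """Record status, row count, and duration of each call in `metrics`."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                result = func(*args, **kwargs)
            except Exception:
                metrics.append({"task": task_name, "status": "failed"})
                logger.exception("task %s failed", task_name)
                raise
            seconds = time.perf_counter() - start
            # Row count is only recorded when the result has a length.
            rows = len(result) if hasattr(result, "__len__") else None
            metrics.append(
                {"task": task_name, "status": "success", "rows": rows, "seconds": seconds}
            )
            logger.info("task %s loaded %s rows in %.3fs", task_name, rows, seconds)
            return result
        return wrapper
    return decorator
```

A sudden drop in `rows` or a jump in `seconds` between runs is often the earliest sign that a source has changed, so these two values are worth alerting on.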
Code Organization
pipelines/
├── dags/ # Airflow DAG definitions
├── tasks/ # Reusable task implementations
├── utils/ # Helper functions
├── tests/ # Unit and integration tests
└── config/ # Environment configurations
Configuration Management
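A simple approach to configuration management is to read settings from environment variables into a typed, immutable object, so the same code runs unchanged across environments. This is a sketch under assumptions: the `PipelineConfig` fields and the `PIPELINE_*` variable names are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineConfig:
    environment: str
    source_url: str
    batch_size: int


def load_config(env=None) -> PipelineConfig:
    # Accept a mapping for tests; default to the real process environment.
    env = os.environ if env is None else env
    return PipelineConfig(
        environment=env.get("PIPELINE_ENV", "dev"),
        # No default here: a missing source URL should fail fast with KeyError.
        source_url=env["PIPELINE_SOURCE_URL"],
        batch_size=int(env.get("PIPELINE_BATCH_SIZE", "500")),
    )
```

Keeping secrets and per-environment values out of the code also means the `config/` directory only needs to hold non-sensitive defaults.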
See the examples/ directory for:
batch-pipeline.py - Daily batch ETL workflow
streaming-pipeline.py - Kafka to database streaming
data-quality-checks.py - Great Expectations integration
error-handling.py - Robust error handling patterns