pipeline-design
Design data pipelines — ETL vs ELT, orchestration, batch vs streaming, idempotency, data quality, lineage
| Field | Value |
|-------|-------|
| name | Pipeline Design |
| department | alchemist |
| description | Design data pipelines — ETL vs ELT, orchestration, batch vs streaming, idempotency, data quality, lineage |
| version | 1 |
| triggers | ["pipeline","ETL","ELT","orchestration","Airflow","Dagster","Prefect","dbt","batch","streaming","Kafka","ingestion","transformation","lineage","data quality"] |
Design data pipelines that reliably move, transform, and deliver data from source systems to consumption layers. Covers ETL vs ELT pattern selection, orchestration tool choice, batch vs streaming trade-offs, idempotency guarantees, data quality checkpoints, and lineage tracking.
Document every data flow:
Produce a flow diagram showing all sources, transformations, and destinations.
Evaluate the trade-offs:
Document the chosen pattern per flow and the reasoning.
For each flow, determine the processing mode:
Document latency requirements, cost implications, and complexity trade-offs for the chosen mode.
Ensure every pipeline step is safe to re-run:
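One common way to make a load step safe to re-run is to overwrite a whole partition inside a single transaction, so a retry replaces rows instead of appending duplicates. The sketch below is a minimal illustration using Python's built-in `sqlite3`; the `daily_orders` table, its columns, and the `load_partition` helper are hypothetical names chosen for the example, not part of this skill.

```python
import sqlite3


def load_partition(conn, run_date, rows):
    """Replace every row for run_date so that re-running the step is harmless."""
    # The connection context manager commits on success and rolls back on error,
    # so the delete and the insert succeed or fail together.
    with conn:
        conn.execute("DELETE FROM daily_orders WHERE run_date = ?", (run_date,))
        conn.executemany(
            "INSERT INTO daily_orders (run_date, order_id, amount) VALUES (?, ?, ?)",
            [(run_date, order_id, amount) for order_id, amount in rows],
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_orders (run_date TEXT, order_id INTEGER, amount REAL)"
)

batch = [(1, 9.5), (2, 20.0)]
load_partition(conn, "2024-01-01", batch)
load_partition(conn, "2024-01-01", batch)  # simulated retry of the same run

row_count = conn.execute("SELECT COUNT(*) FROM daily_orders").fetchone()[0]
print(row_count)  # 2, not 4: the retry replaced the partition
```

Delete-then-insert suits batch loads that own a full partition. For row-level changes, a keyed upsert (merge) is the usual alternative.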
Insert quality gates between pipeline stages:
For each checkpoint, define: what is checked, what threshold triggers a failure, and what happens on failure (halt pipeline, alert, quarantine bad records).
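The three parts of a checkpoint (what is checked, the failure threshold, and the action on failure) can be captured in a small structure. The following is a sketch under stated assumptions: `Checkpoint`, `run_checkpoint`, `PipelineHalted`, and the 10% threshold are all hypothetical and exist only for illustration. A real pipeline would more likely express the same idea through dbt tests or a data quality framework.

```python
from dataclasses import dataclass
from typing import Callable


class PipelineHalted(Exception):
    """Raised when a checkpoint with on_failure='halt' exceeds its threshold."""


@dataclass
class Checkpoint:
    name: str
    is_valid: Callable[[dict], bool]  # what is checked, per record
    max_failure_rate: float           # threshold that triggers a failure
    on_failure: str                   # "halt", "alert", or "quarantine"


def run_checkpoint(checkpoint, records):
    """Return (passed, quarantined, alerts), or raise PipelineHalted."""
    bad = [r for r in records if not checkpoint.is_valid(r)]
    failure_rate = len(bad) / len(records) if records else 0.0
    if failure_rate <= checkpoint.max_failure_rate:
        return list(records), [], []  # within threshold: no action taken
    message = f"{checkpoint.name}: {failure_rate:.0%} of records failed"
    if checkpoint.on_failure == "halt":
        raise PipelineHalted(message)
    if checkpoint.on_failure == "quarantine":
        good = [r for r in records if checkpoint.is_valid(r)]
        return good, bad, [message]   # bad records are set aside
    return list(records), [], [message]  # "alert": continue, but notify


records = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
    {"id": 3, "amount": 5.0},
    {"id": 4, "amount": 7.5},
]
gate = Checkpoint("amount_not_null", lambda r: r["amount"] is not None, 0.10, "quarantine")
passed, quarantined, alerts = run_checkpoint(gate, records)
print(len(passed), len(quarantined), alerts)
```

Here one of four records fails, which exceeds the 10% threshold, so the bad record is quarantined and the remaining three continue downstream.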
Design data lineage tracking:
Specify tooling: dbt lineage, OpenLineage, Datahub, or custom metadata tables.
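For the custom metadata table option, a minimal sketch might store one row per column-to-column edge and walk the edges backwards to answer "what feeds this column?". The table name `lineage_edges`, the helper functions, and the column identifiers below are hypothetical. This does not reproduce the data model of dbt, OpenLineage, or Datahub.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE lineage_edges (
           source_column TEXT NOT NULL,
           target_column TEXT NOT NULL,
           pipeline_step TEXT NOT NULL,
           PRIMARY KEY (source_column, target_column, pipeline_step)
       )"""
)


def record_edge(conn, source_column, target_column, pipeline_step):
    """Register that pipeline_step derives target_column from source_column."""
    with conn:
        # INSERT OR IGNORE keeps re-registration of the same edge idempotent.
        conn.execute(
            "INSERT OR IGNORE INTO lineage_edges VALUES (?, ?, ?)",
            (source_column, target_column, pipeline_step),
        )


def upstream_columns(conn, target_column):
    """Return every column that directly or indirectly feeds target_column."""
    rows = conn.execute(
        """WITH RECURSIVE upstream(col) AS (
               SELECT source_column FROM lineage_edges WHERE target_column = ?
               UNION
               SELECT e.source_column
               FROM lineage_edges e JOIN upstream u ON e.target_column = u.col
           )
           SELECT col FROM upstream ORDER BY col""",
        (target_column,),
    ).fetchall()
    return [row[0] for row in rows]


record_edge(conn, "raw.orders.amount_cents", "staging.orders.amount", "stg_orders")
record_edge(conn, "staging.orders.amount", "marts.revenue.total", "mart_revenue")
record_edge(conn, "staging.orders.order_date", "marts.revenue.day", "mart_revenue")

upstream = upstream_columns(conn, "marts.revenue.total")
print(upstream)
```

A table like this is cheap to start with, but it has to be populated by the pipeline code itself. Dedicated lineage tools derive the same edges from SQL or run metadata.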
Choose the orchestration layer based on team and requirements:
Document the choice, alternatives considered, and migration path if the team outgrows the tool.
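Whichever tool is chosen, its core job is to run tasks in an order that respects their dependencies. The tool-agnostic sketch below shows that idea with the standard library's `graphlib` (Python 3.9+). It does not use the API of Airflow, Dagster, or Prefect, and the task names and `run_pipeline` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it starts.
dependencies = {
    "extract_orders": set(),
    "extract_customers": set(),
    "transform_orders": {"extract_orders", "extract_customers"},
    "quality_check": {"transform_orders"},
    "load_warehouse": {"quality_check"},
}


def run_pipeline(dependencies, tasks):
    """Run each task after all of its upstream tasks; return the run order."""
    completed = []
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name]()  # a real orchestrator would add retries, logging, SLAs
        completed.append(name)
    return completed


# Placeholder callables stand in for real extract/transform/load functions.
tasks = {name: (lambda: None) for name in dependencies}
order = run_pipeline(dependencies, tasks)
print(order)
```

Scheduling, retries, backfills, and alerting are what the orchestration tools add on top of this ordering. Those features are usually what the tool choice turns on.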
# Pipeline Design: [Project/Domain Name]
## Flow Diagram
[ASCII diagram showing sources → transformations → destinations]
## Flow Inventory
| Flow | Source | Extraction | Transform | Load Pattern | Volume | Freshness SLA |
|------|--------|-----------|-----------|-------------|--------|---------------|
| ... | ... | ... | ... | ... | ... | ... |
## Architecture Decisions
| Decision | Chosen | Alternatives | Rationale |
|----------|--------|-------------|-----------|
| ETL vs ELT | ... | ... | ... |
| Batch vs Streaming | ... | ... | ... |
| Orchestration tool | ... | ... | ... |
## Idempotency Strategy
| Pipeline Step | Idempotency Method | Re-run Behavior |
|---------------|-------------------|-----------------|
| ... | ... | ... |
## Data Quality Checkpoints
| Stage | Check | Threshold | On Failure |
|-------|-------|-----------|------------|
| Source | ... | ... | ... |
| Transform | ... | ... | ... |
| Destination | ... | ... | ... |
## Lineage and Observability
| Capability | Tool/Method | Coverage |
|-----------|-------------|----------|
| Column lineage | ... | ... |
| Pipeline metrics | ... | ... |
| Alerting | ... | ... |
## Orchestration Design
| DAG/Pipeline | Schedule | Dependencies | SLA |
|-------------|----------|-------------|-----|
| ... | ... | ... | ... |