pyspark-databricks
Build and optimize PySpark pipelines on Databricks.
| name | pyspark-databricks |
| description | Build and optimize PySpark pipelines on Databricks. |
Use when you need help authoring or tuning PySpark on Databricks. Ask clarifying questions about data volume, schema, and SLAs.
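collect on large data

```python
from pyspark.sql import functions as F

# `spark` is the SparkSession that Databricks notebooks provide.
events = spark.read.format("parquet").load("/mnt/raw/events/")
# header=true takes column names from the first row of the CSV.
users = spark.read.option("header", "true").csv("/mnt/raw/users.csv")

# Inner join on user_id, then count events per country.
result = (
    events.join(users, "user_id")
    .groupBy("country")
    .agg(F.count("*").alias("events"))
)

# Overwrite the Delta table at this path, partitioned by country.
(result.write.format("delta")
    .mode("overwrite")
    .partitionBy("country")
    .save("/mnt/delta/event_counts"))
```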
## PySpark Pipeline Update
### Summary
- [What changed]
- [Optimization or reliability gain]
### Code
```python
[PySpark code]