pyspark-databricks
Build and optimize PySpark pipelines on Databricks.
| name | pyspark-databricks |
| description | Build and optimize PySpark pipelines on Databricks. |
Use when you need help authoring or tuning PySpark on Databricks. Ask clarifying questions about data volume, schema, and SLAs.
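Avoid `collect` on large data.

```python
from pyspark.sql import functions as F

# `spark` is the SparkSession that Databricks notebooks provide.
events = spark.read.format("parquet").load("/mnt/raw/events/")
users = spark.read.option("header", "true").csv("/mnt/raw/users.csv")

# Count events per country by joining events to users on user_id.
result = (
    events.join(users, "user_id")
    .groupBy("country")
    .agg(F.count("*").alias("events"))
)

# Overwrite the Delta table, partitioned by country.
(
    result.write.format("delta")
    .mode("overwrite")
    .partitionBy("country")
    .save("/mnt/delta/event_counts")
)
```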
## PySpark Pipeline Update
### Summary
- [What changed]
- [Optimization or reliability gain]
### Code
```python
[PySpark code]