pyspark-databricks

Updated: February 12, 2026, 11:59

Build and optimize PySpark pipelines on Databricks.

Source: Awish021/opencode

PySpark Databricks

What I do

Build PySpark ETL pipelines on Databricks
Optimize Spark jobs for performance and cost
Apply Delta Lake patterns for reliability

When to use me

Use when you need help authoring or tuning PySpark on Databricks. Ask clarifying questions about data volume, schema, and SLAs.

Quick checklist

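Avoid collect() on large data; it pulls every row onto the driver
Partition data sources and filter on the partition columns so Spark can prune
Cache only DataFrames that are reused, and unpersist them afterwards
Use Delta Lake tables
Prefer the Spark SQL/DataFrame API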
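
The first three checklist items can be sketched as small helpers. This is a minimal illustration, not part of the original skill: the function names, the `event_date` partition column, and the `user_id` column are assumptions chosen for the example, and the DataFrame objects are expected to come from the Databricks runtime.

```python
from typing import TYPE_CHECKING, Any, List, Tuple

if TYPE_CHECKING:  # type hints only; the objects come from the Databricks runtime
    from pyspark.sql import DataFrame, SparkSession


def preview(df: "DataFrame", n: int = 20) -> List[Any]:
    """Bring at most n rows to the driver instead of collecting everything."""
    return df.limit(n).collect()


def load_one_day(spark: "SparkSession", path: str, event_date: str) -> "DataFrame":
    """Read a Delta table and filter on its partition column.

    Assumes the table is partitioned by a hypothetical `event_date` column,
    so the filter lets Spark skip the other partitions.
    """
    df = spark.read.format("delta").load(path)
    return df.where(df["event_date"] == event_date)


def count_twice(df: "DataFrame") -> Tuple[int, int]:
    """Cache because the same DataFrame feeds two actions, then release it."""
    cached = df.cache()
    try:
        total = cached.count()
        distinct_users = cached.select("user_id").distinct().count()
    finally:
        # Free executor memory even if one of the actions fails.
        cached.unpersist()
    return total, distinct_users
```

Caching pays off here only because two actions run over the same data; with a single action the cache would add cost without any reuse.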

Minimal examples

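from pyspark.sql import functions as F

# `spark` is the SparkSession that Databricks notebooks and jobs provide.
events = spark.read.format("parquet").load("/mnt/raw/events/")
users = spark.read.option("header", "true").csv("/mnt/raw/users.csv")

# Count events per country; the inner join drops events with no matching user.
result = (
    events.join(users, "user_id")
    .groupBy("country")
    .agg(F.count("*").alias("events"))
)

# Overwrite the Delta table, partitioned by country.
(result.write.format("delta")
 .mode("overwrite")
 .partitionBy("country")
 .save("/mnt/delta/event_counts"))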
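
One possible tuning of the example above, sketched under assumptions: the users file is small enough to broadcast, and it has exactly two columns, `user_id` and `country`, in that order. The schema string, its column types, and the function name are illustrative and not taken from the original skill. Passing an explicit schema avoids reading every CSV column as an untyped string, and the broadcast hint asks Spark to ship the small table to each executor rather than shuffle the large one.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # type hints only; the objects come from the Databricks runtime
    from pyspark.sql import DataFrame, SparkSession

# Assumed column names and types; adjust to the real users file.
USERS_SCHEMA = "user_id STRING, country STRING"


def build_event_counts(
    spark: "SparkSession", events_path: str, users_path: str
) -> "DataFrame":
    """Join events to users and count events per country."""
    events = spark.read.format("parquet").load(events_path)
    users = (
        spark.read.schema(USERS_SCHEMA)  # explicit schema as a DDL string
        .option("header", "true")
        .csv(users_path)
    )
    return (
        events.join(users.hint("broadcast"), "user_id")  # hint: broadcast the small side
        .groupBy("country")
        .count()
        .withColumnRenamed("count", "events")
    )
```

On Databricks this could be called as `build_event_counts(spark, "/mnt/raw/events/", "/mnt/raw/users.csv")`, and the result written to Delta in the same way as above. The hint is only a request; Spark may still pick another join strategy.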

Output format

## PySpark Pipeline Update

### Summary
- [What changed]
- [Optimization or reliability gain]

### Code
```python
[PySpark code]
```

### Notes
- [Assumptions]
- [Next steps]
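
To make the template concrete, here is a small sketch that fills it in from plain Python values. The function name `render_update` and its parameters are hypothetical, and the sketch assumes that Notes is a third-level heading with bulleted items, like Summary.

```python
from typing import List

# Built at run time so this sketch can itself sit inside a fenced block.
FENCE = "`" * 3


def render_update(summary: List[str], code: str, notes: List[str]) -> str:
    """Fill the 'PySpark Pipeline Update' template with concrete content."""
    lines = ["## PySpark Pipeline Update", "", "### Summary"]
    lines += [f"- {item}" for item in summary]
    lines += ["", "### Code", f"{FENCE}python", code.strip("\n"), FENCE, ""]
    lines += ["### Notes"]
    lines += [f"- {item}" for item in notes]
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        render_update(
            summary=["Replaced collect() with a limited preview"],
            code="rows = df.limit(20).collect()",
            notes=["Assumes 20 rows are enough for inspection"],
        )
    )
```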

Other skills in this repository

atomic-code-changes

Awish021/opencode

Use when implementing code changes, bug fixes, refactors, or multi-step edits that may sprawl; keeps work split into atomic, independently verifiable changes.

2026-06-19 (9 stars)

helm-chart-patterns

Awish021/opencode

Helm chart development patterns for packaging and deploying Kubernetes applications. Use when creating reusable Helm charts, managing multi-environment deployments, or building application catalogs for Kubernetes.

2026-02-12 (9 stars)

conventional-commits

Awish021/opencode

Generate commit messages following conventional commit format.

2026-02-12 (9 stars)

find-skills

Awish021/opencode

Helps users discover and install agent skills when they ask questions like "how do I do X", "find a skill for X", "is there a skill that can...", or express interest in extending capabilities. This skill should be used when the user is looking for functionality that might exist as an installable skill.

2026-02-12 (9 stars)

gitlab-ci-patterns

Awish021/opencode

Build GitLab CI/CD pipelines with multi-stage workflows, caching, and distributed runners for scalable automation. Use when implementing GitLab CI/CD, optimizing pipeline performance, or setting up automated testing and deployment.

2026-02-12 (9 stars)

golang-k8s-agent

Awish021/opencode

Use when building Go-based Kubernetes agents/controllers, reconcile loops, or cloud-native systems. Invoke for controller-runtime, CRDs, leader election, and Go concurrency.

2026-02-12 (9 stars)
