distributed-systems

Stars: 9

Forks: 1

Updated: March 12, 2026, 04:50

This skill should be used when building "multi-node Elixir systems", choosing between "Raft and CRDTs", configuring "libcluster or Partisan", debugging "split-brain or netsplit scenarios", evaluating "CP vs AP tradeoffs", investigating "data loss" in distributed pipelines, working with "Kafka", "RabbitMQ", "Broadway", or "PubSub" message delivery guarantees, or designing "event streaming" architectures

Source

jeffweiss/elixir-production

Distributed Systems Patterns

Overview

Don't distribute unless you must. When you must, choose the lowest escalation level that meets requirements. Each level adds complexity, failure modes, and operational burden — only escalate when you've outgrown the current level.

Escalation Decision

Single node handles the load?
  YES → Level 0 (don't distribute)
  NO  → Do you need shared state across nodes?
          NO  → Level 1 (stateless multi-node)
          YES → Is eventual consistency acceptable?
                  YES → Is it just presence/registry?
                          YES → Level 2 (lightweight)
                          NO  → Level 4 (CRDTs)
                  NO  → Level 3 (Raft consensus)
                         Cluster > 50 nodes? → Level 5

See escalation-ladder.md for full details on each level with code examples and migration triggers.
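
As a minimal sketch, the decision tree above can be written as an Elixir function with pattern-matched clauses, tried top to bottom in the same order as the questions. The module name `EscalationSketch` and the answer keys (`single_node_enough?`, `shared_state?`, `eventual_ok?`, `presence_only?`, `cluster_size`) are hypothetical names chosen for illustration. Reading the "Cluster > 50 nodes?" check as part of the strong-consistency branch is also an assumption based on the tree's indentation; escalation-ladder.md remains the authority on when Level 5 applies.

```elixir
defmodule EscalationSketch do
  @moduledoc """
  Hypothetical helper mirroring the escalation decision tree.
  Not part of the skill; key names are illustrative assumptions.
  """

  # Clauses are tried in order, so each one only runs when every
  # earlier question was answered the other way.

  # Single node handles the load: do not distribute.
  def level(%{single_node_enough?: true}), do: 0

  # No shared state across nodes: stateless multi-node.
  def level(%{shared_state?: false}), do: 1

  # Eventual consistency is fine and the state is only presence/registry.
  def level(%{eventual_ok?: true, presence_only?: true}), do: 2

  # Eventual consistency is fine for general shared state: CRDTs.
  def level(%{eventual_ok?: true}), do: 4

  # Assumption: the >50-node check sits on the strong-consistency branch.
  def level(%{cluster_size: n}) when is_integer(n) and n > 50, do: 5

  # Strong consistency required on a smaller cluster: Raft consensus.
  def level(_answers), do: 3
end
```

For example, a system that needs shared, eventually consistent state beyond presence lands on Level 4:

```elixir
EscalationSketch.level(%{
  single_node_enough?: false,
  shared_state?: true,
  eventual_ok?: true,
  presence_only?: false
})
# => 4
```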

Quick Decision

Need exactly-one process across cluster?  → leader-election.md
Need multi-service transactions?          → distributed-transactions.md
Need to partition data across nodes?      → consistent-hashing.md
Need audit trail / event replay?          → event-sourcing.md
Need strong consistency?                  → consensus.md (Raft)
Need eventual consistency?                → consensus.md (CRDTs)
Need cluster membership / discovery?      → clustering.md, gossip-protocols.md
Need to understand failure modes?         → failure-modes.md
Preparing for production deployment?      → production-checklist.md

Common Mistakes

Distributing too early: A single BEAM node can run millions of processes. Start at Level 0.
Choosing Raft when CRDTs suffice: If eventual consistency is acceptable, CRDTs avoid all consensus complexity.
Ignoring asymmetric partitions: Real networks produce one-way failures that violate Raft assumptions.
Single Global Process: GenServer-as-cache becomes a bottleneck and SPOF on a cluster.
Trusting CP/AP labels: Verify actual guarantees; don't trust vendor marketing.

Reference Files

Read the file that matches your current problem:

leader-election.md — When: Need exactly-one process across cluster. :global, :pg, Horde, Oban cron, singleton patterns, netsplit behavior for each approach
distributed-transactions.md — When: Multiple services must coordinate atomically. Saga (choreography + orchestration), Oban workflows, compensating transactions, why not 2PC
gossip-protocols.md — When: Need efficient cluster-wide information spread. SWIM failure detection, epidemic broadcast, delta-CRDTs, HyParView/Partisan
consistent-hashing.md — When: Partitioning data or load across nodes. Ring hashing, jump consistent hash, virtual nodes, sharded ETS, Broadway partitioning
event-sourcing.md — When: Need audit trail or event replay. Event sourcing, CQRS, Commanded patterns, projections, when to use vs CRUD
escalation-ladder.md — When: Deciding whether/how to distribute. Full Distribution Escalation Ladder (Levels 0-5) with code examples
consensus.md — When: Need strong or eventual consistency guarantees. Raft via :ra, CRDTs via delta_crdt, Multi-Raft, Vector Clocks, Paxos
clustering.md — When: Setting up node discovery and communication. Distributed Erlang, Partisan, libcluster configs, PubSub, debugging patterns
failure-modes.md — When: Debugging distributed bugs or designing for resilience. Split-brain, fencing tokens, clock drift, gray failures, metastable failures, limplocks, concurrency bugs, blast radius
cap-tradeoffs.md — When: Evaluating consistency vs availability tradeoffs. CAP theorem critique, fundamental limits, consistency models
production-checklist.md — When: Preparing distributed system for deployment. 20-item deployment checklist
resilience-principles.md — When: Designing fault-tolerant architecture. Architectural principles from Hebert, Brooker, Luu, Kleppmann, Cook et al.

Commands

/distributed-review — Deep analysis of distributed architecture and failure modes
/feature <desc> — Guided implementation with distribution-aware design

Related Skills

elixir-patterns: GenServer, Supervisor, OTP patterns, overload management
production-quality: Monitoring, observability, error handling
phoenix-liveview: Phoenix.PubSub, Phoenix.Tracker

Use the distributed-systems-expert agent for deep analysis of distributed architectures, consensus algorithm selection, and distributed bug investigation.

