| name | ops-devops-platform |
| description | Production-grade DevOps patterns with Kubernetes 1.34+, Terraform 1.9+, Docker 27+, ArgoCD/FluxCD GitOps, SRE, eBPF-based observability, AI-driven monitoring, CI/CD security, and cloud-native operations (AWS, GCP, Azure, Kafka). |
This skill equips Claude with actionable templates, checklists, and patterns for building self-service platforms, automating infrastructure with GitOps, deploying securely with DevSecOps, scaling with Kubernetes, ensuring reliability through SRE practices, and operating production systems with AI-driven observability.
Modern Best Practices (December 2025): Kubernetes 1.34 (in-place Pod resize is still beta here; GA is slated for 1.35, releasing Dec 17), Docker 27 with BuildKit optimizations, Terraform 1.9+ with improved provider ecosystem, ArgoCD 2.14/FluxCD 2.5 GitOps patterns, eBPF-based observability (Cilium, Hubble), and AIOps for incident correlation.
| Task | Tool/Framework | Command | When to Use |
|---|---|---|---|
| Infrastructure as Code | Terraform 1.9+ | terraform plan && terraform apply | Provision cloud resources declaratively |
| GitOps Deployment | ArgoCD / FluxCD | argocd app sync myapp | Continuous reconciliation, declarative deployments |
| Container Build | Docker 27+ | docker build -t app:v1 . | Package applications with dependencies |
| Kubernetes Deployment | kubectl / Helm (K8s 1.34+) | kubectl apply -f deploy.yaml or helm upgrade app ./chart | Deploy to K8s cluster, manage releases |
| CI/CD Pipeline | GitHub Actions | Define workflow in .github/workflows/ci.yml | Automated testing, building, deploying |
| Security Scanning | Trivy / Falco | trivy image myapp:latest | Vulnerability scanning, runtime security |
| Monitoring & Alerts | Prometheus + Grafana | Configure ServiceMonitor and AlertManager | Observability, SLO tracking, incident alerts |
| Load Testing | k6 / Locust | k6 run load-test.js | Performance validation, capacity planning |
| Incident Response | PagerDuty / Opsgenie | Configure escalation policies | On-call management, automated escalation |
| Platform Engineering | Backstage / Port | Deploy internal developer portal | Self-service infrastructure, golden paths |
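The `terraform plan && terraform apply` entry in the table relies on shell short-circuiting: the apply step runs only if the plan step exits with status 0. A minimal Python sketch of the same gating is below. The helper name `run_chain` and the plan file name `tfplan` are illustrative assumptions, not part of this skill's templates; the saved-plan form (`plan -out=...` followed by `apply <planfile>`) is used so that what gets applied is exactly what was planned.

```python
import subprocess
from typing import List, Sequence


def run_chain(commands: Sequence[Sequence[str]]) -> int:
    """Run commands in order and stop at the first non-zero exit code.

    Mirrors the shell's `a && b` behaviour: later steps are skipped
    as soon as one step fails. Returns the exit code of the failing
    step, or 0 if every step succeeded.
    """
    for cmd in commands:
        result = subprocess.run(list(cmd))
        if result.returncode != 0:
            return result.returncode
    return 0


# Hypothetical step list; "tfplan" is an arbitrary file name.
TERRAFORM_STEPS: List[List[str]] = [
    ["terraform", "plan", "-out=tfplan"],
    ["terraform", "apply", "tfplan"],
]

# Usage (requires terraform on PATH and an initialized working directory):
#   exit_code = run_chain(TERRAFORM_STEPS)
```

In a CI job the returned exit code would typically be passed to `sys.exit` so that a failed plan fails the pipeline.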
What do you need to accomplish?
├─ Infrastructure provisioning?
│  ├─ Cloud-agnostic → Terraform (multi-cloud support)
│  ├─ AWS-specific → CloudFormation or Terraform
│  ├─ GCP-specific → Deployment Manager or Terraform
│  └─ Azure-specific → ARM templates or Terraform
│
├─ Application deployment?
│  ├─ Kubernetes cluster?
│  │  ├─ Simple deploy → kubectl apply -f manifests/
│  │  ├─ Complex app → Helm charts
│  │  └─ GitOps workflow → ArgoCD or FluxCD
│  └─ Serverless?
│     ├─ AWS → Lambda + SAM/Serverless Framework
│     ├─ GCP → Cloud Functions
│     └─ Azure → Azure Functions
│
├─ CI/CD pipeline setup?
│  ├─ GitHub-based → GitHub Actions (template-github-actions.md)
│  ├─ GitLab-based → GitLab CI
│  ├─ Enterprise → Jenkins or Tekton
│  └─ Security-first → Add SAST/DAST/SCA scans (template-ci-cd.md)
│
├─ Observability & monitoring?
│  ├─ Metrics → Prometheus + Grafana
│  ├─ Distributed tracing → Jaeger or OpenTelemetry
│  ├─ Logs → Loki or ELK stack
│  ├─ eBPF-based → Cilium + Hubble (sidecarless)
│  └─ Unified platform → Datadog or New Relic
│
├─ Incident management?
│  ├─ On-call rotation → PagerDuty or Opsgenie
│  ├─ Postmortem → template-postmortem.md
│  └─ Communication → template-incident-comm.md
│
├─ Platform engineering?
│  ├─ Self-service → Backstage or Port (internal developer portal)
│  ├─ Policy enforcement → OPA/Gatekeeper
│  └─ Golden paths → Template repositories + automation
│
└─ Security hardening?
   ├─ Container scanning → Trivy or Grype
   ├─ Runtime security → Falco or Sysdig
   ├─ Secrets management → HashiCorp Vault or cloud-native KMS
   └─ Compliance → CIS Benchmarks, template-security-hardening.md
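The decision tree above can also be held as data, which is convenient when the recommendation needs to be looked up programmatically. The sketch below encodes an excerpt of the tree as a nested dictionary and walks it by a path of choices. The names `DECISION_TREE` and `recommend` are hypothetical, introduced only for illustration, and only three of the seven top-level branches are included to keep the example short.

```python
from typing import Any, Dict

# Excerpt of the decision tree: inner dicts are further questions,
# strings are the recommended tools (leaves).
DECISION_TREE: Dict[str, Any] = {
    "Infrastructure provisioning": {
        "Cloud-agnostic": "Terraform (multi-cloud support)",
        "AWS-specific": "CloudFormation or Terraform",
        "GCP-specific": "Deployment Manager or Terraform",
        "Azure-specific": "ARM templates or Terraform",
    },
    "Application deployment": {
        "Kubernetes cluster": {
            "Simple deploy": "kubectl apply -f manifests/",
            "Complex app": "Helm charts",
            "GitOps workflow": "ArgoCD or FluxCD",
        },
        "Serverless": {
            "AWS": "Lambda + SAM/Serverless Framework",
            "GCP": "Cloud Functions",
            "Azure": "Azure Functions",
        },
    },
    "Security hardening": {
        "Container scanning": "Trivy or Grype",
        "Runtime security": "Falco or Sysdig",
        "Secrets management": "HashiCorp Vault or cloud-native KMS",
        "Compliance": "CIS Benchmarks, template-security-hardening.md",
    },
}


def recommend(*path: str) -> str:
    """Follow `path` through the tree and return the recommendation.

    If the path stops at a question rather than a leaf, the remaining
    options are listed instead. Unknown branches raise KeyError.
    """
    node: Any = DECISION_TREE
    for depth, step in enumerate(path):
        if not isinstance(node, dict) or step not in node:
            where = " > ".join(path[:depth]) or "root"
            raise KeyError(f"no branch {step!r} under {where}")
        node = node[step]
    if isinstance(node, dict):
        return "choose one of: " + ", ".join(node)
    return node


print(recommend("Application deployment", "Kubernetes cluster", "GitOps workflow"))
```

Keeping the tree as plain data means new branches can be added without touching the lookup logic.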
Claude should invoke this skill when users request:
Operational best practices by domain:
Each guide includes:
Production templates organized by tech stack (27 templates total):
Resources
Shared Utilities (centralized patterns: extract, don't duplicate)
Templates
Data
Operations & Infrastructure:
Security & Compliance:
Software Development:
AI/ML Operations:
See resources/operational-patterns.md for:
See data/sources.json for 45+ curated sources organized by tech stack:
Use this skill as a hub for safe, modern, and production-grade DevOps patterns. All templates and patterns are operational, with no theory or book summaries.