---
name: tsh-implementing-kubernetes
description: Kubernetes deployment patterns, Helm charts, and cluster management. Use when deploying applications to K8s, designing workload configurations, implementing scaling strategies, or managing cluster resources.
user-invocable: false
---
# Kubernetes Patterns

## When to Use

- Deploying applications to Kubernetes
- Designing Deployment, StatefulSet, or Job configurations
- Implementing auto-scaling (HPA, VPA, KEDA)
- Creating or modifying Helm charts
- Setting up ingress, networking, and service mesh
- Configuring resource requests, limits, and QoS
## Project Detection

Check which Kubernetes tooling the project uses:

- `helm/` or `Chart.yaml` → Helm charts
- `kustomize/` or `kustomization.yaml` → Kustomize
- `k8s/` or `kubernetes/` with `*.yaml` → Raw manifests
- `skaffold.yaml` → Skaffold for local dev
- `argocd/` or Application resources → ArgoCD GitOps
- `flux-system/` or Kustomization CRD → Flux GitOps

Use context7 to look up Kubernetes API versions and syntax.
## Workload Type Decision

| Workload Type | Use When |
|---|---|
| Deployment | Stateless apps, web servers, APIs |
| StatefulSet | Databases, stateful apps needing stable identity |
| DaemonSet | Node-level agents (logging, monitoring) |
| Job | One-time tasks, batch processing |
| CronJob | Scheduled recurring tasks |
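For example, a scheduled cleanup task maps to a CronJob. A minimal sketch (the name, schedule, and image tag are illustrative):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: report-cleanup            # illustrative name
spec:
  schedule: "0 3 * * *"           # daily at 03:00
  concurrencyPolicy: Forbid       # skip a run if the previous one is still active
  jobTemplate:
    spec:
      backoffLimit: 3
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: cleanup
              image: myapp:1.2.3  # pin a specific tag, never latest
              args: ["cleanup", "--older-than=30d"]
```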
## Deployment Configuration

### Resource Management

```yaml
resources:
  requests:
    memory: "256Mi"
    cpu: "100m"
  limits:
    memory: "512Mi"
    cpu: "500m"
```
Rules:

- Always set requests (required for scheduling)
- Set memory limits to prevent OOM impact on the node
- CPU limits are optional (they can cause throttling)
- A request-to-limit ratio of 1:2 is a good starting point
### QoS Classes

| Class | Condition | Eviction Priority |
|---|---|---|
| Guaranteed | requests == limits (all containers) | Last to evict |
| Burstable | requests < limits | Medium |
| BestEffort | No requests or limits | First to evict |
Rule: Production workloads should be Guaranteed or Burstable, never BestEffort.
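To land in the Guaranteed class, every container must set requests equal to limits. A minimal sketch of the condition from the table:

```yaml
# Guaranteed QoS: requests == limits for all containers
resources:
  requests:
    memory: "512Mi"
    cpu: "500m"
  limits:
    memory: "512Mi"
    cpu: "500m"
```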
### Probes Configuration

```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 15
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30
  periodSeconds: 10
```
Rules:

- Always configure a readinessProbe (graceful traffic handling)
- Use a startupProbe for slow-starting apps (instead of a long initialDelaySeconds)
- The livenessProbe should check app health, not dependencies
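Graceful traffic handling also depends on shutdown behavior. A common companion to the readiness probe is a short preStop sleep, so endpoint removal propagates to load balancers before the container receives SIGTERM; the durations here are tuning assumptions, not fixed values:

```yaml
terminationGracePeriodSeconds: 30   # must cover preStop delay plus app drain time
containers:
  - name: app
    lifecycle:
      preStop:
        exec:
          command: ["sleep", "5"]   # let endpoint removal propagate before SIGTERM
```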
### Pod Disruption Budget

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: api
```
Rule: Always create PDB for production workloads to ensure availability during node drains.
### Pod Anti-Affinity

```yaml
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: api
          topologyKey: kubernetes.io/hostname
```
Rule: Spread replicas across nodes/zones for high availability.
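For zone-level distribution, topologySpreadConstraints spread replicas evenly across a topology domain; a sketch (the `app: api` selector mirrors the example above):

```yaml
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway   # soft constraint, like preferred anti-affinity
    labelSelector:
      matchLabels:
        app: api
```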
## Scaling Strategies

### Horizontal Pod Autoscaler (HPA)

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
```
### Scaling Decision Matrix

| Scaling Type | Use When | Tool |
|---|---|---|
| CPU-based | General compute workloads | HPA |
| Memory-based | Memory-intensive apps | HPA |
| Custom metrics | Queue depth, request rate | HPA + Prometheus Adapter |
| Event-driven | Message queues, scheduled jobs | KEDA |
| Vertical | Right-sizing requests/limits | VPA |
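For the event-driven row, KEDA scales on external event sources rather than pod metrics. A sketch of a ScaledObject driven by RabbitMQ queue depth (the Deployment name, queue name, thresholds, and auth reference are illustrative assumptions):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: worker-scaler
spec:
  scaleTargetRef:
    name: worker              # Deployment to scale
  minReplicaCount: 0          # KEDA can scale to zero
  maxReplicaCount: 20
  triggers:
    - type: rabbitmq
      metadata:
        queueName: tasks      # illustrative queue name
        mode: QueueLength
        value: "50"           # target messages per replica
      authenticationRef:
        name: rabbitmq-auth   # TriggerAuthentication holding connection details
```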
## Helm Chart Structure

```text
mychart/
├── Chart.yaml            # Chart metadata
├── values.yaml           # Default values
├── values-dev.yaml       # Environment overrides
├── values-prod.yaml
├── templates/
│   ├── _helpers.tpl      # Template helpers
│   ├── deployment.yaml
│   ├── service.yaml
│   ├── ingress.yaml
│   ├── hpa.yaml
│   ├── pdb.yaml
│   └── configmap.yaml
└── charts/               # Dependencies
```
### Helm Best Practices

```yaml
replicaCount: 2
image:
  repository: myapp
  tag: ""
  pullPolicy: IfNotPresent
resources:
  requests:
    memory: "256Mi"
    cpu: "100m"
  limits:
    memory: "512Mi"
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
```
Rules:

- Don't hardcode image tags in values.yaml (set them in CI)
- Use `{{ include "mychart.fullname" . }}` for resource names
- Provide sensible defaults, override per environment
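For instance, a template typically derives names and labels from the helpers rather than hardcoding them. A fragment, assuming the standard `helm create` helpers (`mychart.fullname`, `mychart.labels`) are defined in `_helpers.tpl`:

```yaml
# templates/deployment.yaml (fragment)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "mychart.fullname" . }}
  labels:
    {{- include "mychart.labels" . | nindent 4 }}
spec:
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicaCount }}   # omitted when the HPA owns replica count
  {{- end }}
```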
## Ingress Configuration

### Ingress Class Decision

| Ingress Controller | Use When |
|---|---|
| nginx-ingress | General purpose, widely supported |
| AWS ALB | AWS-native, integrated with WAF/ACM |
| Traefik | Simple setup, automatic HTTPS |
| Istio Gateway | Service mesh already in use |
### Ingress Example (nginx)

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - api.example.com
      secretName: api-tls
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api
                port:
                  number: 80
```
## Security Context

```yaml
securityContext:              # pod-level
  runAsNonRoot: true
  runAsUser: 1000
  runAsGroup: 1000
  fsGroup: 1000
  seccompProfile:
    type: RuntimeDefault
containers:
  - name: app
    securityContext:          # container-level
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop:
          - ALL
```
Rule: Always run as non-root with minimal capabilities in production.
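A default-deny ingress NetworkPolicy is a common baseline alongside the security context, with traffic then allowed explicitly per workload. A sketch (the namespace label and port are illustrative assumptions):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}             # applies to all pods in the namespace
  policyTypes:
    - Ingress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-api-from-ingress
spec:
  podSelector:
    matchLabels:
      app: api
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - port: 8080
```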
## Process

- Discover context → check existing K8s manifests, Helm charts, Kustomize
- Choose workload type → Deployment, StatefulSet, or Job based on requirements
- Configure resources → set requests/limits based on profiling or estimates
- Add probes → configure readiness, liveness, and startup probes
- Enable scaling → add HPA/KEDA based on scaling requirements
- Add resilience → PDB, pod anti-affinity, topology spread
- Configure security → security context, network policies
- Validate → `kubectl apply --dry-run=server`, `helm template`
## Checklist

## Anti-Patterns

| Don't | Do |
|---|---|
| Use `latest` image tag | Pin specific versions or SHAs |
| Skip resource requests | Always set requests for scheduling |
| Single replica in production | Minimum 2 replicas with a PDB |
| Run as root | Use a non-root user with minimal capabilities |
| Missing readiness probe | Configure probes for graceful traffic |
| `kubectl apply` in production | GitOps with ArgoCD/Flux |
| Hardcode values in manifests | Use Helm values or Kustomize overlays |
| Ignore pod eviction | Set a PDB to maintain availability |
## Related Skills

- `tsh-implementing-observability` - For K8s monitoring and logging setup
- `tsh-implementing-ci-cd` - For K8s deployment pipelines
- `tsh-managing-secrets` - For K8s secret management patterns
- `tsh-implementing-terraform-modules` - For provisioning K8s clusters