remediation
Safe remediation actions for Kubernetes. Use when proposing or executing pod restarts, deployment scaling, or rollbacks. Always use dry-run first.
| name | description |
| --- | --- |
| remediation | Safe remediation actions for Kubernetes. Use when proposing or executing pod restarts, deployment scaling, or rollbacks. Always use dry-run first. |

Always run a script with the `--dry-run` flag before executing it.

All scripts are in `.claude/skills/remediation/scripts/`.

Restart a pod:

    # Dry run (shows what would happen)
    python .claude/skills/remediation/scripts/restart_pod.py <pod-name> -n <namespace> --dry-run
    # Execute
    python .claude/skills/remediation/scripts/restart_pod.py <pod-name> -n <namespace>

Scale a deployment:

    # Dry run
    python .claude/skills/remediation/scripts/scale_deployment.py <deployment> -n <namespace> --replicas N --dry-run
    # Execute
    python .claude/skills/remediation/scripts/scale_deployment.py <deployment> -n <namespace> --replicas N

Roll back a deployment:

    # Dry run (shows current and target revision)
    python .claude/skills/remediation/scripts/rollback_deployment.py <deployment> -n <namespace> --dry-run
    # Execute
    python .claude/skills/remediation/scripts/rollback_deployment.py <deployment> -n <namespace>
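
The three scripts take the same arguments: a target, `-n <namespace>`, an optional `--replicas`, and `--dry-run`. The sketch below shows one way to build those command lines programmatically so that dry-run is the default. `build_command` and `SCRIPTS_DIR` are hypothetical names introduced for illustration; they are not part of the skill.

```python
from typing import List, Optional

# Script directory as given in the text above.
SCRIPTS_DIR = ".claude/skills/remediation/scripts"


def build_command(
    script: str,
    target: str,
    namespace: str,
    replicas: Optional[int] = None,
    dry_run: bool = True,
) -> List[str]:
    """Build the argv list for one remediation script (hypothetical helper).

    dry_run defaults to True, so omitting the argument yields the safe form.
    """
    cmd = ["python", f"{SCRIPTS_DIR}/{script}", target, "-n", namespace]
    if replicas is not None:
        # Only scale_deployment.py is shown taking --replicas in the text.
        cmd += ["--replicas", str(replicas)]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


if __name__ == "__main__":
    # Dry-run form first, then the execute form.
    print(" ".join(build_command("scale_deployment.py", "web", "prod", replicas=5)))
    print(" ".join(build_command("scale_deployment.py", "web", "prod", replicas=5, dry_run=False)))
```

Defaulting `dry_run` to `True` means the caller has to opt in explicitly to a change that touches the cluster.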

Workflow: restart a pod

    # 1. Check events
    python .claude/skills/infrastructure/kubernetes/scripts/get_events.py <pod> -n <namespace>
    # 2. If fixable by restart, dry-run first
    python .claude/skills/remediation/scripts/restart_pod.py <pod> -n <namespace> --dry-run
    # 3. Execute restart
    python .claude/skills/remediation/scripts/restart_pod.py <pod> -n <namespace>

Workflow: roll back a deployment

    # 1. Check history
    python .claude/skills/infrastructure/kubernetes/scripts/get_history.py <deployment> -n <namespace>
    # 2. Dry-run rollback
    python .claude/skills/remediation/scripts/rollback_deployment.py <deployment> -n <namespace> --dry-run
    # 3. Execute rollback
    python .claude/skills/remediation/scripts/rollback_deployment.py <deployment> -n <namespace>

Workflow: scale a deployment

    # 1. Check current state
    python .claude/skills/infrastructure/kubernetes/scripts/describe_deployment.py <deployment> -n <namespace>
    # 2. Dry-run scale up
    python .claude/skills/remediation/scripts/scale_deployment.py <deployment> -n <namespace> --replicas 5 --dry-run
    # 3. Execute scale
    python .claude/skills/remediation/scripts/scale_deployment.py <deployment> -n <namespace> --replicas 5
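
Each workflow above follows the same pattern: inspect, dry-run, then execute. A minimal sketch of the dry-run and execute steps as a single gate follows, assuming the scripts print their dry-run report to stdout and exit non-zero on failure (neither is confirmed by the text). `remediate`, `run_command`, `confirm`, and `runner` are hypothetical names.

```python
import subprocess
from typing import Callable, List, Sequence


def run_command(cmd: Sequence[str]) -> str:
    """Run a command and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(list(cmd), capture_output=True, text=True, check=True)
    return result.stdout


def remediate(
    cmd: List[str],
    confirm: Callable[[str], bool],
    runner: Callable[[Sequence[str]], str] = run_command,
) -> bool:
    """Dry-run `cmd`, pass the output to `confirm`, and execute only if approved.

    `cmd` must be the execute form (without --dry-run).
    Returns True if the real command was executed, False if it was declined.
    """
    if "--dry-run" in cmd:
        raise ValueError("pass the execute form; --dry-run is added automatically")
    # Step 1: dry run. An exception here stops the workflow before any change.
    dry_output = runner(cmd + ["--dry-run"])
    # Step 2: a human (or policy) reviews the dry-run output.
    if not confirm(dry_output):
        return False
    # Step 3: execute for real.
    runner(cmd)
    return True
```

In interactive use, `confirm` could be a function that prints the dry-run output and asks for a yes/no answer. Making `runner` injectable also lets the gate be exercised without touching a cluster.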
When proposing remediation, use this structure:

    ## Proposed Remediation
    **Action**: [e.g., Restart pod, Scale deployment, Rollback]
    **Target**: [resource name and namespace]
    **Reason**: [why this action will help]
    **Risk**: [potential side effects]
    ### Dry Run Output
    [output from --dry-run]
    ### Confirmation Required
    Please confirm you want to proceed with this action.
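
The proposal structure above can also be filled in programmatically. The sketch below renders it from a small record; `Proposal` and `render_proposal` are hypothetical names, and the field values in the example are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    action: str          # e.g. "Restart pod", "Scale deployment", "Rollback"
    target: str          # resource name and namespace
    reason: str          # why this action will help
    risk: str            # potential side effects
    dry_run_output: str  # output captured from the --dry-run invocation


def render_proposal(p: Proposal) -> str:
    """Render a proposal using the same headings and labels as the template."""
    lines = [
        "## Proposed Remediation",
        f"**Action**: {p.action}",
        f"**Target**: {p.target}",
        f"**Reason**: {p.reason}",
        f"**Risk**: {p.risk}",
        "### Dry Run Output",
        p.dry_run_output,
        "### Confirmation Required",
        "Please confirm you want to proceed with this action.",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    example = Proposal(
        action="Restart pod",
        target="api-0 in namespace prod",
        reason="Pod is stuck and events suggest a restart may clear it",
        risk="Brief unavailability while the pod is recreated",
        dry_run_output="(paste the --dry-run output here)",
    )
    print(render_proposal(example))
```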