en un clic
aws-infrastructure
// AWS cloud infrastructure inspection. Use when investigating EC2 instances, ECS tasks/services, Lambda functions, CloudWatch logs/metrics, or AWS resource issues.
// AWS cloud infrastructure inspection. Use when investigating EC2 instances, ECS tasks/services, Lambda functions, CloudWatch logs/metrics, or AWS resource issues.
| name | aws-infrastructure |
| description | AWS cloud infrastructure inspection. Use when investigating EC2 instances, ECS tasks/services, Lambda functions, CloudWatch logs/metrics, or AWS resource issues. |
| allowed-tools | Bash(python *) |
IMPORTANT: Credentials are fetched automatically from the credential resolver. Do NOT check for AWS_ACCESS_KEY_ID or AWS_SECRET_ACCESS_KEY in environment variables — they won't be visible to you. Just run the scripts directly; authentication is handled transparently.
Configuration you CAN check (non-secret):
CONFIGURED_INTEGRATIONS — JSON list, check if aws is presentStart with CloudWatch logs or metrics, then drill into resources.
CLOUDWATCH LOGS/METRICS → IDENTIFY RESOURCE → DESCRIBE RESOURCE → CHECK ALARMS
All scripts are in .claude/skills/infrastructure-aws/scripts/
python .claude/skills/infrastructure-aws/scripts/get_cloudwatch_logs.py --log-group /aws/lambda/my-function --filter "ERROR" --hours 1
python .claude/skills/infrastructure-aws/scripts/get_cloudwatch_logs.py --log-group /ecs/my-service --filter "Exception" --hours 6 --limit 50
python .claude/skills/infrastructure-aws/scripts/get_cloudwatch_metrics.py --namespace AWS/EC2 --metric CPUUtilization --dimension InstanceId=i-1234567890abcdef0 --hours 1
python .claude/skills/infrastructure-aws/scripts/get_cloudwatch_metrics.py --namespace AWS/ECS --metric CPUUtilization --dimension ClusterName=prod,ServiceName=api --hours 6 --period 300
python .claude/skills/infrastructure-aws/scripts/list_cloudwatch_alarms.py
python .claude/skills/infrastructure-aws/scripts/list_cloudwatch_alarms.py --state ALARM
python .claude/skills/infrastructure-aws/scripts/list_ec2_instances.py
python .claude/skills/infrastructure-aws/scripts/list_ec2_instances.py --filters "Name=tag:env,Values=production"
python .claude/skills/infrastructure-aws/scripts/list_ec2_instances.py --filters "Name=instance-state-name,Values=running"
python .claude/skills/infrastructure-aws/scripts/describe_ec2_instance.py --instance-id i-1234567890abcdef0
python .claude/skills/infrastructure-aws/scripts/list_ecs_services.py --cluster prod-api
python .claude/skills/infrastructure-aws/scripts/list_ecs_services.py --cluster prod-api --json
python .claude/skills/infrastructure-aws/scripts/describe_ecs_service.py --cluster prod-api --service api-handler
python .claude/skills/infrastructure-aws/scripts/list_lambda_functions.py
python .claude/skills/infrastructure-aws/scripts/list_lambda_functions.py --prefix api-
python .claude/skills/infrastructure-aws/scripts/get_lambda_logs.py --function-name api-handler --hours 1
python .claude/skills/infrastructure-aws/scripts/get_lambda_logs.py --function-name api-handler --filter "ERROR" --limit 20
1. list_ec2_instances.py --filters "Name=instance-state-name,Values=running"
2. describe_ec2_instance.py --instance-id <id>
3. get_cloudwatch_metrics.py --namespace AWS/EC2 --metric CPUUtilization --dimension InstanceId=<id>
4. list_cloudwatch_alarms.py --state ALARM
1. list_ecs_services.py --cluster <cluster>
2. describe_ecs_service.py --cluster <cluster> --service <service>
3. get_cloudwatch_logs.py --log-group /ecs/<service> --filter "ERROR" --hours 1
4. get_cloudwatch_metrics.py --namespace AWS/ECS --metric CPUUtilization --dimension ClusterName=<cluster>,ServiceName=<service>
1. list_lambda_functions.py
2. get_lambda_logs.py --function-name <function> --filter "ERROR" --hours 1
3. get_cloudwatch_metrics.py --namespace AWS/Lambda --metric Errors --dimension FunctionName=<function>
4. get_cloudwatch_metrics.py --namespace AWS/Lambda --metric Duration --dimension FunctionName=<function>
1. list_ec2_instances.py
2. list_ecs_services.py --cluster <cluster>
3. list_lambda_functions.py
4. get_cloudwatch_metrics.py (check utilization of resources)
When reporting findings, use this structure:
## AWS Analysis
**Service**: <EC2/ECS/Lambda/CloudWatch>
**Region**: <region>
**Resource**: <resource identifier>
### Findings
- [Finding with evidence]
### Root Cause Hypothesis
[Based on metrics and logs]
### Recommended Action
[Specific remediation step]
Kubernetes debugging methodology and scripts. Use for pod crashes, CrashLoopBackOff, OOMKilled, deployment issues, resource problems, or container failures.
GitLab project management, CI/CD pipelines, merge requests, and code review. Use when investigating GitLab projects, pipeline failures, merge requests, commits, or issues.
Slack integration for incident communication. Use when searching for context in incident channels, posting status updates, or finding discussions about issues.
ClickUp project management integration for incident tracking and task management
Infrastructure debugging for Kubernetes and AWS. Use when investigating pod crashes, deployment issues, resource problems, container failures, or cloud infrastructure issues.
Safe remediation actions for Kubernetes. Use when proposing or executing pod restarts, deployment scaling, or rollbacks. Always use dry-run first.