monitoring-skill
Monitoring and observability with Prometheus, Grafana, ELK Stack, and distributed tracing.
| Field | Value |
|---|---|
| name | monitoring-skill |
| description | Monitoring and observability with Prometheus, Grafana, ELK Stack, and distributed tracing. |
| sasmp_version | 1.3.0 |
| bonded_agent | 06-monitoring-observability |
| bond_type | PRIMARY_BOND |
| parameters | [{"name":"pillar","type":"string","required":false,"enum":["metrics","logs","traces","all"],"default":"all"},{"name":"tool","type":"string","required":false,"enum":["prometheus","grafana","elk","jaeger"],"default":"prometheus"}] |
| retry_config | {"strategy":"exponential_backoff","initial_delay_ms":1000,"max_retries":3} |
| observability | {"logging":"structured","metrics":"enabled"} |
Master the three pillars of observability: metrics, logs, and traces.
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| pillar | string | No | all | Observability pillar to cover: `metrics`, `logs`, `traces`, or `all` |
| tool | string | No | prometheus | Tool to focus on: `prometheus`, `grafana`, `elk`, or `jaeger` |
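As an illustration of how the defaults and allowed values in the table fit together, here is a minimal sketch of a parameter resolver. The `SCHEMA` dictionary and the `resolve_params` function are hypothetical names introduced for this example; how the skill runtime actually validates parameters is not specified here.

```python
# Hypothetical resolver: SCHEMA mirrors the parameter table above,
# it is not part of any real skill runtime.
SCHEMA = {
    "pillar": {"enum": ["metrics", "logs", "traces", "all"], "default": "all"},
    "tool": {"enum": ["prometheus", "grafana", "elk", "jaeger"], "default": "prometheus"},
}


def resolve_params(given):
    """Fill in defaults and reject unknown parameter names or values."""
    unknown = set(given) - set(SCHEMA)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    resolved = {}
    for name, spec in SCHEMA.items():
        value = given.get(name, spec["default"])
        if value not in spec["enum"]:
            raise ValueError(f"{name} must be one of {spec['enum']}, got {value!r}")
        resolved[name] = value
    return resolved
```

Calling `resolve_params({"pillar": "logs"})` keeps the requested pillar and falls back to `prometheus` for the tool, since both parameters are optional.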
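```
# PromQL
# Request rate per service over the last 5 minutes
sum(rate(http_requests_total[5m])) by (service)
# 99th percentile request latency
histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))
# Percentage of requests that returned a 5xx status
100 * sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))

# Prometheus API
# List scrape targets and their health
curl http://localhost:9090/api/v1/targets
# Run an instant query
curl 'http://localhost:9090/api/v1/query?query=up'
# Reload the configuration (Prometheus must run with --web.enable-lifecycle)
curl -X POST http://localhost:9090/-/reload

# Alertmanager
# Silence an alert for two hours (add --comment if your amtool setup requires one)
amtool silence add alertname="HighLatency" --duration=2h
# List current alerts
amtool alert
```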
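The instant-query call above can also be made from a script. The following is a minimal sketch using only the Python standard library. The function names `build_query_url`, `extract_vector`, and `instant_query` are illustrative, and the response handling assumes the query returns an instant vector in the standard `/api/v1/query` JSON envelope.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, promql):
    """Return the instant-query URL for a PromQL expression (URL-encoded)."""
    query = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"


def extract_vector(payload):
    """Turn an instant-vector response into a list of (labels, value) pairs."""
    if payload.get("status") != "success":
        raise RuntimeError(payload.get("error", "query failed"))
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each sample looks like {"metric": {labels}, "value": [timestamp, "value as string"]}.
    return [(sample["metric"], float(sample["value"][1])) for sample in data["result"]]


def instant_query(base_url, promql, timeout=10):
    """Run the query against a live Prometheus server (needs network access)."""
    with urllib.request.urlopen(build_query_url(base_url, promql), timeout=timeout) as resp:
        return extract_vector(json.load(resp))
```

With a server on the address used in the curl examples, `instant_query("http://localhost:9090", "up")` would return one pair per scraped target. Encoding the expression with `urlencode` matters for queries such as the error-rate one, which contain braces, quotes, and spaces.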
| Signal | Metric |
|---|---|
| Latency | histogram_quantile(0.99, ...) |
| Traffic | sum(rate(requests_total[5m])) |
| Errors | rate(errors_total[5m]) |
| Saturation | node_memory_MemAvailable_bytes |
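To make the latency signal less opaque, here is a simplified sketch of the estimate that `histogram_quantile` performs: it finds the bucket containing the requested rank and interpolates linearly inside it. This is an illustration, not Prometheus's implementation. It assumes cumulative bucket counts, positive bucket bounds, a final `+Inf` bucket, and `0 < q <= 1`.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate a quantile from cumulative buckets given as (upper_bound, count).

    Simplified sketch: assumes 0 < q <= 1, positive bounds, and a last
    bucket whose upper bound is +Inf.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        raise ValueError("need at least two buckets, the last with le=+Inf")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    idx = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    if idx == len(buckets) - 1:
        # The rank falls in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]
    upper, count = buckets[idx]
    lower, below = (0.0, 0.0) if idx == 0 else buckets[idx - 1]
    # Assume observations are spread evenly inside the bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)
```

For buckets `[(0.1, 50), (0.5, 90), (1.0, 100), (inf, 100)]`, the 0.99 quantile lands in the 0.5 to 1.0 bucket and interpolates to roughly 0.95 seconds. Because the estimate cannot be finer than the bucket it lands in, bucket boundaries should bracket the latency targets you care about.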
| Symptom | Root Cause | Solution |
|---|---|---|
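| No data | Scrape failing | Check the `/targets` page for target health and the last scrape error |
| Alert not firing | PromQL error | Test the alert expression in the Prometheus UI |
| High cardinality | Too many labels | Drop or reduce high-cardinality labels |
| Slow queries | Too much data | Add aggregation to the query, for example `sum by (service)` |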
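For the "No data" row, the same information shown on the targets page is available from `/api/v1/targets`. The sketch below filters that response for targets that are not healthy. The function name `unhealthy_targets` is illustrative, and the code assumes the standard response shape with an `activeTargets` list carrying `health`, `scrapeUrl`, and `lastError` fields.

```python
def unhealthy_targets(payload):
    """Return (scrape_url, last_error) for every active target that is not up."""
    if payload.get("status") != "success":
        raise RuntimeError(payload.get("error", "targets request failed"))
    return [
        (target.get("scrapeUrl", ""), target.get("lastError", ""))
        for target in payload["data"]["activeTargets"]
        # health is "up", "down", or "unknown"; anything but "up" needs a look.
        if target.get("health") != "up"
    ]
```

One way to use it is to save the output of `curl http://localhost:9090/api/v1/targets` to a file, load it with `json.load`, and pass the resulting dictionary to `unhealthy_targets`. An empty list means every active target was scraped successfully.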
For debugging, check the `/targets` page and the service logs with `journalctl -u prometheus`.