$pwd:

monitoring-stack

Name: Monitoring Stack
Author: delorenj

// Manage the homelab monitoring stack at ~/docker/stacks/monitoring/. Services: Prometheus, Grafana, Alertmanager (Telegram via DeLoNETBot), cadvisor, node-exporter, process-exporter, Loki, OTEL collector, Dockge, Uptime Kuma, health-monitor. Use when: (1) adding, editing, or debugging Prometheus alert rules, (2) managing or restarting monitoring services, (3) checking alert delivery or Telegram bot status, (4) diagnosing system performance issues (CPU hogs, memory bloat, swap pressure), (5) adding new scrape targets to Prometheus, (6) configuring Grafana dashboards or datasources, (7) any task referencing "monitoring", "alerts", "prometheus", "grafana", "cadvisor", "process-exporter", "alertmanager", "telegram alerts", or "DeLoNETBot".

$ git log --oneline --stat

stars:9

forks:1

updated: March 25, 2026 at 06:53

SKILL.md

Monitoring Stack

Stack root: ~/docker/stacks/monitoring/

Architecture

See references/architecture.md for full service map, ports, URLs, and data flow.

Common Operations

Restart the full stack

cd ~/docker/stacks/monitoring && docker compose up -d

Restart a single service

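cd ~/docker/stacks/monitoring && docker compose up -d <service-name>

Note: `docker compose up -d` recreates only containers whose configuration or image has changed. To force a restart of containers that are already running, use `docker compose restart` (optionally followed by a service name).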

Hot-reload Prometheus rules (no restart needed)

curl -s -X POST http://localhost:9472/-/reload

Hot-reload Alertmanager config (no restart needed)

curl -s -X POST http://localhost:9784/-/reload
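The two reload calls above can also be scripted. The following is a minimal sketch, assuming the ports shown in the curl commands; `RELOAD_URLS` and `reload_service` are illustrative names, not part of the stack. Prometheus only accepts this request when it was started with `--web.enable-lifecycle`.

```python
import urllib.error
import urllib.request

# Ports copied from the curl commands above; the names below are illustrative.
RELOAD_URLS = {
    "prometheus": "http://localhost:9472/-/reload",
    "alertmanager": "http://localhost:9784/-/reload",
}


def reload_service(name, opener=urllib.request.urlopen, timeout=5):
    """POST to the service's /-/reload endpoint and return (ok, detail)."""
    req = urllib.request.Request(RELOAD_URLS[name], data=b"", method="POST")
    try:
        with opener(req, timeout=timeout) as resp:
            return resp.status == 200, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but refused or failed the reload.
        return False, f"HTTP {exc.code}"
    except urllib.error.URLError as exc:
        # The service is down or the port is wrong.
        return False, f"unreachable: {exc.reason}"
```

Calling `reload_service("prometheus")` after editing a rule file returns `(True, "HTTP 200")` on success. The `opener` parameter exists only so the function can be exercised without a running stack.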

Verify process-exporter is scraping

curl -s http://localhost:9256/metrics | grep namedprocess | head -5
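For scripted checks, the same filter can be applied to the metrics text in Python. This is a sketch only: `namedprocess_samples` is a hypothetical helper, and unlike the plain `grep` above it skips the `# HELP` and `# TYPE` comment lines so that only real samples are counted.

```python
def namedprocess_samples(metrics_text, limit=5):
    """Return up to `limit` sample lines whose metric name starts with namedprocess_."""
    samples = []
    for line in metrics_text.splitlines():
        # Comment lines start with "#", so they never match this prefix.
        if line.startswith("namedprocess_"):
            samples.append(line)
            if len(samples) == limit:
                break
    return samples
```

An empty result for the body fetched from `http://localhost:9256/metrics` suggests that process-exporter is running but matching no processes.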

Test Telegram alert delivery

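import json
import urllib.request

# Replace <TOKEN> with the bot token from alertmanager/config.yml.
url = "https://api.telegram.org/bot<TOKEN>/sendMessage"
data = json.dumps({"chat_id": 7564050286, "text": "Test alert", "parse_mode": "HTML"}).encode()
req = urllib.request.Request(url, data=data, headers={"Content-Type": "application/json"})
# Print Telegram's JSON reply; a successful send contains "ok": true.
with urllib.request.urlopen(req, timeout=10) as resp:
    print(resp.read().decode())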

Bot token is in ~/docker/stacks/monitoring/alertmanager/config.yml.

Adding New Alert Rules

See references/alert-rules-guide.md for rule file locations, PromQL patterns, severity conventions, and examples.

Quick path: Edit the appropriate rule file, then hot-reload Prometheus:

curl -s -X POST http://localhost:9472/-/reload

Rule file locations

File	Scope
`prometheus/alert_rules.yml`	Container CPU, memory, availability
`prometheus/system_alerts.yml`	Host system + per-process hogs
`prometheus/rules/docker-health.yml`	Container health checks, restart loops

Alerting

Alerts route to Telegram via @DeLoNETBot (chat_id: 7564050286).

Config: alertmanager/config.yml

Severity routing:

critical: repeat every 1h
warning: repeat every 4h
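As a worked example of the repeat intervals above, the sketch below computes when a still-firing alert would be re-sent. It is a simplification: it ignores Alertmanager's grouping timers, and `REPEAT_INTERVALS` and `next_repeats` are illustrative names rather than anything in the actual config.

```python
from datetime import datetime, timedelta

# Intervals taken from the severity routing listed above.
REPEAT_INTERVALS = {
    "critical": timedelta(hours=1),
    "warning": timedelta(hours=4),
}


def next_repeats(severity, first_sent, count=3):
    """Return the next `count` times a still-firing alert would be repeated."""
    interval = REPEAT_INTERVALS[severity]
    return [first_sent + interval * i for i in range(1, count + 1)]
```

For example, a warning first sent at 08:00 would repeat at 12:00, 16:00, and 20:00 if it is never resolved.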

Key Design Decisions

cadvisor is resource-capped: 1 CPU, 512M memory, 30s housekeeping interval, pruned metrics. Without these limits it will eat an entire core scanning 69+ containers.
process-exporter exists specifically to catch host-level hogs that cadvisor and node-exporter miss (per-process CPU/memory).
Duplicate rules were consolidated: docker-health.yml only has health-check and restart-loop rules. CPU/memory rules live in alert_rules.yml.

Diagnosing Performance Issues

When the system stutters, check in this order:

1. `ps aux --sort=-%cpu | head -15`: find CPU hogs
2. `free -h`: check swap pressure
3. `sensors`: check thermals (Tctl)
4. `docker stats --no-stream`: find container hogs
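The checklist above can be run in one pass. The following is a rough sketch, not part of the skill: `CHECKS` and `run_checks` are hypothetical names, the `head -15` pipe is replaced by a line cap applied in Python, and tools that are not installed are skipped rather than treated as errors.

```python
import shutil
import subprocess

# (label, command, max output lines) in the same order as the checklist above.
CHECKS = [
    ("CPU hogs", ["ps", "aux", "--sort=-%cpu"], 15),
    ("Swap pressure", ["free", "-h"], None),
    ("Thermals (Tctl)", ["sensors"], None),
    ("Container hogs", ["docker", "stats", "--no-stream"], None),
]


def run_checks(checks=CHECKS, timeout=30):
    """Run each check in order and return a list of (label, output) pairs."""
    results = []
    for label, argv, max_lines in checks:
        if shutil.which(argv[0]) is None:
            results.append((label, f"skipped: {argv[0]} not installed"))
            continue
        try:
            proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
        except subprocess.TimeoutExpired:
            results.append((label, "timed out"))
            continue
        # Fall back to stderr so a failing tool still reports something useful.
        lines = (proc.stdout or proc.stderr).splitlines()
        results.append((label, "\n".join(lines[:max_lines])))
    return results
```

Iterating over `run_checks()` and printing each label followed by its output gives a single snapshot to compare against whatever the alert rules reported.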

The alert rules should catch most of these automatically now. If they don't fire, check:

Prometheus is up: curl http://localhost:9472/-/healthy
Alertmanager is up: curl http://localhost:9784/-/healthy
Rules loaded: curl http://localhost:9472/api/v1/rules | python3 -m json.tool | head -40
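The first 40 lines of pretty-printed JSON rarely show every rule, so a short summary can be easier to read. The sketch below assumes the standard response shape of the Prometheus `/api/v1/rules` endpoint; `is_healthy`, `fetch_json`, and `summarize_rules` are illustrative helpers, and groups are keyed by name only, which assumes group names are unique across rule files.

```python
import json
import urllib.request


def is_healthy(url, timeout=5):
    """Return True if a /-/healthy endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError and socket timeouts both derive from OSError
        return False


def fetch_json(url, timeout=5):
    """GET a URL and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def summarize_rules(payload):
    """Map each rule group name to a list of (rule name, health) pairs."""
    if payload.get("status") != "success":
        raise ValueError(f"unexpected status: {payload.get('status')!r}")
    summary = {}
    for group in payload["data"]["groups"]:
        summary[group["name"]] = [
            (rule["name"], rule.get("health", "unknown")) for rule in group["rules"]
        ]
    return summary
```

With the stack running, `summarize_rules(fetch_json("http://localhost:9472/api/v1/rules"))` lists every loaded group. A rule missing from the summary usually means its file was not picked up or failed to parse on reload.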

related-skills.json

same repository

openclaw-agent-management.md

from "delorenj/skills"

This skill codifies the standards and workflows for this machine's openclaw agent gateway configuration. Use this skill when provisioning new agents or managing existing ones. Trigger phrases are "add a new agent", "update [agent's name] identity"

2026-03-30 (9 stars)

33god.md

from "delorenj/skills"

Unified 33GOD master skill and router. Use for any 33GOD request: architecture context, project creation, task execution, coding workflow, service development, workflow generation, and platform-level orchestration. This skill routes to focused references/workflows/scripts for incremental discovery.

2026-03-30 (9 stars)

ecosystem-patterns.md

from "delorenj/skills"

Use this when creating new projects, generating documentation, cleaning/organizing a repo, suggesting architecture, deploying containers and services, naming files/folders, or when the user references 'ecosystem', 'patterns', or 'containers'. This skill outlines naming conventions, stack preferences, project organization (iMi worktrees), Docker patterns, and PRD structures from past conversations.

2026-03-30 (9 stars)

mise-tasks.md

from "delorenj/skills"

Orchestrate multi-step project workflows using mise task definitions with dependency management and argument handling. Use whenever the user wants to create, edit, or debug mise tasks, wire up task dependencies with depends/depends_post, or run workflows via 'mise run'. Also use when setting up task runners or automating build pipelines through mise. Do NOT use for mise environment variable configuration (use mise-configuration instead) or for general shell scripting unrelated to mise.

2026-03-30 (9 stars)

obsidian-vault-management.md

from "delorenj/skills"

Codifies the standards and conventions for the creation, organization, categorization, and aggregation of data in the DeLoDocs obsidian vault. Use this skill when asked to add something 'to my vault' or 'in my vault', when creating any reports, instructions, updates, or any markdown documents that are intended to be viewed, and are NOT part of a project or repository.

2026-03-30 (9 stars)

bub-email-triage.md

from "delorenj/skills"

Bub's canonical workflow for inbox/calendar triage escalation filtering (known contacts, money/security risk, high-signal AI changes).

2026-03-25 (9 stars)

package.json

"author": "delorenj"

"repository": "delorenj/skills"


$ useful --forSOC

Network and Computer Systems Administrators, Computer and Mathematical, 15-1244, L4
