Name: Metrics Query Cookbook
Author: kubev2v

updated: March 28, 2026, 15:12


package.json

"author": "kubev2v"

"repository": "kubev2v/mtv-agent"


name	metrics-query-cookbook
description	Cookbook of ready-to-use PromQL queries, preset catalog, metric name dictionaries, and label references for Ceph storage, network traffic, pod statistics, and MTV migrations. Use when you need specific queries, exact metric names, or label filters.

Metrics Query Cookbook

Ready-to-use queries, preset catalog, and metric name/label references for OpenShift clusters with ODF, OVN-Kubernetes, KubeVirt, and Forklift/MTV.

All examples use the kubectl-metrics MCP server tools (metrics_read and metrics_help).

Output format guidance: Use the default (markdown) when presenting results to the user. Use output: "json" only when you need to parse values programmatically. Use selector to filter results by label after the query runs.

Preset Catalog

Every preset works as both an instant (default) and range query. Pass start to get a time-series trend.
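As a rough illustration of the instant-versus-range distinction, the Python sketch below builds both payloads side by side. The helper name `preset_call` is hypothetical, and the `-1h` value assumes the tool accepts relative offsets for `start` (the historical-query notes in this cookbook mention that form); check metrics_help for the accepted formats.

```python
import json

def preset_call(name, start=None):
    """Build a metrics_read preset payload (hypothetical helper)."""
    flags = {"name": name, "output": "markdown"}
    if start is not None:
        # Assumption: adding "start" turns the preset into a range query.
        flags["start"] = start
    return {"command": "preset", "flags": flags}

instant = preset_call("cluster_cpu_utilization")
trend = preset_call("cluster_cpu_utilization", start="-1h")
print("metrics_read", json.dumps(instant))
print("metrics_read", json.dumps(trend))
```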

Cluster & Namespace

Preset	Description
`cluster_cpu_utilization`	Cluster CPU utilization percentage
`cluster_memory_utilization`	Cluster memory utilization percentage
`cluster_pod_status`	Pod counts by phase (Running, Pending, Failed, Succeeded, Unknown)
`cluster_node_readiness`	Node readiness status counts
`namespace_cpu_usage`	Top 10 namespaces by CPU usage (cores)
`namespace_memory_usage`	Top 10 namespaces by memory usage (bytes)
`namespace_network_rx`	Top 10 namespaces by network receive rate
`namespace_network_tx`	Top 10 namespaces by network transmit rate
`namespace_network_errors`	Network errors + drops by namespace (top 10)
`pod_restarts_top10`	Top 10 pods by container restart count

Forklift / MTV Migration

Preset	Description
`mtv_migration_status`	Migration counts by status (succeeded/failed/running)
`mtv_plan_status`	Plan-level status counts
`mtv_migration_duration`	Migration duration per plan (seconds)
`mtv_avg_migration_duration`	Average migration duration (seconds)
`mtv_data_transferred`	Total bytes migrated per plan
`mtv_net_throughput`	Migration network throughput
`mtv_storage_throughput`	Migration storage throughput
`mtv_migration_pod_rx`	Migration pod receive rate (bytes/sec, top 20)
`mtv_migration_pod_tx`	Migration pod transmit rate (bytes/sec, top 20)
`mtv_forklift_traffic`	Forklift operator pod network traffic (bytes/sec)
`mtv_vmi_migrations_pending`	KubeVirt VMI migrations in pending phase
`mtv_vmi_migrations_running`	KubeVirt VMI migrations in running phase

Storage Metrics (Ceph / ODF)

Cluster-wide storage health

metrics_read { "command": "query", "flags": { "query": "ceph_health_status", "output": "markdown" } }

Result: 0 = OK, 1 = WARN, 2 = ERR.
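When the value is parsed programmatically (output: "json"), a small lookup keeps the mapping in one place. This is a minimal sketch; `describe_ceph_health` is a hypothetical helper, not part of the metrics tools.

```python
CEPH_HEALTH = {0: "OK", 1: "WARN", 2: "ERR"}

def describe_ceph_health(value):
    """Map a ceph_health_status sample to its label."""
    # Samples may arrive as floats or as strings in JSON, so normalise first.
    code = int(float(value))
    return CEPH_HEALTH.get(code, f"UNKNOWN({code})")

print(describe_ceph_health("1"))
```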

Storage capacity

metrics_read { "command": "query", "flags": { "query": "ceph_cluster_total_bytes", "output": "markdown" } }

metrics_read { "command": "query", "flags": { "query": "ceph_cluster_total_used_bytes", "output": "markdown" } }

Pool-level statistics

metrics_read { "command": "query", "flags": { "query": "ceph_pool_percent_used * 100", "output": "markdown" } }

Pool I/O rates

metrics_read { "command": "query", "flags": { "query": "rate(ceph_pool_rd[5m])", "output": "markdown" } }

metrics_read { "command": "query", "flags": { "query": "rate(ceph_pool_wr[5m])", "output": "markdown" } }

OSD operation latency

metrics_read { "command": "query", "flags": { "query": "rate(ceph_osd_op_latency_sum[5m]) / rate(ceph_osd_op_latency_count[5m])", "output": "markdown" } }
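The ratio above divides the growth of the latency sum by the growth of the operation count over the same window, so the window length cancels and the result is the average latency per operation. The Python sketch below works the arithmetic on made-up counter values, assuming the sum counter is in seconds.

```python
def avg_op_latency(sum_start, sum_end, count_start, count_end):
    """Average latency per operation between two scrapes of the counters."""
    ops = count_end - count_start
    if ops <= 0:
        return None  # no operations (or a counter reset) in the window
    return (sum_end - sum_start) / ops

# Made-up counter values taken five minutes apart.
latency = avg_op_latency(sum_start=120.0, sum_end=123.0,
                         count_start=40_000, count_end=41_500)
print(f"{latency * 1000:.1f} ms per op")
```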

Placement group health

metrics_read { "command": "query", "flags": { "query": "ceph_pg_total", "output": "markdown" } }

metrics_read { "command": "query", "flags": { "query": "ceph_pg_degraded", "output": "markdown" } }

Available labels on ceph_* metrics

Label	Description	Example values
`pool_id`	Ceph pool identifier (pool-level metrics)	`1`, `2`, `3`, `4`
`ceph_daemon`	OSD daemon name (OSD-level metrics)	`osd.0`, `osd.1`, `osd.2`
`namespace`	Storage operator namespace	`openshift-storage`
`managedBy`	Managing resource	`ocs-storagecluster`
`job`	Scrape job	`rook-ceph-mgr`, `rook-ceph-exporter`

Storage metrics reference

Metric	Description
`ceph_health_status`	Overall cluster health (0=OK, 1=WARN, 2=ERR)
`ceph_cluster_total_bytes`	Total cluster capacity
`ceph_cluster_total_used_bytes`	Used cluster capacity
`ceph_pool_percent_used`	Per-pool usage as a ratio (0-1; multiply by 100 for a percentage)
`ceph_pool_stored`	Bytes stored per pool
`ceph_pool_max_avail`	Available bytes per pool
`ceph_pool_rd`, `ceph_pool_wr`	Read/write operation counters per pool (wrap in rate() for IOPS)
`ceph_pool_rd_bytes`, `ceph_pool_wr_bytes`	Read/write byte counters per pool (wrap in rate() for throughput)
`ceph_osd_op_latency_sum/count`	OSD operation latency (use as rate ratio)
`ceph_pg_total`, `ceph_pg_active`, `ceph_pg_degraded`	Placement group counts
`node_filesystem_avail_bytes`, `node_filesystem_size_bytes`	Node filesystem capacity

Network Traffic Metrics

Network traffic by namespace

metrics_read { "command": "preset", "flags": { "name": "namespace_network_rx", "output": "markdown" } }

metrics_read { "command": "preset", "flags": { "name": "namespace_network_tx", "output": "markdown" } }

Network traffic by pod in a namespace

Replace TARGET_NAMESPACE with the actual namespace -- ASK the user if not known.

metrics_read {
  "command": "query",
  "flags": { "query": "topk(10, sort_desc(sum by (pod)(rate(container_network_receive_bytes_total{namespace=\"TARGET_NAMESPACE\"}[5m]))))", "output": "markdown" }
}

metrics_read {
  "command": "query",
  "flags": { "query": "topk(10, sort_desc(sum by (pod)(rate(container_network_transmit_bytes_total{namespace=\"TARGET_NAMESPACE\"}[5m]))))", "output": "markdown" }
}
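Substituting the namespace by hand makes it easy to get the nested quoting wrong. The Python sketch below builds the same query from a namespace argument and lets json.dumps handle the escaping; `pod_network_query` is a hypothetical helper and `openshift-mtv` is only an example namespace.

```python
import json

def pod_network_query(namespace, direction="receive", limit=10):
    """Build a metrics_read payload for per-pod traffic in one namespace."""
    if direction not in ("receive", "transmit"):
        raise ValueError("direction must be 'receive' or 'transmit'")
    metric = f"container_network_{direction}_bytes_total"
    promql = (
        f'topk({limit}, sort_desc(sum by (pod)('
        f'rate({metric}{{namespace="{namespace}"}}[5m]))))'
    )
    return {"command": "query", "flags": {"query": promql, "output": "markdown"}}

payload = pod_network_query("openshift-mtv")
print("metrics_read", json.dumps(payload, indent=2))
```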

Network errors and drops by namespace

metrics_read { "command": "preset", "flags": { "name": "namespace_network_errors", "output": "markdown" } }

Node-level network throughput

metrics_read {
  "command": "query",
  "flags": { "query": "instance:node_network_receive_bytes_excluding_lo:rate1m + instance:node_network_transmit_bytes_excluding_lo:rate1m", "output": "markdown" }
}

Available labels on network metrics

Label	Description	Example values
`namespace`	Pod namespace	`openshift-storage`, `konveyor-forklift`
`pod`	Pod name	`forklift-controller-6df77f6bf5-jtt7q`
`interface`	Network interface (per-pod metrics)	`eth0`
`instance`	Node instance (node-level metrics)	`10.0.0.5:9100`
`node`	Node name (node-level metrics)	`worker-0`

Network metrics reference

Metric	Description
`container_network_receive_bytes_total`	Bytes received per pod/namespace
`container_network_transmit_bytes_total`	Bytes transmitted per pod/namespace
`container_network_receive_errors_total`	Receive errors per pod/namespace
`container_network_transmit_errors_total`	Transmit errors per pod/namespace
`container_network_receive_packets_dropped_total`	Dropped receive packets
`container_network_transmit_packets_dropped_total`	Dropped transmit packets
`node_network_receive_bytes_total`	Bytes received per node/interface
`node_network_transmit_bytes_total`	Bytes transmitted per node/interface
`instance:node_network_receive_bytes_excluding_lo:rate1m`	Pre-computed node receive rate
`instance:node_network_transmit_bytes_excluding_lo:rate1m`	Pre-computed node transmit rate

Pod and Container Statistics

Pod count by namespace

metrics_read { "command": "query", "flags": { "query": "topk(15, count by (namespace)(kube_pod_info))", "output": "markdown" } }

Pod phase summary

metrics_read { "command": "preset", "flags": { "name": "cluster_pod_status", "output": "markdown" } }

Container CPU usage by namespace

metrics_read { "command": "preset", "flags": { "name": "namespace_cpu_usage", "output": "markdown" } }

Container memory usage by namespace

metrics_read { "command": "preset", "flags": { "name": "namespace_memory_usage", "output": "markdown" } }

Container restart counts (instability indicator)

metrics_read { "command": "preset", "flags": { "name": "pod_restarts_top10", "output": "markdown" } }

Pods with high recent restarts (use `debug_read` for details)

After finding pods with high restarts, use debug_read to get pod details and logs:

debug_read { "command": "list", "flags": { "resource": "pods", "namespace": "<NAMESPACE>", "query": "where status.containerStatuses[0].restartCount > 5", "output": "markdown" } }

debug_read { "command": "logs", "flags": { "name": "<POD_NAME>", "namespace": "<NAMESPACE>", "tail": 100, "query": "where level = 'ERROR'", "output": "markdown" } }

Available labels on pod/container metrics

Label	Description	Example values
`namespace`	Pod namespace	`konveyor-forklift`, `openshift-cnv`
`pod`	Pod name	`forklift-controller-6df77f6bf5-jtt7q`
`container`	Container name	`main`, `inventory`, `extract`
`node`	Node the pod runs on	`worker-0`, `worker-1`
`phase`	Pod phase (on status metrics)	`Running`, `Pending`, `Failed`, `Succeeded`
`uid`	Pod UID	`793fb1cb-3e58-4eef-b95a-733f237365a3`
`created_by_kind`	Owner resource kind (on kube_pod_info)	`ReplicaSet`, `DaemonSet`, `StatefulSet`
`created_by_name`	Owner resource name (on kube_pod_info)	`forklift-controller-6df77f6bf5`
`host_ip`	Node IP (on kube_pod_info)	`192.168.0.77`
`pod_ip`	Pod IP (on kube_pod_info)	`10.129.3.3`

Pod/container metrics reference

Metric	Description
`kube_pod_info`	Pod metadata (node, namespace, IPs, owner)
`kube_pod_status_phase`	Pod phase (Running/Pending/Failed/Succeeded)
`kube_pod_container_status_restarts_total`	Container restart count
`kube_pod_container_status_waiting_reason`	Waiting reason (CrashLoopBackOff, ImagePullBackOff, etc.)
`container_cpu_usage_seconds_total`	Container CPU usage
`container_memory_working_set_bytes`	Container memory usage
`namespace:container_cpu_usage:sum`	Pre-aggregated CPU by namespace
`namespace:container_memory_usage_bytes:sum`	Pre-aggregated memory by namespace

Forklift / MTV Migration Metrics

Available labels on mtv_* metrics

All mtv_* metrics share these labels for filtering and grouping:

Label	Description	Example values
`provider`	Source provider type	`vsphere`, `ovirt`, `openstack`, `ova`, `ec2`
`mode`	Migration mode	`Cold`, `Warm`
`target`	Target cluster	`Local` (host cluster) or remote cluster name
`owner`	User who owns the migration	`admin@example.com`
`plan`	Migration plan UUID	`363ce137-dace-4fb4-b815-759c214c9fec`
`namespace`	Forklift operator namespace	`konveyor-forklift`, `openshift-mtv`
`status`	Migration/plan status (on status metrics)	`Succeeded`, `Failed`, `Executing`

MTV migration metrics reference

Metric	Description
`mtv_migrations_status_total`	Migration counts by status (succeeded/failed/running)
`mtv_plans_status`	Plan-level status counts
`mtv_migration_data_transferred_bytes`	Total bytes migrated per plan
`mtv_migration_net_throughput`	Migration network throughput
`mtv_migration_storage_throughput`	Migration storage throughput
`mtv_migration_duration_seconds`	Migration duration per plan
`mtv_plan_alert_status`	Alerts on migration plans
`mtv_workload_migrations_status_total`	Per-workload migration status (per plan + status)
`kubevirt_vmi_migrations_in_pending_phase`	Live VMI migrations pending
`kubevirt_vmi_migrations_in_running_phase`	Live VMI migrations in progress

Migration status overview

metrics_read { "command": "preset", "flags": { "name": "mtv_migration_status", "output": "markdown" } }

Migration plan status

metrics_read { "command": "preset", "flags": { "name": "mtv_plan_status", "output": "markdown" } }

Migration data transfer and throughput

metrics_read { "command": "preset", "flags": { "name": "mtv_data_transferred", "output": "markdown" } }

metrics_read { "command": "preset", "flags": { "name": "mtv_net_throughput", "output": "markdown" } }

metrics_read { "command": "preset", "flags": { "name": "mtv_storage_throughput", "output": "markdown" } }

Migration duration

metrics_read { "command": "preset", "flags": { "name": "mtv_migration_duration", "output": "markdown" } }

metrics_read { "command": "preset", "flags": { "name": "mtv_avg_migration_duration", "output": "markdown" } }

Migration alerts

metrics_read { "command": "query", "flags": { "query": "mtv_plan_alert_status", "output": "markdown" } }

Narrowing migration metrics with label filters

Use {label="value"} in PromQL or use the selector flag:

metrics_read { "command": "query", "flags": { "query": "mtv_migration_data_transferred_bytes", "selector": "provider=vsphere", "output": "markdown" } }

metrics_read { "command": "query", "flags": { "query": "mtv_migration_data_transferred_bytes{mode=\"Cold\"}", "output": "markdown" } }

metrics_read { "command": "query", "flags": { "query": "mtv_migration_data_transferred_bytes{provider=\"ovirt\", mode=\"Warm\"}", "output": "markdown" } }

metrics_read { "command": "query", "flags": { "query": "mtv_migrations_status_total{status=\"Failed\"}", "output": "markdown" } }

metrics_read { "command": "query", "flags": { "query": "mtv_workload_migrations_status_total{plan=\"PLAN_UUID\", status=\"Failed\"}", "output": "markdown" } }
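For filters assembled at runtime, the matcher block can be generated rather than typed. The following Python sketch renders keyword arguments as a PromQL `{label="value"}` block; `label_matchers` is a hypothetical helper and only covers equality matchers.

```python
def label_matchers(**labels):
    """Render keyword arguments as a PromQL matcher block like {a="x", b="y"}."""
    parts = []
    for key, value in labels.items():
        # Escape backslashes first, then double quotes, inside the PromQL string.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'{key}="{escaped}"')
    return "{" + ", ".join(parts) + "}" if parts else ""

query = "mtv_migration_data_transferred_bytes" + label_matchers(provider="ovirt", mode="Warm")
print(query)
```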

Grouping migration metrics

metrics_read { "command": "query", "flags": { "query": "sum by (provider)(mtv_migration_data_transferred_bytes)", "output": "markdown" } }

metrics_read { "command": "query", "flags": { "query": "sum by (mode)(mtv_migration_data_transferred_bytes)", "output": "markdown" } }

metrics_read { "command": "query", "flags": { "query": "sum by (provider, mode)(mtv_migration_data_transferred_bytes)", "output": "markdown" } }

metrics_read { "command": "query", "flags": { "query": "sum by (status, provider)(mtv_migrations_status_total)", "output": "markdown" } }

metrics_read { "command": "query", "flags": { "query": "avg by (provider)(mtv_migration_duration_seconds)", "output": "markdown" } }

metrics_read { "command": "query", "flags": { "query": "sum by (plan, status)(mtv_workload_migrations_status_total)", "output": "markdown" } }

metrics_read { "command": "query", "flags": { "query": "sum by (provider, status)(mtv_plans_status)", "output": "markdown" } }

Network traffic of migration pods

During active Forklift migrations, data-transfer pods run in the target namespace. Migration pod names follow the pattern {plan-name}-{vm-id}-{random} (e.g. test-vmware-metrics-vm-43-tws62).

Step 1 -- Discover migration pods:

VMware/general migration pods (carry a plan label):

debug_read { "command": "list", "flags": { "resource": "pods", "namespace": "<NAMESPACE>", "selector": "plan", "output": "markdown" } }

oVirt/OpenStack populator pods (named populate-{uuid}-...):

debug_read { "command": "list", "flags": { "resource": "pods", "namespace": "<NAMESPACE>", "query": "where name ~= '^populate-'", "output": "markdown" } }

Step 2 -- Query network traffic for discovered pods:

Use the pod names from Step 1 to build a regex filter (replace POD1|POD2 with the actual names and TARGET_NAMESPACE with the namespace the pods run in):

metrics_read {
  "command": "query",
  "flags": { "query": "topk(10, sort_desc(sum by (pod)(rate(container_network_receive_bytes_total{namespace=\"TARGET_NAMESPACE\",pod=~\"POD1|POD2\"}[5m]))))", "output": "markdown" }
}

metrics_read {
  "command": "query",
  "flags": { "query": "topk(10, sort_desc(sum by (pod)(rate(container_network_transmit_bytes_total{namespace=\"TARGET_NAMESPACE\",pod=~\"POD1|POD2\"}[5m]))))", "output": "markdown" }
}
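Building the POD1|POD2 alternation can be scripted. The Python sketch below joins pod names into a value for pod=~"..."; PromQL regex matches are fully anchored, so no ^ or $ is needed. `pod_regex` is a hypothetical helper, and the second pod name in the example is made up.

```python
import re

# Pod names are DNS subdomains: lowercase alphanumerics, "-" and ".".
_POD_NAME = re.compile(r"[a-z0-9]([-a-z0-9.]*[a-z0-9])?")

def pod_regex(names):
    """Join pod names into an alternation usable with pod=~"..."."""
    if not names:
        raise ValueError("need at least one pod name")
    parts = []
    for name in names:
        if not _POD_NAME.fullmatch(name):
            raise ValueError(f"unexpected pod name: {name!r}")
        # "." is the only regex metacharacter a valid name can contain;
        # "[.]" matches it literally without needing backslash escapes.
        parts.append(name.replace(".", "[.]"))
    return "|".join(parts)

print(pod_regex(["test-vmware-metrics-vm-43-tws62", "populate-example.1"]))
```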

Short-lived pod network metrics

Pods that run for less than about 60 seconds (e.g. oVirt/OpenStack populator pods) may not have container-level network metrics (container_network_*). cAdvisor needs 1-2 collection cycles (~10-20s) to establish network namespace tracking, and the pod may complete before tracking starts. CPU and memory metrics are unaffected.

Node-level network metrics still capture the transfer. Determine which node ran the pod (spec.nodeName or kube_pod_info), then query RX and TX together:

metrics_read {
  "command": "query_range",
  "flags": {
    "query": [
      "instance:node_network_receive_bytes_excluding_lo:rate1m{instance=~\"NODE_NAME.*\"}",
      "instance:node_network_transmit_bytes_excluding_lo:rate1m{instance=~\"NODE_NAME.*\"}"
    ],
    "name": ["node_rx", "node_tx"],
    "start": "<MIGRATION_START>",
    "end": "<MIGRATION_END>",
    "step": "30s",
    "output": "markdown"
  }
}

Compare against baseline before/after the migration window to isolate transfer traffic.
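One way to do the comparison is to subtract the mean baseline rate from each sample in the window and multiply the excess by the step. The Python sketch below is an approximation under that assumption, using made-up samples; `estimate_transfer_bytes` is a hypothetical helper, and other traffic on the node will inflate the result.

```python
def estimate_transfer_bytes(window_rates, baseline_rates, step_seconds):
    """Estimate bytes attributable to the migration from node-level rates.

    window_rates:   bytes/sec samples taken during the migration window
    baseline_rates: bytes/sec samples taken before or after the window
    """
    if not baseline_rates:
        raise ValueError("baseline_rates must not be empty")
    baseline = sum(baseline_rates) / len(baseline_rates)
    # Clamp at zero so dips below baseline do not subtract from the total.
    excess = [max(rate - baseline, 0.0) for rate in window_rates]
    return sum(excess) * step_seconds

# Made-up samples at a 30s step.
baseline = [2_000_000, 2_200_000, 1_800_000]
window = [2_100_000, 52_000_000, 102_000_000, 52_000_000, 2_000_000]
print(estimate_transfer_bytes(window, baseline, step_seconds=30))
```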

CPU activity confirms the pod was active during the window:

metrics_read { "command": "query_range", "flags": { "query": "rate(container_cpu_usage_seconds_total{pod=\"<POD>\",namespace=\"<NS>\"}[1m])", "start": "<START>", "end": "<END>", "step": "30s", "output": "markdown" } }

Completed migration historical queries

When querying metrics for a migration that already finished, use the plan's start/completion timestamps as absolute time bounds:

Get timestamps from the plan:

mtv_read { "command": "describe plan", "flags": { "name": "<PLAN>", "namespace": "<NS>", "output": "markdown" } }

Use ISO-8601 start/end in query_range:

metrics_read {
  "command": "query_range",
  "flags": {
    "query": "sum by (pod)(rate(container_network_receive_bytes_total{namespace=\"<NS>\"}[5m]))",
    "start": "2025-06-15T10:00:00Z",
    "end": "2025-06-15T12:30:00Z",
    "step": "60s",
    "output": "markdown"
  }
}

Do not use relative offsets like -1h for completed migrations -- the data may fall outside that window.
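If the plan timestamps carry a non-UTC offset, they can be normalised before being passed as start/end. The Python sketch below converts a timezone-aware timestamp to the UTC form shown above; `to_query_time` is a hypothetical helper and the input value is made up.

```python
from datetime import datetime, timezone

def to_query_time(value):
    """Format an aware datetime as a UTC ISO-8601 string for start/end."""
    if value.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

# Example input with a +02:00 offset (made up).
started = datetime.fromisoformat("2025-06-15T12:00:00+02:00")
print(to_query_time(started))
```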

Checking migration pod status with `debug_read`

To investigate migration pod issues alongside metrics:

debug_read { "command": "list", "flags": { "resource": "pods", "namespace": "<NAMESPACE>", "selector": "plan", "output": "markdown" } }

debug_read { "command": "logs", "flags": { "name": "<POD_NAME>", "namespace": "<NAMESPACE>", "tail": 100, "query": "where level = 'ERROR'", "output": "markdown" } }

Network traffic of the Forklift operator itself

metrics_read { "command": "preset", "flags": { "name": "mtv_forklift_traffic", "output": "markdown" } }

KubeVirt VMI migration metrics

These track live VM migrations (vMotion-style), not Forklift cold migrations:

metrics_read { "command": "preset", "flags": { "name": "mtv_vmi_migrations_pending", "output": "markdown" } }

metrics_read { "command": "preset", "flags": { "name": "mtv_vmi_migrations_running", "output": "markdown" } }

Quick Health Dashboard

Run key queries for a cluster overview:

metrics_read { "command": "preset", "flags": { "name": "cluster_cpu_utilization", "output": "markdown" } }

metrics_read { "command": "preset", "flags": { "name": "cluster_memory_utilization", "output": "markdown" } }

metrics_read { "command": "query", "flags": { "query": "ceph_health_status", "output": "markdown" } }

metrics_read { "command": "preset", "flags": { "name": "namespace_network_rx", "output": "markdown" } }

metrics_read { "command": "preset", "flags": { "name": "mtv_migration_status", "output": "markdown" } }
