Run any Skill in Manus with one click

$pwd:

grafana-dashboard

Name: Grafana Dashboard
Author: tangcent

// Build and modify Grafana dashboards following the user's established style conventions. Use when creating new dashboards, adding panels, modifying existing dashboards, or reviewing dashboard JSON for consistency. Requires the Grafana MCP to be enabled.

Run Skill in Manus

$ git log --oneline --stat

stars:1

forks:1

updated:March 23, 2026 at 13:52

SKILL.md

readonly

related-skills.json

same repository

create-skill.md

from "tangcent/ai-cookbook"

Create or update a skill in this repository under skills/<skill-name>. Use when the user asks to add a new skill, improve an existing skill, scaffold skill structure, or align a skill with repository standards.

2026-03-231

obsidian.md

from "tangcent/ai-cookbook"

Work with Obsidian vaults — create and edit Obsidian Flavored Markdown (.md), Bases (.base), JSON Canvas (.canvas), and interact with vaults via the Obsidian CLI. Use when working with Obsidian notes, wikilinks, callouts, embeds, properties, frontmatter, tags, bases, canvas files, or vault operations.

2026-03-231

gitlab-cli.md

from "tangcent/ai-cookbook"

Operate GitLab from terminal with `glab` for repositories, merge requests, issues, pipelines, releases, and API calls. Use for any GitLab task. For template-compliant MR creation, load reference workflow under references/.

2026-03-231

github-cli.md

from "tangcent/ai-cookbook"

Operate GitHub from terminal with `gh` for repositories, pull requests, issues, releases, actions, and API calls. Use for any GitHub task. For template-compliant PR or issue creation, load reference workflows under references/.

2026-03-231

git-workflow.md

from "tangcent/ai-cookbook"

Manage day-to-day Git workflow with safe branch creation, conventional commits, rebasing, pushing, MR creation, active-branch sync, and integration-branch consolidation. Use when the user asks to create branches, commit, push, sync feature/fix branches, rebuild dev branches, or open merge requests.

2026-03-231

learn-from-mistakes.md

from "tangcent/ai-cookbook"

Detect when the user is correcting the AI's approach or pointing out mistakes. Trigger phrases include 'you should', 'why not', 'that's wrong', 'use X instead', 'not like that', 'I said', 'no, do', 'wrong tool', 'wrong approach', 'I meant', 'actually', 'instead of', '不对', '应该用', '为什么不', '错了'. On activation, the AI reflects on what went wrong, identifies the correct approach, and logs the lesson to Obsidian for future reference.

2026-03-191

package.json

"author": "tangcent"

"repository": "tangcent/ai-cookbook"

View GitHub Repository View Creator Repositories

$ install --global

$ download --local

Run Skill in Manus

$ useful --forSOC

Software DevelopersComputer and Mathematical Occupations15-1252L4

name	grafana-dashboard
description	Build and modify Grafana dashboards following the user's established style conventions. Use when creating new dashboards, adding panels, modifying existing dashboards, or reviewing dashboard JSON for consistency. Requires the Grafana MCP to be enabled.

Grafana Dashboard Skill

When to Activate

Activate this skill when the user asks to:

Create a new Grafana dashboard
Add or modify panels on an existing dashboard
Review a dashboard for style consistency
Build monitoring dashboards for a service or component
Generate dashboard JSON or PromQL queries for Grafana

Pre-Flight Check — Grafana MCP Required

Before performing any Grafana operation, verify the Grafana MCP is available by attempting a lightweight call (e.g. search_dashboards).

If the Grafana MCP is not available or not responding, stop and tell the user:

The Grafana MCP server is not enabled. To use this skill you need to configure it.

Install mcp-grafana — see mcp/grafana.md in this repo for instructions.

Add the server config to .kiro/settings/mcp.json (or your user-level config).

Reconnect MCP servers from the Kiro MCP panel.

Once the Grafana MCP is connected, try again.

Do not attempt to generate raw JSON without the MCP — the skill relies on live queries to discover metrics, labels, and datasource UIDs.

Dashboard Style Guide — User Preferences

The following conventions are extracted from the user's existing dashboards. All new dashboards and panels MUST follow these rules.

1. Overall Structure

Folder: Place dashboards in the appropriate project folder. Ask the user if unsure.
Tags: Every dashboard MUST have descriptive, lowercase, hyphenated tags (e.g. http, performance, tracking, business, sdk, grpc, kafka). Tags should cover: the domain, the component, and the dashboard purpose.
Project/application identity tags are mandatory on new dashboards: When creating a dashboard, always add the project name and application name as tags, normalized to lowercase hyphenated form. Example: project Ads Delivery Platform -> tag ads-delivery-platform; application tracking-api -> tag tracking-api.
Do not create a new dashboard until those identity tags are known: Infer them from the user's request, folder context, datasource labels, or existing related dashboards. If still unclear, ask the user for the project name and application name before creating the dashboard skeleton.
Description: Every dashboard MUST have a description field — one sentence summarizing what it monitors and for which service.
Default time range: now-6h to now for business/performance dashboards; now-1h to now for operational/real-time dashboards.
Auto-refresh: Set refresh: "30s" only for operational dashboards (queues, pipelines). Omit for analytical dashboards.

2. Template Variables

Always include a cluster variable:
- Type: query
- Datasource: Prometheus (uid M2qX4-R4k)
- Query pattern: label_values(<a_relevant_metric>{...}, cluster)
- includeAll: true, allValue: ".*"
Include an app variable when the dashboard covers multiple apps:
- Query pattern: label_values(<metric>{kubernetes_namespace="<ns>"}, app)
- includeAll: false
Additional filter variables (e.g. method_filter, brand, pod) as custom or query types when needed.
Variable queries use label_values(...) syntax, refresh on dashboard load (refresh: 1).

3. Layout — Row & Panel Organization

The dashboard follows a top-down drill-down pattern:

Overview row (collapsed: false) — a horizontal strip of stat panels showing KPIs at a glance.
Timeseries rows — key trend charts (QPS, latency, error rate) directly below the overview.
Detail / Drill-down rows (collapsed: true) — per-endpoint, per-status, per-dimension breakdowns.
Error Detail row (collapsed: true) — error-specific panels.
Supplementary rows (collapsed: true) — version adoption, runtime health, etc.

Row IDs use round numbers with spacing for future inserts: 100, 200, 300, 400, 500, 600.

CRITICAL: Every Row Must Fill the Full 24-Column Width — No Orphan Panels

Grafana uses a 24-column grid. Every horizontal line of panels MUST sum to exactly 24 columns. Never leave empty space at the end of a row.

The core principle: plan the full row first, then assign widths.

Decide which panels belong on the same horizontal line.
Count them.
Divide 24 by that count and distribute widths so they sum to exactly 24.
If a panel would be alone or leave a gap, widen it to fill the remaining space, or pull it up to join the previous row (adjusting widths of that row to accommodate).

Handling orphan panels (the most common mistake):

An "orphan" is a panel that ends up alone (or with too few siblings) on a row, leaving large empty space. This looks bad:

BAD:  [■■■][                     ]   ← 1 stat alone on a row (w:3, 21 cols wasted)
BAD:  [■■■■■■■■■■■■][            ]   ← 1 timeseries alone (w:12, 12 cols wasted)

Fix orphan panels using one of these strategies (in order of preference):

Widen to fill: A single stat → w:24. A single timeseries → w:24. Two stats → w:12 each. Three stats → w:8 each.
Merge into adjacent row: Move the orphan panel onto the same horizontal line as related panels above or below, and redistribute widths. For example, 1 orphan stat + 2 timeseries → [stat w:4][timeseries w:10][timeseries w:10] = 24.
Pair with a related timeseries: A stat + its corresponding timeseries on the same row → [stat w:4][timeseries w:20] = 24, or [stat w:3][timeseries w:21] = 24.

Never place a panel that occupies less than the full 24 columns on a row by itself.

Stat panel row sizing — pick the width that fills the row:

Stat count per row	Width each	Layout	Total
8	`w: 3`	`3×8`	24 ✓
6	`w: 4`	`4×6`	24 ✓
5	`w: 5,5,5,5,4`	mixed	24 ✓
4	`w: 6`	`6×4`	24 ✓
3	`w: 8`	`8×3`	24 ✓
2	`w: 12`	`12×2`	24 ✓
1	`w: 24`	full-width	24 ✓
7	`w: 3,3,3,3,4,4,4`	mixed	24 ✓

When the count doesn't divide evenly into 24, widen the last few panels to absorb the remainder. For example, 7 stats → four w:3 + three w:4 = 24.

Multi-row stat blocks (e.g. 9+ stats): fill complete rows of 8 (w:3) first, then apply the table above to the remainder row. Example: 9 stats → row 1: eight w:3, row 2: one stat w:3 merged with related timeseries to fill 24, or widened to w:24 if standalone.

Multi-row stat blocks by group (e.g. Outbound API with 12 stats grouped by protocol): arrange as multiple full rows. Example: 6 stats per row × 2 rows, each w:4.

Timeseries panel row sizing:

Panel count per row	Width each	Total
1	`w: 24`	24 ✓
2	`w: 12`	24 ✓
3	`w: 8`	24 ✓

A single timeseries MUST be w:24 — never w:12 alone on a row.

Pie + Timeseries paired rows (used in Validation dashboards):

Each dimension gets a pie chart + timeseries side by side. Two pairs per row: [pie w:4] [timeseries w:8] [pie w:4] [timeseries w:8] = 24 ✓

One pair alone on a row: [pie w:6] [timeseries w:18] = 24 ✓ (widen to fill)

Stat + Timeseries mixed rows (used in Pipeline dashboards):

When combining stat cards with a large timeseries in the same row, ensure they tile to 24: [stat w:4] [stat w:3] [timeseries w:17] = 24 ✓

Bar gauge / Bar chart panels: Use w:12 (half, paired with another panel) or w:24 (full) to fill the row.

Self-check before finalizing any dashboard: Scan every y coordinate. For each unique y, sum the w values of all panels at that y. If any sum is less than 24, fix it before saving.

CRITICAL: No Uncovered Vertical Space Next to a Taller Panel

When panels on the same row have different heights, the shorter panel(s) must collectively fill the full column height — otherwise a blank gap appears below them.

BAD:  [stat h:4          ][timeseries h:8]
      [     gap h:4       ]
      ← stat leaves 4 rows of blank space below it

GOOD: [stat h:8          ][timeseries h:8]   ← single stat stretched to match
GOOD: [stat h:4][stat h:4][timeseries h:8]   ← two stats stacked (y and y+4) fill h:8 together

Rule: For any column of panels adjacent to a taller panel, the sum of h values in that column MUST equal the height of the tallest panel in the row.

Concretely, when pairing stat(s) with a timeseries (h:8):

1 stat alone → set h:8
2 stats stacked → each h:4 (4+4=8 ✓)
3 stats stacked → not recommended; use h:3,h:3,h:2 or restructure

4. Stat Panels (KPI Cards)

Panel type: stat
Color mode: background_solid
Graph mode: none
Reduce calc: lastNotNull
Height: h: 4 always.
Width: Determined by the row-fill rule above — typically w: 3 (8 per row), w: 4 (6 per row), or w: 5 (≈5 per row with mixed widths). Choose the width that makes the row sum to exactly 24.
Every stat panel MUST have a description explaining what it shows and (for threshold-based ones) the threshold meaning.

Standard Threshold Conventions

Metric Type	Unit	Thresholds (color steps)
Count / Total	`short`	Single step: `blue` at base
Success Rate	`percentunit`	`green` (base) → `yellow` at 0.95 → `red` below 0.95 (inverted: higher is better)
Error Rate	`percentunit`	`green` (base) → `orange` at 0.01 → `red` at 0.05
Mean Latency	`ms`	`green` (base) → `yellow` at 50 → `red` at 100
P95 Latency	`ms`	`green` (base) → `yellow` at 100 → `red` at 200
Kafka Success Rate	`percentunit`	`green` (base) → `yellow` at 0.99 → `red` below 0.99
Flagged / Blocked %	`percentunit`	`green` (base) → `orange` at 0.05 → `red` at 0.10
Queue depth / Pending	`short`	`green` (base) → `yellow` at warning → `red` at critical (context-dependent)

5. Timeseries Panels

Height: h: 8 always.
Width: Must fill the full 24-column row — w: 12 (2 per row), w: 8 (3 per row), or w: 24 (full-width). Never use a width that leaves a gap.
Draw style: line
Line interpolation: smooth
Line width: 1
Fill opacity: 15
Gradient mode: opacity
Point size: 5, show points: auto
Stacking: none (unless explicitly showing stacked throughput by status)
Tooltip: multi with sort desc for multi-query panels; single for single-query panels.
Legend: list placement bottom. Include calcs: ["max", "mean"] for latency panels; omit calcs for throughput panels.
Scale: Use symlog (log base 10) for latency panels to handle spikes. Use linear for throughput/QPS panels.

Latency Panels

Show P99, P95, P90, P50 as separate queries in one panel.
Use histogram_quantile() with sum(rate(..._bucket{...}[1m])) by (le).
Multiply by 1000 to convert seconds → milliseconds. Unit: ms.
Use overrides with byRegexp matcher to set display names: P99, P95, P90, P50.

Throughput Panels

Use sum(rate(..._count{...}[1m])) for QPS timeseries.
Use sum(rate(...[1m])) by (status) to break down by status.
Rate interval: 1m for timeseries panels, 5m for stat panel averages.

6. Pie Chart Panels

Used for percentage distribution breakdowns (e.g. "Issues — % by Issue Type").
Always paired with a corresponding timeseries "Rate by ..." panel.
Query uses sum(increase(...[$__range])) by (dimension).
Title pattern: <Metric> — % by <Dimension>.

7. Bar Gauge / Bar Chart Panels

Used for Top N rankings (e.g. "Top 10 Slowest Endpoints", "Top 10 Issue Types").
Query uses topk(N, ...).
Used for categorical breakdowns (e.g. "Failure Codes (1h)") with sum by (code) (increase(...[1h])).

8. PromQL Conventions

Always filter by cluster=~"$cluster" in every query.
Always filter by app="$app" when the app variable exists.
For stat panels showing totals over the selected range: use sum(increase(...[$__range])).
For stat panels showing averages: use avg_over_time(sum(rate(...[5m]))[$__range]).
For stat panels showing ratios: use sum(increase(...{success}[$__range])) / clamp_min(sum(increase(...[$__range])), 1).
- Always use clamp_min(..., 1) on denominators to avoid division by zero. Exception: when the denominator is guaranteed non-zero (e.g. total requests in a high-traffic service), a plain division is acceptable.
For timeseries rate panels: use $__rate_interval or 1m as the rate window.
Latency from histogram: histogram_quantile(0.XX, sum(rate(..._bucket{...}[interval])) by (le)) * 1000.
Mean latency: sum(rate(..._sum{...}[interval])) / sum(rate(..._count{...}[interval])) * 1000.
Metric naming convention observed: <namespace>_<component>_<metric_name>_<unit> (e.g. as_sdk_tracking_sdk_tracking_request_seconds_count).

9. Panel Descriptions

Every panel MUST have a description field. Follow this pattern:

For stat panels: State what the number represents, and if thresholds apply, describe them inline.
- Example: "Average request latency across all endpoints over the selected time range. Green < 50ms, yellow < 100ms, red >= 100ms."
For timeseries panels: State what the chart shows and how it's computed.
- Example: "Aggregate HTTP request throughput across all endpoints, computed as a 1-minute rate."
For pie/bar panels: State the dimension being broken down.
- Example: "Top Pixel validation issue types over the selected time range."

10. Panel Title Naming Conventions

Stat panels: Short noun phrases — Total Events, Mean QPS, Success Rate, P95 Latency, Error Rate.
Timeseries panels: Descriptive — Event Throughput by Status (QPS), Request Latency, Non-Success Request Rate.
Paired pie+timeseries: <Subject> — % by <Dim> and <Subject> — Rate by <Dim>.
Row titles: Functional groupings — Overview, Detail, Error Detail, Queue & Throughput, Latency & Audit Quality, Runtime Health.

11. Dashboard Patterns by Category

HTTP Service Performance Dashboard

Overview stats: Total Requests → Mean QPS → Error Rate → Active Requests → Mean Latency → P95 → P90 → P50. Timeseries: QPS → Latency Percentiles → Top 10 Slowest Endpoints → Error Rate → Volume by Endpoint. Drill-down rows: Detailed URI Performance, Error Details.

Business Metrics Dashboard

Overview stats: Total Events → Mean QPS → Success Rate → Mean Latency → P95 Latency → HTTP Error Rate → Kafka Success Rate → Flagged %. Timeseries: Event Throughput by Status → Avg Latency. Drill-down rows: Regional Breakdown, Detail, Error Detail, SDK Version Adoption.

Outbound API Performance Dashboard

Overview stats: grouped by protocol (gRPC / HTTP Client / Feign) — each group has Total Calls → Success Rate → QPS → Mean Latency. Detail rows: one per protocol + Kafka Producer.

Pipeline / Queue Dashboard

Overview stats: queue depths (Pending, Processing, Dispatched) → resource counts (Engines Registered, Available) → throughput stats (Request Rate, Success Rate, Avg Latency). Timeseries: Pipeline State Over Time (multiple gauge queries overlaid). Detail rows: one per pipeline stage (Ingestion, Dispatch, Engine Pool, Result Processing, Callback, System Health).

Validation / Quality Dashboard

Overview stats: Total Events → Invalid Ratio → Rejected % → Flagged %. Sections per event source (SDK / Pixel): Total Issues → Avg Rate → Top 10 bar gauge → paired (pie + timeseries) for each dimension.

12. Incremental Dashboard Building — Avoid Large JSON

NEVER send a full dashboard JSON in a single update_dashboard call unless creating a brand-new empty dashboard skeleton. Large JSON payloads are expensive on context, error-prone, and hard to debug.

Instead, use incremental patch operations to build and modify dashboards piece by piece:

Creating a New Dashboard

Create a minimal skeleton first — just the title, description, tags, variables, and time range. No panels.
Confirm the skeleton tags include both normalized identity tags: <project-name> and <application-name>.
Append panels one row at a time using patch add operations.

Step 1: collect or infer the project name and application name, normalize them to lowercase hyphenated tags
Step 2: update_dashboard with minimal dashboard JSON (no panels, just metadata + templating), ensuring `tags` includes both identity tags plus any domain/purpose tags
Step 3: update_dashboard with uid + operations: [{"op": "add", "path": "$.panels/- ", "value": <row panel>}]
Step 4: update_dashboard with uid + operations: [{"op": "add", "path": "$.panels/- ", "value": <stat panel 1>}]
...repeat for each panel or small batch of panels

Modifying an Existing Dashboard

Use uid + operations (patch mode) to make targeted changes:

Task	Patch operation
Change a panel title	`{"op": "replace", "path": "$.panels[2].title", "value": "New Title"}`
Update a query	`{"op": "replace", "path": "$.panels[2].targets[0].expr", "value": "new_promql{...}"}`
Add a panel at the end	`{"op": "add", "path": "$.panels/- ", "value": <panel JSON>}`
Remove a panel	`{"op": "remove", "path": "$.panels[5]"}`
Update dashboard description	`{"op": "replace", "path": "$.description", "value": "..."}`
Change time range	`{"op": "replace", "path": "$.time", "value": {"from": "now-6h", "to": "now"}}`

Rules

Prefer multiple small patch calls over one large JSON call. Each call should touch at most a few panels.
When adding multiple panels, batch them into logical groups (e.g. all stat panels for one overview row in one call, then the timeseries row in the next call).
Always use get_dashboard_summary or get_dashboard_property to inspect the current state before patching — don't guess panel indices.
When removing multiple panels, the MCP auto-reorders indices to avoid shift issues, but still verify with get_dashboard_summary after bulk removes.

Workflow

Understand the requirement: Ask the user what project and application the dashboard belongs to, what service/component it is for, what metrics are available, and what the monitoring goal is.
Discover metrics: Use list_prometheus_metric_names and list_prometheus_label_values to find available metrics and labels for the target service.
Pick a dashboard pattern from the categories above (or combine patterns).
Build the required tag set: Normalize the project name and application name into lowercase hyphenated tags, then add any extra domain/purpose tags needed for discoverability.
Create the dashboard skeleton: Use update_dashboard with a minimal JSON — title, description, tags, templating variables, time range. No panels yet. The tags array must include both the project tag and the application tag.
Add panels incrementally: Use update_dashboard with uid + operations (patch mode) to append panels row by row — overview stats first, then timeseries, then drill-down rows.
Verify: Use get_dashboard_summary to confirm the dashboard structure is correct, including the required identity tags.
Share: Use generate_deeplink to give the user a direct link.

grafana-dashboard

More from this repository

More from this repository

Grafana Dashboard Skill

When to Activate

Pre-Flight Check — Grafana MCP Required

Dashboard Style Guide — User Preferences

1. Overall Structure

2. Template Variables

3. Layout — Row & Panel Organization

CRITICAL: Every Row Must Fill the Full 24-Column Width — No Orphan Panels

CRITICAL: No Uncovered Vertical Space Next to a Taller Panel

4. Stat Panels (KPI Cards)

Standard Threshold Conventions

5. Timeseries Panels

Latency Panels

Throughput Panels

6. Pie Chart Panels

7. Bar Gauge / Bar Chart Panels

8. PromQL Conventions

9. Panel Descriptions

10. Panel Title Naming Conventions

11. Dashboard Patterns by Category

HTTP Service Performance Dashboard

Business Metrics Dashboard

Outbound API Performance Dashboard

Pipeline / Queue Dashboard

Validation / Quality Dashboard

12. Incremental Dashboard Building — Avoid Large JSON

Creating a New Dashboard

Modifying an Existing Dashboard

Rules

Workflow

Grafana Dashboard Skill

When to Activate

Pre-Flight Check — Grafana MCP Required

Dashboard Style Guide — User Preferences

1. Overall Structure

2. Template Variables

3. Layout — Row & Panel Organization

CRITICAL: Every Row Must Fill the Full 24-Column Width — No Orphan Panels

CRITICAL: No Uncovered Vertical Space Next to a Taller Panel

4. Stat Panels (KPI Cards)

Standard Threshold Conventions

5. Timeseries Panels

Latency Panels

Throughput Panels

6. Pie Chart Panels

7. Bar Gauge / Bar Chart Panels

8. PromQL Conventions

9. Panel Descriptions

10. Panel Title Naming Conventions

11. Dashboard Patterns by Category

HTTP Service Performance Dashboard

Business Metrics Dashboard

Outbound API Performance Dashboard

Pipeline / Queue Dashboard

Validation / Quality Dashboard

12. Incremental Dashboard Building — Avoid Large JSON

Creating a New Dashboard

Modifying an Existing Dashboard

Rules

Workflow