Name: Dashboards As Code
Author: MaterializeInc

Description: Use this skill when building, modifying, reviewing, or pushing Grafana dashboards under `packages/grafana-dashboards/` (Materialize observability dashboards generated from Python via `grafana-foundation-sdk` and `py-mzmon-lib`). Also use it when writing panel descriptions for those dashboards, picking palettes, or working through Materialize-specific PromQL patterns (cluster/replica filtering, peek latency, source/sink metrics, label-family quirks).

Updated: May 23, 2026 at 23:16

Repository: MaterializeInc/materialize-monitoring

Dashboards as Code

This skill is the entry point for the Materialize dashboards-as-code project. Stable conventions live in the repo docsite under docs/content/reference/internal/dashboard/ — this file is intentionally slim and links into the docsite at heading-level granularity. The non-link content below is the state snapshot: what currently exists, what's in flight, and what's queued for cleanup.

Audience reminder

The dashboards themselves target Materialize end users: database-literate operators with basic graph-reading fluency but minimal cloud / Kubernetes / observability expertise. SQL is fair game; jargon like "differential dataflow's arrangement" needs a one-liner explanation. Panel descriptions, titles, and cluster names should respect that baseline.

The docsite reference pages target repo contributors (SRE, Field Engineering, CloudOps, Database Engineers) and AI agents reading this skill.

Where to find what

| Looking for… | Read |
|---|---|
| Grafana target versions, Dashboard v1/v2 schema state, SDK choices | SDKs and Schemas |
| Code structure, UID conventions, push process, `gcx dashboards update` vs ad-hoc v2 API | Generating and Pushing Dashboards |
| Palettes, layouts, panel visualization, panel description voice, PromQL conventions, label families, metric quirks, PromQL recipes, module-level constants table | Style Guidelines |
| Testing conventions (currently sparse) | Testing |

Frequently needed deep links into the Style Guidelines:

And into Generating:

PUT body shape — required Kubernetes-style envelope when pushing v2 dashboards via grafana_api_request
Service account permissions — decoding 403s
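
As a rough illustration of what a Kubernetes-style envelope means here, the sketch below wraps a dashboard spec in top-level `apiVersion` / `kind` / `metadata` / `spec` keys. The `apiVersion` string, the use of `metadata.name` for the UID, the example title, and the helper name `wrap_v2_dashboard` are all assumptions made for illustration; the PUT body shape section of the docsite remains the authority on the exact shape.

```python
import json
from typing import Any


def wrap_v2_dashboard(
    uid: str,
    spec: dict[str, Any],
    api_version: str = "dashboard.grafana.app/v2beta1",  # assumed value
) -> dict[str, Any]:
    """Wrap a dashboard spec in a Kubernetes-style resource envelope.

    Assumption: the API expects apiVersion/kind/metadata/spec at the top
    level, with the dashboard UID carried in metadata.name.
    """
    if not uid:
        raise ValueError("uid must be non-empty")
    return {
        "apiVersion": api_version,
        "kind": "Dashboard",
        "metadata": {"name": uid},
        "spec": spec,
    }


# Hypothetical spec content; a real spec comes from the dashboard generator.
body = wrap_v2_dashboard("mz-mon-env-top", {"title": "Environment Overview"})
print(json.dumps(body, indent=2))
```

Keeping the wrapping in one small function means the generated spec never needs to know which API version it will be pushed through.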

Schema reference files

When uncertain about the exact shape Grafana expects, the cog-generated openapi schemas are bundled here:

references/dashboard.openapi.json — v1
references/dashboardv2beta1.openapi.json — v2beta1
references/dashboardv2.openapi.json — v2

All three were generated from cog at commit 61ff0a6055fa48f0c7b105fe4a37af637191314f (April 9, 2026).
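
One way to consult these files without reading them end to end is to look up a single component schema and list its required properties. The sketch below assumes the files follow the usual OpenAPI 3 layout (`components.schemas`); the schema name `Example` and both helper names are hypothetical, and an in-memory stand-in replaces the real file so the snippet runs anywhere.

```python
import json
from pathlib import Path
from typing import Any


def load_schema_doc(path: str) -> dict[str, Any]:
    """Parse one of the bundled OpenAPI JSON files."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def required_fields(doc: dict[str, Any], schema_name: str) -> list[str]:
    """Return the `required` property names of a named component schema.

    Assumption: schemas live under components.schemas (OpenAPI 3 layout).
    """
    schemas = doc.get("components", {}).get("schemas", {})
    if schema_name not in schemas:
        raise KeyError(f"{schema_name!r} not found among {len(schemas)} schemas")
    return list(schemas[schema_name].get("required", []))


# In-memory stand-in for a real file such as references/dashboardv2.openapi.json.
demo_doc = {
    "components": {
        "schemas": {
            "Example": {"type": "object", "required": ["title", "kind"]},
        }
    }
}
print(required_fields(demo_doc, "Example"))
```

With a real checkout, `load_schema_doc("references/dashboardv2.openapi.json")` would supply the document instead of `demo_doc`.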

Current Dashboard State

This section captures the live state of the dashboards in this repo so the next session has something concrete to start from. Update it when state changes meaningfully (new dashboard, new tab, retired panel, theme reassignment).

Dashboard inventory

| Family | Dashboard module | Class | Live UID |
|---|---|---|---|
| `mz_environment` | `overview.overview_dashboard` | `EnvironmentOverviewDashboard` | Auto-assigned at first upload. The codified UID is `mz-mon-env-top`, but the live one diverged before that became authoritative (see UID selection and behavior). |

The mz_environment/overview dashboard has six tabs, in declared order:

| # | Tab title | Module | Theme |
|---|---|---|---|
| 1 | Summary | `summary.py` | (no unique theme; uses health palette and themes from imports) |
| 2 | Kubernetes Workloads | `k8s_resources.py` | `K8S_THEME` = `palette.THEME_PALETTE[0]` (blue) |
| 3 | Cluster Objects / Replicas | `cluster_objects.py` | `CLUSTERS_THEME` = `palette.THEME_PALETTE[2]` (teal) |
| 4 | Connections / Activity | `connections_activity.py` | `CONNECTIONS_THEME` = `palette.THEME_PALETTE[1]` (cyan) |
| 5 | Compute Objects | `compute_objects.py` | `COMPUTE_THEME` = `palette.THEME_PALETTE[3]` (orange) |
| 6 | Storage Objects | `storage_objects.py` | `STORAGE_THEME` = `palette.THEME_PALETTE[4]` (yellow) |
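
Because each themed tab claims its own palette slot, a theme reassignment can be sanity-checked mechanically. The sketch below is one possible check, not code from the repo: `THEME_PALETTE` is a stand-in that holds color names instead of the real palette values, its length of five is assumed, and `check_unique_themes` is a hypothetical helper. The indices mirror the table above.

```python
# Stand-in for palette.THEME_PALETTE; the real entries are palette values,
# not these names, and the real list may be longer (assumption).
THEME_PALETTE = ["blue", "cyan", "teal", "orange", "yellow"]

# Module -> palette index, copied from the tab table.
TAB_THEME_INDEX = {
    "k8s_resources.py": 0,
    "connections_activity.py": 1,
    "cluster_objects.py": 2,
    "compute_objects.py": 3,
    "storage_objects.py": 4,
}


def check_unique_themes(assignments: dict[str, int], palette_size: int) -> None:
    """Raise ValueError if an index is out of range or shared by two tabs."""
    seen: dict[int, str] = {}
    for module, idx in assignments.items():
        if not 0 <= idx < palette_size:
            raise ValueError(f"{module}: index {idx} is outside the palette")
        if idx in seen:
            raise ValueError(f"{module} and {seen[idx]} share index {idx}")
        seen[idx] = module


check_unique_themes(TAB_THEME_INDEX, len(THEME_PALETTE))
for module, idx in TAB_THEME_INDEX.items():
    print(f"{module}: THEME_PALETTE[{idx}] ({THEME_PALETTE[idx]})")
```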

The Summary tab re-uses the KubeResourcesMixin's cpu_total_panel and memory_totals_panel, and also mirrors add_currently_hydrating_panel(...) from compute_objects.py in its Environment Health row.

Tab-by-tab row structure

Summary

Environment Health — Environment Status, Availability, Last Restart, Currently Hydrating (mirror), Current CPU Usage, Current Memory Usage
Environment Info — Materialize Version, Total CPU Capacity, Total Memory

Kubernetes Workloads

Resources Summary — Total CPU Capacity, Total Memory (includes monitoring)
Workload Readiness — Pod Readiness, StatefulSet Readiness, Deployment Readiness
Pod Metrics — Pod CPU Usage, Pod Memory Usage
Pod Networking — Rx, Tx, Errors, Packet Drops

Cluster Objects / Replicas

Cluster Summary — Cluster Count, Replica Count
Replication / Availability — Replica Sizes (donut), Replica AZs
Cluster Information — Cluster Information table

Connections / Activity

Connection Summary — Active Sessions, Active Queries, Adapter Command Rate
Queries — Distribution donut, Query Rate, Peek Latency p50/p90/p99 (3 separate panels)
Adapter Commands — Adapter Commands by Application table
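
The three peek latency panels differ only in the quantile, so their queries can come from one builder. The sketch below uses the standard PromQL `histogram_quantile` over a rate of histogram buckets; the metric name `example_peek_seconds_bucket` and the helper `latency_quantile_expr` are placeholders, not names from this repo. The PromQL recipes in the Style Guidelines describe the expressions the dashboards actually use.

```python
def latency_quantile_expr(
    bucket_metric: str,
    quantile: float,
    matchers: str = "",
    window: str = "$__rate_interval",
) -> str:
    """Build a histogram_quantile expression over a Prometheus histogram.

    bucket_metric must be the `_bucket` series of a histogram; matchers is an
    optional, already-formatted label matcher list such as 'cluster="c1"'.
    """
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must be strictly between 0 and 1")
    selector = f"{bucket_metric}{{{matchers}}}" if matchers else bucket_metric
    return (
        f"histogram_quantile({quantile}, "
        f"sum by (le) (rate({selector}[{window}])))"
    )


# Placeholder metric name; substitute the real peek latency histogram.
for q in (0.5, 0.9, 0.99):
    print(latency_quantile_expr("example_peek_seconds_bucket", q))
```

Aggregating with `sum by (le)` before taking the quantile merges all series into one distribution; keep extra labels in the `by` clause if a per-cluster or per-replica breakdown is wanted.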

Compute Objects

Compute Objects Summary — Active MV, Active Indexes, Active Views, Active Subscribes (donut), Index Types (donut)
Freshness — STUB row, no panels yet (placeholder title only)
Hydration — Currently Hydrating, Hydration Queue Size, Slowest Hydrating Collections (top-15 horizontal bar)
Dataflows — Dataflow Count, Dataflow Count (per worker), Dataflow Elapsed Rate (log scale)
Arrangements — Arrangement Rate, Arrangement Rate (per worker), 3 record-count tables (System / User / Transient)

Storage Objects

Storage Objects Summary — Active Sources, Active Sinks, Active Tables
Sources — Source Types donut, Sources by Status table, Source Bytes Received (rate)
Sinks — Sink Types donut, Sink Throughput, Sink Lag (staged minus committed)
Iceberg Sinks (collapsed by default) — Commit Latency p50/p90/p99, Commit Failures & Conflicts, File & Snapshot Rate
Kafka Sinks (collapsed by default) — TX Error Rate, Output Buffer, Connect / Disconnect Rate

Known stubs and orphans

compute_objects.py Freshness row — title-only, reserved for end-to-end freshness/lag metrics. Pick a freshness signal (mz_internal.mz_materialized_view_refreshes?) when filling it in.
dataflows.py: orphaned after Dataflows became a row inside Compute Objects rather than its own tab. Safe to delete; its only reference was an import in overview_dashboard.py, which has since been removed.

Reference environments

Materialize developers may have access to an internal shared Grafana with multiple test environments. It can be useful to look at queries in live environments when building dashboards. Do not use environments without explicit permission.

Always scope investigative queries with materialize_cloud_organization_id="..." when testing — these are shared envs and you don't want to mix data across them.
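
A small helper can make the organization scope hard to forget when composing ad-hoc selectors. The sketch below is illustrative only: `scoped_selector`, the metric name `example_metric`, the extra label `cluster_name`, and the ID `org-123` are placeholders. Only the label name `materialize_cloud_organization_id` comes from the guidance above.

```python
def scoped_selector(metric: str, org_id: str, **extra_labels: str) -> str:
    """Build a PromQL selector that always carries the organization matcher."""
    if not org_id:
        raise ValueError("org_id is required when querying shared environments")
    labels = {"materialize_cloud_organization_id": org_id, **extra_labels}
    parts = []
    for name, value in labels.items():
        # Escape backslashes and double quotes so the value stays one string.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'{name}="{escaped}"')
    joined = ", ".join(parts)
    return f"{metric}{{{joined}}}"


# Placeholder metric, label, and organization ID.
print(scoped_selector("example_metric", "org-123", cluster_name="prod"))
```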

Cleanup / refactor candidates

Tracked items that are working but could be tidier:

ENV_SCOPED_NOTE is duplicated in compute_objects.py and storage_objects.py. Lift to visualization.py (or a sibling _messages.py if it grows).
_COMPUTE_FILTER and _ARRANGEMENT_FILTER are the same string in two modules. Lift to a shared place; rename to something neutral like _LONGFORM_CLUSTER_FILTER.
dataflows.py is orphaned. Safe to rm.
The Compute Objects "Freshness" row is a title-only stub. Pick a freshness signal and fill it in (mz_materialized_view_lag_seconds in newer Materialize versions, or a derived metric from frontier metrics).
The mz-mon- prefix isn't enforced in MzDashboard.UID values today: the class declares UID = "env-top" and MzDashboard.__init__ prepends the prefix. That convention holds across all current dashboards (there is only one), but a validator is worth adding if more dashboards land.
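
The UID validator mentioned in the last item could be sketched with `__init_subclass__`, so that a bad UID fails at class definition time, before any push. Everything here is hypothetical: `MzDashboardSketch` models only the UID handling described above, and the lowercase kebab-case rule is an assumed convention, not one stated in the repo.

```python
import re

UID_PREFIX = "mz-mon-"
# Assumed convention: lowercase alphanumeric words joined by single hyphens.
_UID_SUFFIX_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


class MzDashboardSketch:
    """Hypothetical stand-in for MzDashboard; only UID handling is modeled."""

    UID: str = ""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Subclasses declare the bare suffix; the prefix is added in __init__.
        if cls.UID.startswith(UID_PREFIX):
            raise ValueError(f"{cls.__name__}.UID must omit the {UID_PREFIX!r} prefix")
        if not _UID_SUFFIX_RE.match(cls.UID):
            raise ValueError(f"{cls.__name__}.UID {cls.UID!r} is not lowercase kebab-case")

    def __init__(self) -> None:
        self.uid = UID_PREFIX + self.UID


class EnvironmentOverviewSketch(MzDashboardSketch):
    UID = "env-top"


print(EnvironmentOverviewSketch().uid)  # mz-mon-env-top
```

Rejecting a UID that already carries the prefix guards against the double-prefixed `mz-mon-mz-mon-...` value that would otherwise result silently.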
