whizard-telemetry-ruler
| name | whizard-telemetry-ruler |
| description | Use when working with WizTelemetry Ruler extension for KubeSphere, including installation, configuration, alerting rules management |
WizTelemetry Ruler is an extension component of the KubeSphere Observability Platform that provides event alerting and log alerting. It lets you define alerting rules for K8s native events, K8s/KubeSphere auditing events, and K8s logs, evaluates incoming event and log data against those rules, and sends alerts to the configured receivers, such as Alertmanager.
| Component | Description | Default Enabled |
|---|---|---|
| whizard-telemetry-ruler | Core ruler component for alerting | true |
REQUIRED: Complete all steps in order before generating InstallPlan.
⚠️ CRITICAL: DO NOT proceed until target clusters are determined.
Step 1.1: Get available clusters
kubectl get clusters -o jsonpath='{.items[*].metadata.name}'
Step 1.2: Determine target clusters
Ask user (if not specified):
Available clusters: host, dev
Which clusters do you want to deploy WizTelemetry Ruler to?
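The selection step above can be sketched in Python. This is only an illustration: `parse_clusters` and `validate_selection` are hypothetical helper names, and the input string is assumed to be the space-separated output of the jsonpath query in Step 1.1.

```python
def parse_clusters(jsonpath_output: str) -> list[str]:
    """Split the space-separated cluster names printed by the jsonpath query."""
    return jsonpath_output.split()


def validate_selection(available: list[str], selected: list[str]) -> list[str]:
    """Return the selected clusters, refusing empty or unknown selections."""
    if not selected:
        raise ValueError("no target clusters selected; ask the user before continuing")
    unknown = [name for name in selected if name not in available]
    if unknown:
        raise ValueError(f"unknown clusters: {', '.join(unknown)}")
    return selected


available = parse_clusters("host dev")
targets = validate_selection(available, ["host"])
```

Rejecting an empty selection mirrors the rule that the InstallPlan must not be generated until the target clusters are determined.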
MUST do this to get the latest version:
kubectl get extensionversions -n kubesphere-system -l kubesphere.io/extension-ref=whizard-telemetry-ruler -o jsonpath='{range .items[*]}{.spec.version}{"\n"}{end}' | sort -V | tail -1
This outputs the latest version (e.g., 1.5.0). Note this down - you'll use it in the InstallPlan.
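If the version list is processed in a script rather than with `sort -V | tail -1`, the same selection can be approximated as follows. This is a sketch under the assumption that versions are plain dotted numbers such as 1.5.0; pre-release suffixes are not ordered the way `sort -V` orders them, and `latest_version` is a hypothetical helper name.

```python
import re


def version_key(version: str) -> tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically, not as text."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def latest_version(versions: list[str]) -> str:
    """Pick the highest version from the ExtensionVersion list."""
    if not versions:
        raise ValueError("no ExtensionVersion found for whizard-telemetry-ruler")
    return max(versions, key=version_key)


print(latest_version(["1.4.0", "1.10.0", "1.5.0"]))
```

Comparing integer tuples is what keeps 1.10.0 ahead of 1.5.0, which a plain string sort would get wrong.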
Only perform this step if you need to configure sink for alert notifications.
The AlertManager proxy service (alertmanager-proxy) is deployed in the host cluster and exposed via NodePort (default port: 31093).
Step 3.1: Get a host node IP
kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}'
Step 3.2: Confirm with user
Ask user to confirm the AlertManager host IP:
Detected AlertManager host: <NODE_IP>
Detected AlertManager port: 31093
Alert URL: http://<NODE_IP>:31093/api/v1/alerts
Do you want to use this URL for alert notifications?
Use http://<NODE_IP>:31093/api/v1/alerts as the sink URL.
Note: If using WizTelemetry Notification extension, ensure it is installed before installing WizTelemetry Ruler.
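As a rough sketch of how the alert URL from Step 3 is assembled, the helper below builds it from a node IP and the default NodePort 31093. `alert_url` is a hypothetical name; the IPv6 bracket handling is an addition that the steps above do not cover.

```python
import ipaddress


def alert_url(host: str, port: int = 31093) -> str:
    """Build the alertmanager-proxy URL from a node IP and NodePort."""
    ip = ipaddress.ip_address(host)  # raises ValueError for anything that is not an IP
    literal = f"[{ip}]" if ip.version == 6 else str(ip)
    return f"http://{literal}:{port}/api/v1/alerts"


print(alert_url("192.168.0.10"))
```

Because `ipaddress.ip_address` rejects non-IP input, passing an unreplaced placeholder such as `<NODE_IP>` fails early instead of ending up in the InstallPlan.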
⚠️ IMPORTANT: Complete prerequisite steps BEFORE this step.
Based on your selections:
⚠️ CRITICAL: InstallPlan metadata.name MUST be whizard-telemetry-ruler. DO NOT use any other name.
⚠️ CRITICAL: config field is YAML format. You MUST:
⚠️ CRITICAL: All placeholders MUST be replaced with actual values. DO NOT leave them as placeholders.
apiVersion: kubesphere.io/v1alpha1
kind: InstallPlan
metadata:
  name: whizard-telemetry-ruler
  namespace: kubesphere-system
spec:
  extension:
    name: whizard-telemetry-ruler
    version: <VERSION> # From Step 2
  enabled: true
  upgradeStrategy: Manual
  config: |
    whizard-telemetry-ruler:
      config:
        sinks:
          - name: alertmanager
            type: webhook
            config:
              url: http://<ALERT_MANAGER_HOST>:31093/api/v1/alerts # From Step 3
  clusterScheduling:
    placement:
      clusters:
        - <TARGET_CLUSTERS>
apiVersion: kubesphere.io/v1alpha1
kind: InstallPlan
metadata:
  name: whizard-telemetry-ruler
  namespace: kubesphere-system
spec:
  extension:
    name: whizard-telemetry-ruler
    version: <VERSION> # From Step 2
  enabled: true
  upgradeStrategy: Manual
  config: |
    whizard-telemetry-ruler:
      auditingAlerting:
        enabled: true
      eventsAlerting:
        enabled: true
      loggingAlerting:
        enabled: false
      config:
        sinks:
          - name: alertmanager
            type: webhook
            config:
              url: http://<ALERT_MANAGER_HOST>:31093/api/v1/alerts # From Step 3
  clusterScheduling:
    placement:
      clusters:
        - <TARGET_CLUSTERS>
Replace placeholders:
- <VERSION>: From Step 2 (e.g., 1.5.0)
- <TARGET_CLUSTERS>: User-confirmed cluster names
- <ALERT_MANAGER_HOST>: From Step 3 (auto-detected or user-confirmed node IP)

apiVersion: kubesphere.io/v1alpha1
kind: InstallPlan
metadata:
  name: whizard-telemetry-ruler
  namespace: kubesphere-system
spec:
  extension:
    name: whizard-telemetry-ruler
    version: <VERSION> # From Step 2
  enabled: true
  upgradeStrategy: Manual
  config: |
    whizard-telemetry-ruler:
      auditingAlerting:
        enabled: true
      eventsAlerting:
        enabled: true
      loggingAlerting:
        enabled: true
      config:
        sinks:
          - name: alertmanager
            type: webhook
            config:
              url: http://<ALERT_MANAGER_HOST>:31093/api/v1/alerts # From Step 3
  clusterScheduling:
    placement:
      clusters:
        - <TARGET_CLUSTERS>
apiVersion: kubesphere.io/v1alpha1
kind: InstallPlan
metadata:
  name: whizard-telemetry-ruler
  namespace: kubesphere-system
spec:
  extension:
    name: whizard-telemetry-ruler
    version: <VERSION> # From Step 2
  enabled: true
  upgradeStrategy: Manual
  config: |
    global:
      alertingPersistence:
        enabled: true
    whizard-telemetry-ruler:
      config:
        sinks:
          - name: alertmanager
            type: webhook
            config:
              url: http://<ALERT_MANAGER_HOST>:31093/api/v1/alerts # From Step 3
    alerting-persistence:
      sinks:
        opensearch:
          enabled: true
  clusterScheduling:
    placement:
      clusters:
        - <TARGET_CLUSTERS>
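The placeholder rules above (fixed `metadata.name`, every placeholder replaced) can be checked mechanically. The sketch below renders the basic InstallPlan from a text template and refuses to return it while any `<UPPER_CASE>` placeholder remains. `render_install_plan` is a hypothetical helper, and the YAML nesting in the template is an assumption reconstructed from the field names.

```python
import re

# Assumed nesting: reconstructed from the field names used in this document.
TEMPLATE = """\
apiVersion: kubesphere.io/v1alpha1
kind: InstallPlan
metadata:
  name: whizard-telemetry-ruler
  namespace: kubesphere-system
spec:
  extension:
    name: whizard-telemetry-ruler
    version: <VERSION>
  enabled: true
  upgradeStrategy: Manual
  config: |
    whizard-telemetry-ruler:
      config:
        sinks:
          - name: alertmanager
            type: webhook
            config:
              url: http://<ALERT_MANAGER_HOST>:31093/api/v1/alerts
  clusterScheduling:
    placement:
      clusters:
<TARGET_CLUSTERS>
"""


def render_install_plan(version: str, host: str, clusters: list[str]) -> str:
    """Fill in the template and fail if any placeholder survives."""
    if not clusters:
        raise ValueError("at least one target cluster is required")
    cluster_lines = "\n".join(f"        - {name}" for name in clusters)
    manifest = (
        TEMPLATE.replace("<VERSION>", version)
        .replace("<ALERT_MANAGER_HOST>", host)
        .replace("<TARGET_CLUSTERS>", cluster_lines)
    )
    leftover = re.findall(r"<[A-Z_]+>", manifest)
    if leftover:
        raise ValueError(f"unreplaced placeholders: {leftover}")
    return manifest


print(render_install_plan("1.5.0", "192.168.0.10", ["host", "dev"]))
```

The rendered text could then be piped to `kubectl apply -f -`; the template deliberately hard-codes `metadata.name` so it cannot drift from `whizard-telemetry-ruler`.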
| Parameter | Type | Default | Description |
|---|---|---|---|
| whizard-telemetry-ruler.auditingAlerting.enabled | bool | true | Enable auditing alert |
| whizard-telemetry-ruler.eventsAlerting.enabled | bool | true | Enable events alert |
| whizard-telemetry-ruler.loggingAlerting.enabled | bool | false | Enable log alert |
| Parameter | Type | Default | Description |
|---|---|---|---|
| whizard-telemetry-ruler.config.sinks[].name | string | | Sink name |
| whizard-telemetry-ruler.config.sinks[].type | string | | Sink type (webhook, etc.) |
| whizard-telemetry-ruler.config.sinks[].config.url | string | | Webhook URL |
| Parameter | Type | Default | Description |
|---|---|---|---|
| global.alertingPersistence.enabled | bool | false | Enable alert persistence |
| alerting-persistence.sinks.opensearch.enabled | bool | false | Enable OpenSearch sink for alerts |
| alerting-persistence.sinks.opensearch.ism_policy.enable | bool | true | Enable ISM policy |
| alerting-persistence.sinks.opensearch.ism_policy.min_index_age | string | "7d" | Minimum index retention period |
| Parameter | Type | Default | Description |
|---|---|---|---|
| whizard-telemetry-ruler.resources.limits.cpu | string | 2 | ruler CPU limit |
| whizard-telemetry-ruler.resources.limits.memory | string | 4Gi | ruler memory limit |
| whizard-telemetry-ruler.resources.requests.cpu | string | 100m | ruler CPU request |
| whizard-telemetry-ruler.resources.requests.memory | string | 20Mi | ruler memory request |
| whizard-telemetry-ruler.kubectl.resources.limits.cpu | string | 100m | kubectl CPU limit |
| whizard-telemetry-ruler.kubectl.resources.limits.memory | string | 256Mi | kubectl memory limit |
| whizard-telemetry-ruler.kubectl.resources.requests.cpu | string | 100m | kubectl CPU request |
| whizard-telemetry-ruler.kubectl.resources.requests.memory | string | 256Mi | kubectl memory request |
| Parameter | Type | Default | Description |
|---|---|---|---|
| whizard-telemetry-ruler.nodeSelector | map | {} | Node selector |
| whizard-telemetry-ruler.tolerations | list | [] | Tolerations |
| whizard-telemetry-ruler.affinity | map | {} | Affinity |
curl -X GET "http://whizard-telemetry-apiserver.extension-whizard-telemetry.svc:80/kapis/logging.whizard.io/v1alpha1/namespaces/<namespace>/rulegroups?clusterName=host" \
  -H "X-Remote-User: admin"

curl -X GET "http://whizard-telemetry-apiserver.extension-whizard-telemetry.svc:80/kapis/logging.whizard.io/v1alpha1/namespaces/<namespace>/rulegroups/<name>?clusterName=host" \
  -H "X-Remote-User: admin"

curl -X POST "http://whizard-telemetry-apiserver.extension-whizard-telemetry.svc:80/kapis/logging.whizard.io/v1alpha1/namespaces/<namespace>/rulegroups?clusterName=host" \
  -H "X-Remote-User: admin" \
  -H "Content-Type: application/json" \
  -d '{
    "apiVersion": "logging.whizard.io/v1alpha1",
    "kind": "RuleGroup",
    "metadata": {
      "name": "<rulegroup-name>",
      "namespace": "<namespace>"
    },
    "spec": {
      "type": "events",
      "rules": [
        {
          "name": "test-rule",
          "desc": "Test rule",
          "enable": true,
          "expr": {
            "kind": "rule",
            "condition": "reason == \"FailedCreatePodSandBox\""
          },
          "alerts": {
            "severity": "warning",
            "message": "Pod sandbox creation failed",
            "labels": {
              "alert": "test"
            },
            "annotations": {
              "summary": "Pod sandbox creation failed"
            }
          }
        }
      ]
    }
  }'

curl -X PUT "http://whizard-telemetry-apiserver.extension-whizard-telemetry.svc:80/kapis/logging.whizard.io/v1alpha1/namespaces/<namespace>/rulegroups/<name>?clusterName=host" \
  -H "X-Remote-User: admin" \
  -H "Content-Type: application/json" \
  -d '<UPDATED_RULEGROUP>'

curl -X DELETE "http://whizard-telemetry-apiserver.extension-whizard-telemetry.svc:80/kapis/logging.whizard.io/v1alpha1/namespaces/<namespace>/rulegroups/<name>?clusterName=host" \
  -H "X-Remote-User: admin"

curl -X GET "http://whizard-telemetry-apiserver.extension-whizard-telemetry.svc:80/kapis/logging.whizard.io/v1alpha1/clusterrulegroups?clusterName=host" \
  -H "X-Remote-User: admin"

curl -X GET "http://whizard-telemetry-apiserver.extension-whizard-telemetry.svc:80/kapis/logging.whizard.io/v1alpha1/clusterrulegroups/<name>?clusterName=host" \
  -H "X-Remote-User: admin"

curl -X POST "http://whizard-telemetry-apiserver.extension-whizard-telemetry.svc:80/kapis/logging.whizard.io/v1alpha1/clusterrulegroups?clusterName=host" \
  -H "X-Remote-User: admin" \
  -H "Content-Type: application/json" \
  -d '{
    "apiVersion": "logging.whizard.io/v1alpha1",
    "kind": "ClusterRuleGroup",
    "metadata": {
      "name": "<clusterrulegroup-name>"
    },
    "spec": {
      "type": "auditing",
      "rules": [
        {
          "name": "audit-rule",
          "desc": "Audit rule",
          "enable": true,
          "expr": {
            "kind": "rule",
            "condition": "verb == \"delete\""
          },
          "alerts": {
            "severity": "error",
            "message": "Delete operation detected",
            "labels": {
              "type": "audit"
            }
          }
        }
      ]
    }
  }'

curl -X DELETE "http://whizard-telemetry-apiserver.extension-whizard-telemetry.svc:80/kapis/logging.whizard.io/v1alpha1/clusterrulegroups/<name>?clusterName=host" \
  -H "X-Remote-User: admin"
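The curl calls above all follow one URL pattern, which can be sketched in Python with the standard library. The helper names (`rulegroup_url`, `build_request`) are hypothetical; the host, API path, `clusterName` query parameter, and `X-Remote-User` header are copied from the commands above. The request is only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request

BASE = "http://whizard-telemetry-apiserver.extension-whizard-telemetry.svc:80"
API = "/kapis/logging.whizard.io/v1alpha1"


def rulegroup_url(namespace=None, name=None, cluster="host"):
    """Namespaced RuleGroup path when a namespace is given, ClusterRuleGroup path otherwise."""
    if namespace:
        path = f"{API}/namespaces/{urllib.parse.quote(namespace)}/rulegroups"
    else:
        path = f"{API}/clusterrulegroups"
    if name:
        path += "/" + urllib.parse.quote(name)
    return f"{BASE}{path}?" + urllib.parse.urlencode({"clusterName": cluster})


def build_request(method, url, body=None, user="admin"):
    """Create (but do not send) a request with the headers used by the curl examples."""
    headers = {"X-Remote-User": user}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


req = build_request("GET", rulegroup_url(namespace="default"))
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it, which presumably only works from a network that can resolve the in-cluster service name.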
Query alerts with filters and time range:
curl -X POST "http://whizard-telemetry-apiserver.extension-whizard-telemetry.svc:80/kapis/events.alerting.wiztelemetry.io/v1alpha1/query" \
  -H "X-Remote-User: admin" \
  -H "Content-Type: application/json" \
  -d '{
    "cluster": "host",
    "startTime": 1704067200,
    "endTime": 1704153600,
    "from": 0,
    "size": 10,
    "order": "descending",
    "parameters": [
      {
        "field": "severity",
        "operator": "=",
        "value": "error"
      },
      {
        "field": "alertname",
        "operator": "?",
        "values": ["pod*"]
      }
    ]
  }'
Get alert statistics (overview, histogram, by severity, etc.):
curl -X POST "http://whizard-telemetry-apiserver.extension-whizard-telemetry.svc:80/kapis/events.alerting.wiztelemetry.io/v1alpha1/statistics" \
  -H "X-Remote-User: admin" \
  -H "Content-Type: application/json" \
  -d '{
    "cluster": "host",
    "statisticsType": 501,
    "startTime": 1704067200,
    "endTime": 1704153600
  }'
| Parameter | Type | Required | Description |
|---|---|---|---|
| cluster | string | Yes | Cluster name (e.g., host, member-1) |
| startTime | int64 | No | Start time (Unix timestamp), default: 30 days ago |
| endTime | int64 | No | End time (Unix timestamp), default: now |
| from | int64 | No | Offset for pagination, default: 0 |
| size | int64 | No | Number of results, default: 10 |
| order | string | No | Sort order: "ascending" or "descending", default: descending |
| statisticsType | int | No | Statistics type (see Statistics Type Values) |
| parameters | array | No | Filter parameters |
| Field | Type | Description |
|---|---|---|
| field | string | Field name to filter on |
| operator | string | Operator (see Filter Operators) |
| value | interface{} | Single value for =, !=, >, >=, <, <= |
| values | array | Multiple values for In, NotIn, ?, !? |
| Operator | Symbol | Description |
|---|---|---|
| Equals | = | Exact match |
| NotEquals | != | Not equal |
| Greater | > | Greater than |
| GreaterOrEqual | >= | Greater or equal |
| Less | < | Less than |
| LessOrEqual | <= | Less or equal |
| In | In | In list |
| NotIn | NotIn | Not in list |
| MatchesFuzzy | ? | Fuzzy match (supports * and ?) |
| NotMatchesFuzzy | !? | Not fuzzy match |
| MatchesRegex | ~ | Regex match |
| NotMatchesRegex | !~ | Not regex match |
| Exists | Exists | Field exists |
| NotExists | NotExists | Field does not exist |
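The split between `value` and `values` is easy to get wrong by hand, so a small payload builder may help. This is a sketch: `make_filter` and `build_query` are hypothetical helpers, and only the operators whose operand shape is stated in the filter table are handled. The regex and existence operators are rejected because their operand shape is not specified here.

```python
SINGLE_VALUE_OPERATORS = {"=", "!=", ">", ">=", "<", "<="}
MULTI_VALUE_OPERATORS = {"In", "NotIn", "?", "!?"}


def make_filter(field, operator, operand):
    """Build one entry of the `parameters` array with the right operand key."""
    if operator in SINGLE_VALUE_OPERATORS:
        return {"field": field, "operator": operator, "value": operand}
    if operator in MULTI_VALUE_OPERATORS:
        values = list(operand) if isinstance(operand, (list, tuple)) else [operand]
        return {"field": field, "operator": operator, "values": values}
    raise ValueError(f"operand shape for operator {operator!r} is not covered by this sketch")


def build_query(cluster, filters=(), **options):
    """Assemble a query body; `options` carries fields such as size or startTime."""
    payload = {"cluster": cluster, **options}
    if filters:
        payload["parameters"] = [make_filter(*f) for f in filters]
    return payload


query = build_query(
    "host",
    [("severity", "=", "error"), ("alertname", "?", ["pod*"])],
    size=20,
)
```

Serializing `query` with `json.dumps` gives a body that can be passed to `curl -d` against the query endpoint.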
| Type | Value | Description |
|---|---|---|
| StatisticsEventsAlertingNone | 500 | No statistics |
| StatisticsEventsAlertingDateHistogram | 501 | Time histogram |
| StatisticsEventsAlertingOverview | 502 | Overview count |
| StatisticsEventsAlertingByNamespace | 503 | By namespace |
| StatisticsEventsAlertingByRuleGroup | 504 | By rule group |
| StatisticsEventsAlertingByAlertName | 505 | By alert name |
| StatisticsEventsAlertingByAlertType | 506 | By alert type |
| StatisticsEventsAlertingBySeverity | 507 | By severity |
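To avoid passing bare numbers around, the statistics type values in the table can be mirrored in an enum. A minimal sketch follows; the Python member names and `build_statistics_request` are hypothetical, while the numeric values are taken from the table.

```python
from enum import IntEnum


class StatisticsType(IntEnum):
    """Numeric statisticsType values, copied from the table above."""
    NONE = 500
    DATE_HISTOGRAM = 501
    OVERVIEW = 502
    BY_NAMESPACE = 503
    BY_RULE_GROUP = 504
    BY_ALERT_NAME = 505
    BY_ALERT_TYPE = 506
    BY_SEVERITY = 507


def build_statistics_request(cluster, kind, start_time=None, end_time=None):
    """Assemble the body for the statistics endpoint; omitted times use the server defaults."""
    payload = {"cluster": cluster, "statisticsType": int(kind)}
    if start_time is not None:
        payload["startTime"] = start_time
    if end_time is not None:
        payload["endTime"] = end_time
    return payload


body = build_statistics_request("host", StatisticsType.BY_SEVERITY, 1704067200, 1704153600)
```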
| Field | Type | Description |
|---|---|---|
| alertname | string | Alert name |
| severity | string | Alert severity (info, warning, error, critical) |
| namespace | string | Namespace |
| rulegroup | string | Rule group name |
| cluster | string | Cluster name |
| rulekind | string | Rule kind (RuleGroup, ClusterRuleGroup) |
| ruletype | string | Rule type (events, auditing, logs) |
| alerttype | string | Alert type |
| labels | string | Alert labels (JSON string) |
| annotations | string | Alert annotations (JSON string) |
| firing | bool | Is firing |
| pending | bool | Is pending |
| inhibited | bool | Is inhibited |
| silenced | bool | Is silenced |
| startsat | int64 | Start timestamp |
| endsat | int64 | End timestamp |
| updatedat | int64 | Update timestamp |
{
  "cluster": "host",
  "parameters": [
    {
      "field": "severity",
      "operator": "=",
      "value": "error"
    }
  ],
  "size": 20
}

{
  "cluster": "host",
  "startTime": 1704067200,
  "endTime": 1704153600,
  "size": 100
}

{
  "cluster": "host",
  "parameters": [
    {
      "field": "namespace",
      "operator": "=",
      "value": "default"
    },
    {
      "field": "rulegroup",
      "operator": "=",
      "value": "my-rule-group"
    }
  ]
}

{
  "cluster": "host",
  "parameters": [
    {
      "field": "alertname",
      "operator": "?",
      "values": ["pod*", "container*"]
    }
  ]
}

{
  "cluster": "host",
  "startTime": 1704067200,
  "endTime": 1704153600,
  "statisticsType": 507
}
| Parameter | Type | Description |
|---|---|---|
| cluster | string | Cluster name, empty means host cluster |
| name | string | Name used for filtering |
| labelSelector | string | Label selector used for filtering |
| status | string | Filter by enabled status (true or false) |
| builtin | string | Filter by builtin status (true or false) |
| type | string | Filter by type (logs, events, auditing) |
| page | int | Page number |
| limit | int | Items per page |
| orderBy | string | Sort parameter (e.g., createTime) |
| ascending | bool | Sort order |
| Type | Description |
|---|---|
| events | K8s native events |
| auditing | K8s/KubeSphere auditing events |
| logs | K8s container logs |
| Severity | Description |
|---|---|
| info | Informational |
| warning | Warning |
| error | Error |
| critical | Critical |
| Kind | Description |
|---|---|
| rule | Regular rule with condition |
| macro | Macro rule |
| list | List rule |
| alias | Alias rule |
The condition field selects the events, auditing records, or logs that a rule matches. It is an expression built from field references and the operators below.
| Operator | Description | Example |
|---|---|---|
| == | Equals | reason == "FailedCreatePodSandBox" |
| != | Not equals | verb != "delete" |
| =~ | Regex match | message =~ "error.*failed" |
| !~ | Not regex match | message !~ "debug" |
| && | AND | reason == "Failed" && type == "Warning" |
| \|\| | OR | reason == "Failed" \|\| reason == "Error" |
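Because a condition is a string embedded in JSON, the inner double quotes must be escaped when the rule is serialized. The sketch below composes conditions from the `==`, `!=`, and `&&` operators in the table and lets `json.dumps` do the escaping. The helper names are hypothetical, and values containing double quotes are rejected because the escaping rules of the condition language itself are not described here.

```python
import json


def _quote(value: str) -> str:
    """Wrap a string operand in double quotes, as in the examples in the table."""
    if '"' in value:
        raise ValueError("values containing double quotes are not handled by this sketch")
    return f'"{value}"'


def eq(field: str, value: str) -> str:
    return f"{field} == {_quote(value)}"


def ne(field: str, value: str) -> str:
    return f"{field} != {_quote(value)}"


def all_of(*conditions: str) -> str:
    """Join conditions with the AND operator."""
    return " && ".join(conditions)


condition = all_of(eq("type", "Warning"), eq("involvedObject.namespace", "default"))
expr = {"kind": "rule", "condition": condition}
print(json.dumps(expr))
```

Mixing `&&` and `||` is left out on purpose, since operator precedence and grouping are not specified in this document.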
Events Fields (type: events):
| Field | Type | Description |
|---|---|---|
| reason | string | Event reason (e.g., FailedCreatePodSandBox) |
| type | string | Event type (Normal, Warning) |
| involvedObject.kind | string | Object kind (Pod, Deployment, etc.) |
| involvedObject.name | string | Object name |
| involvedObject.namespace | string | Object namespace |
| message | string | Event message |
| source | string | Event source component |
| count | int | Event count |
Events Examples:
// Alert when pod sandbox creation fails
"condition": "reason == \"FailedCreatePodSandBox\""
// Alert on warning events for specific namespace
"condition": "type == \"Warning\" && involvedObject.namespace == \"default\""
// Alert when event count exceeds threshold
"condition": "count >= 5"
Auditing Fields (type: auditing):
| Field | Type | Description |
|---|---|---|
| verb | string | HTTP verb (get, post, put, delete, etc.) |
| user | string | Username |
| sourceIPs | string | Source IP addresses |
| resource.group | string | Resource API group |
| resource.version | string | Resource API version |
| resource.resource | string | Resource type (pods, deployments, etc.) |
| objectRef.name | string | Object name |
| objectRef.namespace | string | Object namespace |
| responseStatus.code | int | HTTP response code |
| level | string | Audit level (None, Metadata, Request, RequestResponse) |
Auditing Examples:
// Alert on delete operations
"condition": "verb == \"delete\""
// Alert on failed requests (4xx/5xx)
"condition": "responseStatus.code >= 400"
// Alert on specific user activity
"condition": "user == \"admin\" && verb == \"delete\""
// Alert on sensitive resources
"condition": "resource.resource == \"secrets\""
Logs Fields (type: logs):
| Field | Type | Description |
|---|---|---|
| log | string | Log message content |
| container | string | Container name |
| pod | string | Pod name |
| namespace | string | Namespace name |
| cluster | string | Cluster name |
Logs Examples:
// Alert on error keyword
"condition": "log contains \"error\""
// Alert on specific container
"condition": "container == \"nginx\""
// Alert on OOM kills
"condition": "log contains \"OOMKilled\""
// Alert on multiple keywords
"condition": "log contains \"failed\" && log contains \"connection\""
// Alert using regex
"condition": "log =~ \"error.*timeout|timeout.*error\""
Macros allow reusable expressions:
{
  "expr": {
    "kind": "macro",
    "macro": "high_error_rate"
  }
}
Lists allow grouping values:
{
  "expr": {
    "kind": "list",
    "list": ["error", "warning", "critical"]
  }
}
Aliases provide descriptive names for complex expressions:
{
  "expr": {
    "kind": "alias",
    "alias": "Pod_Sandbox_Failure"
  }
}
For log alerting with sliding window:
{
  "name": "log-rate-rule",
  "desc": "Log rate alert",
  "enable": true,
  "expr": {
    "kind": "rule",
    "condition": "log contains \"error\""
  },
  "alerts": {
    "severity": "error",
    "message": "High error log rate"
  },
  "slidingWindow": {
    "windowSize": "5m",
    "slidingInterval": "1m",
    "count": 100
  }
}
| Parameter | Type | Description |
|---|---|---|
| slidingWindow.windowSize | string | Window size (e.g., "300ms", "5m") |
| slidingWindow.slidingInterval | string | Slide step (must be less than windowSize) |
| slidingWindow.count | int | Count threshold to trigger alert |
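The constraint that `slidingInterval` must be less than `windowSize` can be checked before a rule is submitted. This is a sketch only: it assumes durations are a whole number followed by one of `ms`, `s`, `m`, or `h`, which covers the examples in the table but may be narrower than what the ruler accepts. Both function names are hypothetical.

```python
import re

_UNIT_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000}


def parse_duration_ms(text: str) -> int:
    """Convert a duration such as '300ms' or '5m' to milliseconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h)", text)
    if not match:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_MS[match.group(2)]


def validate_sliding_window(window: dict) -> dict:
    """Return the window unchanged if the slide step is shorter than the window."""
    size = parse_duration_ms(window["windowSize"])
    step = parse_duration_ms(window["slidingInterval"])
    if step >= size:
        raise ValueError("slidingInterval must be less than windowSize")
    return window


validate_sliding_window({"windowSize": "5m", "slidingInterval": "1m", "count": 100})
```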
kubectl get installplan whizard-telemetry-ruler
kubectl get extensionversions -l kubesphere.io/extension-ref=whizard-telemetry-ruler
Uninstall from all clusters:
kubectl delete installplan whizard-telemetry-ruler
Uninstall from specific cluster:
To remove WizTelemetry Ruler from a specific cluster, update the InstallPlan by removing that cluster from clusterScheduling.placement.clusters:
apiVersion: kubesphere.io/v1alpha1
kind: InstallPlan
metadata:
  name: whizard-telemetry-ruler
  namespace: kubesphere-system
spec:
  extension:
    name: whizard-telemetry-ruler
    version: <VERSION>
  enabled: true
  upgradeStrategy: Manual
  clusterScheduling:
    placement:
      clusters:
        - <REMAINING_CLUSTERS> # Remove the cluster you want to uninstall from
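Computing the remaining cluster list is simple, but two edge cases are worth guarding: removing a cluster that is not in the placement, and removing the last one. A small sketch, with `remaining_clusters` as a hypothetical helper name:

```python
def remaining_clusters(current: list[str], to_remove: str) -> list[str]:
    """Return the placement list without `to_remove`, refusing to leave it empty."""
    if to_remove not in current:
        raise ValueError(f"{to_remove!r} is not in the current placement")
    remaining = [name for name in current if name != to_remove]
    if not remaining:
        raise ValueError("no clusters would remain; delete the InstallPlan to uninstall from all clusters")
    return remaining


print(remaining_clusters(["host", "dev"], "dev"))
```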
To send alerts through WizTelemetry Notification extension, configure the sink URL to point to the alertmanager-proxy service.
Auto-detection (recommended): Use the command from Step 3 to get the host node IP automatically.
whizard-telemetry-ruler:
  config:
    sinks:
      - name: alertmanager
        type: webhook
        config:
          url: http://<ALERT_MANAGER_HOST>:31093/api/v1/alerts
- <ALERT_MANAGER_HOST>: From Step 3 (auto-detected or user-confirmed)
- alertmanager-proxy in kubesphere-system namespace