health

Comprehensive health assessment of deployed SPI environments — cluster infrastructure, Azure PaaS resources, workloads, and OSDU platform services. Use when the user asks about SPI environment health, cluster status, Azure PaaS health, CosmosDB status, Service Bus health, or wants a report on their deployed SPI environments. Trigger on phrases like "report on my SPI environment", "environment health", "how is my cluster", "cluster status", "is CosmosDB healthy", or "what's deployed". Not for: deploying or modifying infrastructure (use the iac skill), fork management (use the forks skill), installing tools (use the setup skill), or CIMPL environments (use cimpl:health).

Install command

npx skills add https://github.com/danielscholl/claude-osdu --skill health

Copy this command and paste it into Claude Code to install the skill.

Source

danielscholl/claude-osdu

Stars: 1

Forks: 0

Updated: 2026-03-31 00:48

SKILL.md

name	health
allowed-tools	Bash, Read, Glob
description	Comprehensive health assessment of deployed SPI environments — cluster infrastructure, Azure PaaS resources, workloads, and OSDU platform services. Use when the user asks about SPI environment health, cluster status, Azure PaaS health, CosmosDB status, Service Bus health, or wants a report on their deployed SPI environments. Trigger on phrases like "report on my SPI environment", "environment health", "how is my cluster", "cluster status", "is CosmosDB healthy", or "what's deployed". Not for: deploying or modifying infrastructure (use the iac skill), fork management (use the forks skill), installing tools (use the setup skill), or CIMPL environments (use cimpl:health).

SPI Environment Health Report

Comprehensive health assessment of deployed SPI environments — cluster infrastructure, Azure PaaS resources, workloads, and OSDU platform services.

The Iron Law

EVERY HEALTH REPORT MUST USE LIVE DATA — NEVER ASSUME STATUS

Quick Start

kubectl version --client && az version

If either is not found, stop and use the setup skill.
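
Later phases also call `kubelogin` and `helm`, so a slightly broader preflight can save a failed run halfway through. The following is a minimal sketch, not part of the skill itself; the function name `missing_tools` is hypothetical.

```shell
# Print the name of every requested CLI tool that is not on PATH.
# Returns 1 if at least one tool is missing, 0 otherwise.
# NOTE: missing_tools is an illustrative helper, not part of the skill.
missing_tools() {
  local tool rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  return "$rc"
}
```

A possible call is `missing_tools kubectl az kubelogin helm || echo "Stop and use the setup skill"`.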

Report Procedure

Follow phases in order. Do NOT skip phases.

Phase 1: Enumerate Environments

# List azd environments
ls -d .azure/*/

# For each, extract key config
grep -E "^(AZURE_ENV_NAME|AZURE_LOCATION|AZURE_RESOURCE_GROUP)" .azure/<env>/.env
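
The "for each" step above can be sketched as a loop over the environment directories. This is only an illustration: the function name `list_spi_envs` is hypothetical, and it assumes every azd environment keeps its settings in `<base>/<env>/.env` as the commands above imply.

```shell
# Print the key settings of every azd environment under a base directory.
# The base directory defaults to .azure, matching the listing above.
# NOTE: list_spi_envs is an illustrative helper, not part of the skill.
list_spi_envs() {
  local base="${1:-.azure}" env_file env_name
  for env_file in "$base"/*/.env; do
    [ -f "$env_file" ] || continue   # glob matched nothing, or no .env file
    env_name=$(basename "$(dirname "$env_file")")
    echo "=== $env_name ==="
    # '|| true' keeps the loop going when a file has none of the keys
    grep -E "^(AZURE_ENV_NAME|AZURE_LOCATION|AZURE_RESOURCE_GROUP)" "$env_file" || true
  done
}
```

Running `list_spi_envs` from the repository root prints one block per environment.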

Phase 2: Connect to Cluster

az aks get-credentials -g <resource-group> -n <cluster-name>
kubelogin convert-kubeconfig -l azurecli

If the connection fails, report the failure and skip to the next environment.
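
One way to make the skip-on-failure rule explicit is to wrap both commands in a function and test its exit status. This is a sketch under assumptions: `connect_cluster` is a hypothetical name, and the resource group and cluster name are expected to come from Phase 1.

```shell
# Fetch credentials for one cluster and convert the kubeconfig.
# Returns non-zero, with a message on stderr, if either step fails.
# NOTE: connect_cluster is an illustrative helper, not part of the skill.
connect_cluster() {
  local rg="$1" cluster="$2"
  if az aks get-credentials -g "$rg" -n "$cluster" \
     && kubelogin convert-kubeconfig -l azurecli; then
    return 0
  fi
  echo "Connection failed: $cluster (resource group $rg); skipping" >&2
  return 1
}
```

Inside a loop over environments, `connect_cluster "$RG" "$CLUSTER" || continue` would move on to the next environment after reporting the failure.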

Phase 3: Cluster Infrastructure Health

# Node status
kubectl get nodes -o wide

# Pod health
kubectl get pods -A --no-headers | grep -v Running | grep -v Completed

# Resource pressure
kubectl top nodes 2>/dev/null || echo "Metrics server not available"
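
The summary in Phase 7 reports nodes as "X/Y Ready". A small filter can derive that figure from the node listing. This is a sketch only: `count_ready_nodes` is a hypothetical name, and it counts a node as ready only when its STATUS column is exactly `Ready`, so a cordoned node (`Ready,SchedulingDisabled`) is treated as not ready.

```shell
# Read `kubectl get nodes --no-headers` output on stdin and print "X/Y Ready".
# Column 2 is STATUS; only an exact "Ready" counts as ready (an assumption).
# NOTE: count_ready_nodes is an illustrative helper, not part of the skill.
count_ready_nodes() {
  awk '{ total++; if ($2 == "Ready") ready++ }
       END { printf "%d/%d Ready\n", ready, total }'
}
```

Example use: `kubectl get nodes --no-headers | count_ready_nodes`.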

Phase 4: Azure PaaS Health

This phase is unique to SPI. Check all Azure PaaS resources:

RG="<resource-group>"

# CosmosDB accounts
echo "=== CosmosDB ==="
az cosmosdb list -g "$RG" --query "[].{name:name, kind:kind, state:provisioningState}" -o table

# Service Bus namespaces
echo "=== Service Bus ==="
az servicebus namespace list -g "$RG" --query "[].{name:name, status:status}" -o table

# Storage accounts
echo "=== Storage ==="
az storage account list -g "$RG" --query "[].{name:name, status:statusOfPrimary, kind:kind}" -o table

# Key Vault
echo "=== Key Vault ==="
az keyvault list -g "$RG" --query "[].{name:name, state:properties.provisioningState}" -o table

Per-partition checks:

# CosmosDB SQL databases per partition
for acct in $(az cosmosdb list -g "$RG" --query "[?kind=='GlobalDocumentDB' && !contains(capabilities[].name, 'EnableGremlin')].name" -o tsv); do
  echo "=== $acct ==="
  az cosmosdb sql database list --account-name "$acct" -g "$RG" --query "[].{name:id}" -o table
  az cosmosdb sql container list --account-name "$acct" -g "$RG" --database-name "osdu-db" --query "[].{name:id}" -o table 2>/dev/null
done

# Service Bus topics per partition
for ns in $(az servicebus namespace list -g "$RG" --query "[].name" -o tsv); do
  TOPIC_COUNT=$(az servicebus topic list --namespace-name "$ns" -g "$RG" --query "length(@)" -o tsv 2>/dev/null || echo "?")
  echo "$ns: $TOPIC_COUNT topics"
done

Report format:

Resource	Instance	Status	Details
CosmosDB Gremlin	spi-env-graph	Succeeded	Entitlements graph
CosmosDB SQL	spi-env-opendes	Succeeded	24 containers
Service Bus	spi-env-opendes-bus	Active	14 topics
Storage (common)	spienvstorage	available	8 containers
Storage (partition)	spienvopendes	available	5 containers
Key Vault	spi-env-kv	Succeeded	Accessible

Phase 5: Workload Health

# Helm releases
helm list -A --no-headers 2>/dev/null

# Key namespace status
for ns in platform osdu osdu-core osdu-reference airflow cert-manager; do
  echo "=== $ns ==="
  kubectl get pods -n "$ns" --no-headers 2>/dev/null | awk '{print $3}' | sort | uniq -c
done

# Gateway / Ingress
kubectl get gateway -A 2>/dev/null
kubectl get httproute -A 2>/dev/null

# Certificates
kubectl get certificates -A 2>/dev/null

Phase 6: OSDU Platform Health

Use the OSDU MCP server tools if available:

osdu_health_check with include_services: true
osdu_partition_list
osdu_search_query (kind *:*:*:*, limit 1)
osdu_entitlements_mine

If the MCP server is not configured, skip this phase and note:

OSDU platform API health check skipped — MCP server not connected.

Phase 7: Summary

## SPI Environment Health: <env-name>

Overall: Healthy / Degraded / Unhealthy

### Infrastructure
- Nodes: X/Y Ready
- K8s: vX.Y.Z
- Pods: X running, Y failing

### Azure PaaS
- CosmosDB: X accounts (all Succeeded / N degraded)
- Service Bus: X namespaces (all Active / N degraded)
- Storage: X accounts (all available / N degraded)
- Key Vault: Accessible / Inaccessible

### Partitions
- opendes: CosmosDB 24 containers, Service Bus 14 topics, Storage 5 containers

### OSDU Services
- X/Y services healthy
- Key issues: [list any]

### Action Items
- [Any recommended actions]
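
The template offers three verdicts but does not say how to choose between them. The sketch below shows one possible rule, and the thresholds are an assumption rather than something the skill defines: any red flag means Unhealthy, failing pods without a red flag mean Degraded, and everything else is Healthy. The function name `overall_status` is hypothetical.

```shell
# Map two counts to one of the verdicts used in the summary template.
# ASSUMPTION: these thresholds are illustrative; the skill does not define them.
overall_status() {
  local red_flags="$1"      # number of red-flag signals observed
  local failing_pods="$2"   # number of pods that are not Running or Completed
  if [ "$red_flags" -gt 0 ]; then
    echo "Unhealthy"
  elif [ "$failing_pods" -gt 0 ]; then
    echo "Degraded"
  else
    echo "Healthy"
  fi
}
```

For example, `echo "Overall: $(overall_status "$FLAGS" "$FAILING")"` fills in the first line of the summary, where `FLAGS` and `FAILING` are counts collected in the earlier phases.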

Red Flags

Signal	Meaning
CosmosDB provisioningState != Succeeded	Database provisioning issue
Service Bus status != Active	Messaging disrupted
Storage statusOfPrimary != available	Object storage outage
Key Vault inaccessible	Secrets unavailable
Missing partition resources	Incomplete partition provisioning
Pods in CrashLoopBackOff	Application/config failure
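
The status comparisons in this table can be applied mechanically to tab-separated `name<TAB>status` lines, which is what `az ... -o tsv` emits for a two-column query. The following is a minimal sketch; `flag_unhealthy` is a hypothetical helper, and the message format is invented for illustration.

```shell
# Read "name<TAB>status" lines on stdin and print a line for each resource
# whose status differs from the healthy value. Exits 1 if any were flagged.
# NOTE: flag_unhealthy is an illustrative helper, not part of the skill.
flag_unhealthy() {
  local kind="$1" healthy="$2"
  awk -F '\t' -v kind="$kind" -v healthy="$healthy" '
    $2 != healthy {
      printf "RED FLAG: %s %s is %s (expected %s)\n", kind, $1, $2, healthy
      bad++
    }
    END { exit (bad > 0) }'
}
```

A possible call for CosmosDB is `az cosmosdb list -g "$RG" --query "[].[name, provisioningState]" -o tsv | flag_unhealthy CosmosDB Succeeded`; the same pattern applies to Service Bus (`status`, `Active`) and Storage (`statusOfPrimary`, `available`).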

Integration

Issues found → suggest iac skill to investigate
After fixes → re-run health check to verify
For CIMPL environments → use cimpl:health instead

More Skills from this repository

briefing

danielscholl/claude-osdu

Generate daily OSDU briefing notes by aggregating GitLab MRs, SPI fork health, vault goals, and brain knowledge into an insightful daily note. Use when the user says "gm", "good morning", "briefing", "daily briefing", "morning standup", "what's on my plate", or "start my day". Not for: ad-hoc status checks or single-service queries.

2026-04-01 · 1

forks

danielscholl/claude-osdu

Fork management lifecycle for Azure OSDU SPI service forks — three-branch strategy, upstream sync, cascade integration, conflict resolution, template propagation, and multi-repo status monitoring across 8 service forks. Use when checking sync status, reviewing cascades, resolving conflicts, triggering manual syncs, managing fork_upstream/fork_integration/main branches, or coordinating template updates across osdu-spi-* repos. Not for: infrastructure deployment (use iac skill), OSDU GitLab services (use osdu plugin), or tool installation (use setup skill).

2026-03-31 · 1

iac

danielscholl/claude-osdu

Infrastructure as Code for Azure SPI — Terraform modules, Azure PaaS services (CosmosDB, Service Bus, Storage, Key Vault), Helm/Kustomize deployments, AKS Deployment Safeguards, azd integration, multi-partition support, and systematic debugging for the osdu-spi-infra repository. Use when working with SPI Terraform, Azure PaaS provisioning, azd up/down, Workload Identity, multi-partition resources, feature flags, blue/green stacks, deployment failures, or infrastructure verification. Not for: fork management (use the forks skill), CIMPL infrastructure (use cimpl:iac), OSDU platform services, or tool installation (use setup skill).

2026-03-31 · 1

setup

danielscholl/claude-osdu

Check and install CLI tool dependencies required by SPI skills. Use when the user says "setup", "check dependencies", "what do I need installed", or when a skill fails with "command not found". This does NOT set up a specific project. It ensures the tools needed by SPI skills are present on the machine. Not for: project-specific setup, IDE configuration, or environment provisioning.

2026-03-31 · 1

status

danielscholl/claude-osdu

Cross-repository status aggregation for SPI service forks — issues, pull requests, workflow runs, and alerts across all fork repos in a single dashboard. Highlights cascade-blocked issues, human-required labels, failing workflows, and pending sync/template-sync PRs. Configurable org via SPI_ORG. Use when checking overall fork health, asking what's open across repos, looking for blocked cascades, pending reviews, needing a dashboard of all SPI fork activity, or monitoring the state of the engineering system. Trigger on "status", "dashboard", "what's open", "any blocked cascades", "pending PRs across forks", "fork health overview", "failing workflows", "issues across repos". Not for: triggering syncs or cascades (use forks skill), infrastructure health or cluster status (use health skill), deploying (use iac skill), tool installation (use setup skill).

2026-03-31 · 1

send

danielscholl/claude-osdu

Ship local changes through a review-commit-push workflow. Performs a lite code review, runs quality checks, commits with a conventional commit message, pushes to remote, and creates a merge request (GitLab) or pull request (GitHub). Auto-detects the remote platform and available tools (worktrunk, aipr). Use when the user wants to send, ship, submit, or push their work, create an MR or PR, or says "send it", "ship it", "push my changes", or "I'm done, send this up". Not for: reviewing someone else's MR (use mr-review), contributing to another developer's MR (use contribute), or setting up tools (use setup).

2026-03-30 · 1


Applicable occupation (SOC)

Network and Computer Systems Administrators · Computer and Mathematical Occupations · 15-1244 · L4
