degraded-operator-recovery
// Troubleshoot ClusterOperator in Degraded, Unavailable, or not Progressing state. Use when operator status shows error conditions, reconciliation failures, or degraded health checks.
| name | degraded-operator-recovery |
| description | Troubleshoot ClusterOperator in Degraded, Unavailable, or not Progressing state. Use when operator status shows error conditions, reconciliation failures, or degraded health checks. |
When a user reports unhealthy cluster operators or a stuck upgrade, follow this structured approach to identify the blocking condition and provide recovery steps.
Start by listing all cluster operators and their status conditions with `oc get clusteroperators`, then look for:
- Operators with `Degraded=True` or `Available=False`.
- Operators with `Progressing=True`: these may be mid-reconciliation and need time, not intervention.
- Dependency chains: if `kube-apiserver` is degraded, other operators that depend on the API server will also report issues.

Focus on the root cause operator, the one whose degradation is not explained by another operator's failure.
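
The triage above can be sketched in Python. This is a minimal sketch, assuming the output of `oc get clusteroperators -o json` has already been parsed into a dict. The helper names `condition_status` and `triage_operators` are hypothetical and not part of any OpenShift tooling.

```python
def condition_status(operator: dict, cond_type: str) -> str:
    """Return the status string of one condition type, or "Unknown" if absent."""
    for cond in operator.get("status", {}).get("conditions", []):
        if cond.get("type") == cond_type:
            return cond.get("status", "Unknown")
    return "Unknown"


def triage_operators(operator_list: dict) -> dict:
    """Split operators into unhealthy, progressing-only, and healthy buckets."""
    buckets = {"unhealthy": [], "progressing": [], "healthy": []}
    for op in operator_list.get("items", []):
        name = op["metadata"]["name"]
        degraded = condition_status(op, "Degraded") == "True"
        unavailable = condition_status(op, "Available") == "False"
        progressing = condition_status(op, "Progressing") == "True"
        if degraded or unavailable:
            # Candidates for root cause analysis.
            buckets["unhealthy"].append(name)
        elif progressing:
            # Possibly mid-reconciliation: wait before intervening.
            buckets["progressing"].append(name)
        else:
            buckets["healthy"].append(name)
    return buckets
```

The sketch does not model dependencies between operators. Deciding which unhealthy operator is the root cause still requires reading the condition messages.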
For each degraded operator:
- Read the status conditions: the `message` field on the `Degraded` condition usually contains the specific error.
- Check `lastTransitionTime` to understand how long the operator has been in this state.

Do not skip this step. The condition message is the single most informative piece of data.
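
As an illustration of reading those two fields, the sketch below pulls the message and computes how long the operator has been degraded. It assumes a single ClusterOperator object parsed from `oc get clusteroperator <name> -o json`, with timestamps in the usual `YYYY-MM-DDTHH:MM:SSZ` form. The function name `degraded_details` is hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def degraded_details(operator: dict, now: Optional[datetime] = None) -> Optional[dict]:
    """Return reason, message, and age of a Degraded=True condition, else None."""
    now = now or datetime.now(timezone.utc)
    for cond in operator.get("status", {}).get("conditions", []):
        if cond.get("type") == "Degraded" and cond.get("status") == "True":
            # Assumes a UTC timestamp such as 2024-05-01T12:00:00Z.
            since = datetime.strptime(
                cond["lastTransitionTime"], "%Y-%m-%dT%H:%M:%SZ"
            ).replace(tzinfo=timezone.utc)
            return {
                "reason": cond.get("reason", ""),
                "message": cond.get("message", ""),
                "degraded_for": now - since,  # a datetime.timedelta
            }
    return None
```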
If the condition message is not sufficient:
- Inspect the operator's own namespace (operators run in the `openshift-*` namespaces).

If this is happening during or after a cluster upgrade:
- Check `oc get clusterversion` for the upgrade status and any reported failures.
- If operators show `Progressing=True` during upgrade: advise waiting (up to the operator's expected rollout window) before intervening.

Distinguish between "upgrade in progress" (normal) and "upgrade stuck" (needs intervention).
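
One rough way to express that distinction in code is shown below. It assumes the single ClusterVersion object from `oc get clusterversion version -o json`, parsed into a dict. Treating `Failing=True` as "stuck" is a simplifying assumption made for this sketch, and the function name `upgrade_state` and its return labels are hypothetical.

```python
def upgrade_state(cluster_version: dict) -> str:
    """Classify a ClusterVersion as "stuck", "in-progress", or "idle"."""
    conditions = {
        cond.get("type"): cond.get("status")
        for cond in cluster_version.get("status", {}).get("conditions", [])
    }
    if conditions.get("Failing") == "True":
        # Assumption: a reported failure means the upgrade needs attention.
        return "stuck"
    if conditions.get("Progressing") == "True":
        # Normal rollout: advise waiting within the expected window.
        return "in-progress"
    return "idle"
```

A fuller check would also compare how long `Progressing=True` has held against the expected rollout window, which this sketch leaves out.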
Once the blocking condition is identified:
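- If pending certificate signing requests are the blocker, approve them with `oc adm certificate approve`.
- Always start with `oc get clusteroperators` before diving deeper: this prevents chasing symptoms of a different root cause.
- If an operator shows `Progressing=True`, advise waiting before intervening: premature action can make the situation worse.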