debug-rbac

Last updated: 26 May 2026, 11:17

Diagnose OpenShift RBAC permission failures that cause workloads to fail with 403 Forbidden errors when accessing the Kubernetes API. Automates multi-step diagnosis: pod logs for FORBIDDEN errors, readiness probe failures, ServiceAccount identification, RoleBinding/ClusterRoleBinding analysis, and remediation history for regression detection.

Use when:
- "403 forbidden when accessing Kubernetes API"
- "ServiceAccount permission denied"
- "pods can't list resources"
- "missing RoleBinding"
- User mentions "RBAC denied", "403 forbidden", "permission denied"

NOT for SCC admission failures (use /debug-scc instead).

Source

RHEcosystemAppEng

RHEcosystemAppEng/agentic-collections

| Field | Value |
|-------|-------|
| name | debug-rbac |
| description | Diagnose OpenShift RBAC permission failures that cause workloads to fail with 403 Forbidden errors when accessing the Kubernetes API. Automates multi-step diagnosis: pod logs for FORBIDDEN errors, readiness probe failures, ServiceAccount identification, RoleBinding/ClusterRoleBinding analysis, and remediation history for regression detection. Use when: - "403 forbidden when accessing Kubernetes API" - "ServiceAccount permission denied" - "pods can't list resources" - "missing RoleBinding" - User mentions "RBAC denied", "403 forbidden", "permission denied" NOT for SCC admission failures (use /debug-scc instead). |
| model | inherit |
| color | cyan |
| license | Apache-2.0 |
| allowed-tools | resources_get resources_list events_list pods_list pods_log |
| metadata | {"user_invocable":"true"} |

# /debug-rbac Skill

Diagnose RBAC permission failures on OpenShift by analyzing pod logs, readiness probes, ServiceAccount bindings, and Role/RoleBinding configuration.

## Critical: Human-in-the-Loop Requirements

**Before creating Role or RoleBinding resources**
- Display preview: the exact RBAC resources that will be created and what permissions they grant
- Ask: "Should I create these RBAC resources?"
- Wait for confirmation (yes/no)

**Before binding broad ClusterRoles** (e.g., `view`, `edit`, `admin`)
- Display warning: broad ClusterRoles grant more permissions than the minimum required
- Ask: "Proceed with broad ClusterRole binding, or create a minimal custom Role instead?"
- Wait for confirmation

**Never assume approval.** Always wait for explicit confirmation at each WAIT checkpoint.
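The confirmation rule above can be illustrated with a small gate function. This is only a sketch: the name `wait_for_confirmation` and the injectable `ask` callable are hypothetical and are not part of the skill or of any MCP tool. The point it shows is that anything other than an explicit "yes" is treated as a refusal.

```python
from typing import Callable


def wait_for_confirmation(question: str, ask: Callable[[str], str] = input) -> bool:
    """Return True only when the reply is an explicit 'yes'.

    `ask` is a hypothetical hook (defaults to the built-in input) so the
    gate can be exercised without a terminal.
    """
    reply = ask(f"{question} (yes/no) ").strip().lower()
    # Empty replies, "y", "ok", or anything else are NOT treated as approval.
    return reply == "yes"
```

A caller would create the Role or RoleBinding only when the function returns `True`, and stop otherwise.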

## Prerequisites

**Required MCP Servers**: `openshift` (setup)

**Required MCP Tools**:
- `resources_get` (from openshift): Retrieve Deployment, Pod, ServiceAccount, Role, and RoleBinding details
- `resources_list` (from openshift): List Deployments, RoleBindings, and ClusterRoleBindings in a namespace
- `pods_list` (from openshift): List pods for a Deployment
- `pods_log` (from openshift): Retrieve container logs to identify FORBIDDEN errors
- `events_list` (from openshift): Fetch warning events related to RBAC failures

**Verification Steps**:
1. Check that the `openshift` server is configured in `mcps.json`
2. Verify that the user is logged into an OpenShift cluster (`oc whoami` succeeds)
3. Verify that the user has access to the target namespace
4. If any check fails, follow the Human Notification Protocol

**Human Notification Protocol**:

When prerequisites fail:

1. Stop immediately: make no tool calls
2. Report the error:

   ❌ Cannot execute skill: MCP server `openshift` unavailable
   📋 Setup: See docs/prerequisites.md for cluster access configuration

3. Request a decision: "How to proceed? (setup/skip/abort)"
4. Wait for user input

**Security**: Never display credential values.

## When to Use This Skill

Use `/debug-rbac` when:
- A Deployment's pods are running but not ready, and pod logs show FORBIDDEN or 403 errors when calling the Kubernetes API
- Readiness probes fail because they check API access (e.g., `kubectl auth can-i`)
- Application logs show permission denied errors when interacting with Kubernetes resources

Do not use this skill when:
- Pods are blocked from being created entirely → use `/debug-scc` (SCC admission failures)
- Pods are crashing due to application bugs → use `/debug-pod`
- The issue is network connectivity → use `/debug-network`

## Workflow

[Identify Deployment] → [Check Pod Status + Logs] → [Identify RBAC Errors] → [Analyze ServiceAccount] → [Check RoleBindings] → [Summary + Fix]

### Step 1: Identify Target Deployment

**MCP Tool**: `resources_list` (from openshift)

**Parameters**:
- `kind`: "Deployment" (resource type)
- `namespace`: "<namespace>" (target namespace from user)

**Input Validation**: Verify that the deployment name and namespace conform to Kubernetes naming rules (lowercase alphanumeric and hyphens, 1-253 chars, RFC 1123). Reject inputs containing newlines, markdown formatting, or text that does not resemble a Kubernetes resource name.
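As a rough illustration of the validation rule above, the check can be written as a regular expression plus a length bound. The function name `is_valid_resource_name` is hypothetical, and the sketch follows only the rule as stated here; Kubernetes itself applies different limits per resource kind, so treat this as an input filter and not as a replacement for the API server's own validation.

```python
import re

# Lowercase alphanumerics and hyphens, starting and ending with an alphanumeric.
_NAME_RE = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def is_valid_resource_name(value: str) -> bool:
    """Hypothetical helper: accept only strings that look like a resource name."""
    if not 1 <= len(value) <= 253:
        return False
    # fullmatch rejects newlines, markdown characters, spaces, and uppercase.
    return _NAME_RE.fullmatch(value) is not None
```

Inputs such as `metrics-collector` pass, while `**bold**`, `Demo`, or a name containing a newline are rejected before any tool call is made.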

**Expected Output**: List of Deployments with their availability and readiness conditions.

**Error Handling**:
- If MCP server unavailable: follow the Human Notification Protocol
- If namespace not found: ask the user to confirm the namespace name
- If no deployments found: report that the namespace contains no Deployments and suggest verifying the namespace

Present to user:

## RBAC Debugging

**Current OpenShift Context:**
- Cluster: [cluster]
- Namespace: [namespace]

Which deployment would you like me to debug for RBAC issues?

1. **Specify deployment name** — Enter the deployment name directly
2. **List deployments with issues** — Show deployments with unavailable or not-ready pods
3. **Search recent events** — Find pods with RBAC-related warning events

Select an option or enter a deployment name:

WAIT for user confirmation before proceeding.

If user selects "List deployments with issues", filter to those with not-ready conditions:

## Deployments with Issues in [namespace]

| Deployment | Available | Desired | Condition |
|------------|-----------|---------|-----------|
| [deploy-name] | 0 | 1 | MinimumReplicasUnavailable |

Which deployment would you like me to debug?

WAIT for user confirmation before proceeding.
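The "List deployments with issues" option from Step 1 amounts to comparing available replicas against desired replicas. The sketch below assumes the `resources_list` output has already been parsed into dictionaries shaped like Deployment objects (`spec.replicas`, `status.availableReplicas`); that parsing step and the function name `deployments_with_issues` are assumptions, not something the skill defines.

```python
def deployments_with_issues(deployments):
    """Return one row per Deployment whose available replicas are below desired."""
    rows = []
    for deployment in deployments:
        # spec.replicas defaults to 1 when omitted.
        desired = deployment.get("spec", {}).get("replicas", 1)
        # status.availableReplicas is typically absent when no replica is available.
        available = deployment.get("status", {}).get("availableReplicas") or 0
        if available < desired:
            rows.append(
                {
                    "name": deployment["metadata"]["name"],
                    "available": available,
                    "desired": desired,
                }
            )
    return rows
```

Each returned row maps onto the Deployment, Available, and Desired columns of the table shown to the user.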

### Step 2: Check Pod Status and Logs

**MCP Tool**: `pods_list` (from openshift)

**Parameters**:
- `namespace`: "<namespace>"
- `labelSelector`: "<key>=<value>" (from Deployment `.spec.selector.matchLabels`)
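Turning `.spec.selector.matchLabels` into a `labelSelector` string is a simple join of `key=value` pairs. The helper name `to_label_selector` is hypothetical, and the sketch ignores `matchExpressions`, which would need separate handling.

```python
def to_label_selector(match_labels: dict) -> str:
    """Join matchLabels into a comma-separated equality selector."""
    # Sorting keeps the output stable regardless of dictionary order.
    return ",".join(f"{key}={value}" for key, value in sorted(match_labels.items()))
```

For a Deployment whose selector is `{"app": "metrics-collector"}`, the resulting selector is `app=metrics-collector`.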

Then for each matching pod:

**MCP Tool**: `resources_get` (from openshift)

**Parameters**:
- `kind`: "Pod" (resource type)
- `name`: "<pod-name>" (from `pods_list`)
- `namespace`: "<namespace>"

**MCP Tool**: `pods_log` (from openshift)

**Parameters**:
- `name`: "<pod-name>"
- `namespace`: "<namespace>"
- `tailLines`: 50 (integer, last N lines)

**Expected Output**: Pod status with readiness conditions, and container logs containing FORBIDDEN/403 error lines.

**Error Handling**:
- If no pods found: the Deployment may have zero replicas; check whether it has been scaled down
- If logs are empty: the container may not have started; check the container state
- If multiple pods: analyze the most recent one first

Present to user:

## Pod Analysis: [pod-name]

**Pod Status:**
| Field | Value |
|-------|-------|
| Phase | Running |
| Ready | false |
| Conditions | ContainersNotReady |
| Restart Count | [count] |

**Readiness Probe:**
| Field | Value |
|-------|-------|
| Type | [exec/httpGet/tcpSocket] |
| Command | [e.g., kubectl auth can-i list pods -n namespace] |
| Failure Count | [count] |
| Last Probe | [timestamp] |
| Message | [e.g., "probe returned: no"] |

**Container Logs (last 50 lines):**

[Highlight FORBIDDEN / 403 errors:]

| Timestamp | Error |
|-----------|-------|
| [time] | FORBIDDEN: pods is forbidden: User "system:serviceaccount:[ns]:[sa]" cannot list resource "pods" in API group "" in namespace "[ns]" |
| [time] | FORBIDDEN: pods is forbidden... (repeated) |

**Quick Assessment:**
[e.g., "Pod is running but readiness probe fails because the ServiceAccount cannot list pods. Logs confirm FORBIDDEN errors since [timestamp]."]

Continue with ServiceAccount analysis? (yes/no)

WAIT for user confirmation before proceeding.

### Step 3: Identify Required Permissions

Based on the FORBIDDEN error messages and the readiness probe command from Step 2, determine which permissions are needed. This is an analysis step: no additional MCP tool calls are required unless the log data is insufficient.

**Expected Output**: Table of required permissions extracted from FORBIDDEN error strings, plus a minimal Role definition.

**Error Handling**:
- If FORBIDDEN messages are ambiguous: request more log lines with an increased `tailLines`
- If no FORBIDDEN errors found: the issue may not be RBAC; suggest `/debug-pod` instead
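Extracting permissions from FORBIDDEN strings can be sketched with one regular expression. The pattern below is modeled on the example log line shown in Step 2 and also tolerates the "in the namespace" wording; the names `DeniedRequest`, `parse_forbidden`, and `required_permissions` are hypothetical, and real log formats may differ enough to need adjustments.

```python
import re
from typing import NamedTuple, Optional


class DeniedRequest(NamedTuple):
    service_account: str  # "<namespace>:<name>"
    verb: str
    resource: str
    api_group: str        # "" means the core API group
    namespace: str        # "" when the message names no namespace


_FORBIDDEN_RE = re.compile(
    r'User "system:serviceaccount:(?P<sa_ns>[^:"]+):(?P<sa>[^"]+)" '
    r'cannot (?P<verb>\S+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)"'
    r'(?: in (?:the )?namespace "(?P<ns>[^"]+)")?'
)


def parse_forbidden(line: str) -> Optional[DeniedRequest]:
    """Return the denied request described by a log line, or None."""
    match = _FORBIDDEN_RE.search(line)
    if match is None:
        return None
    return DeniedRequest(
        service_account=f'{match["sa_ns"]}:{match["sa"]}',
        verb=match["verb"],
        resource=match["resource"],
        api_group=match["group"],
        namespace=match["ns"] or "",
    )


def required_permissions(log_lines):
    """Collect unique denied requests, preserving first-seen order."""
    seen = {}
    for line in log_lines:
        request = parse_forbidden(line)
        if request is not None:
            seen.setdefault(request, None)
    return list(seen)
```

Repeated errors collapse into a single entry, which matches the one-row-per-permission layout of the "FORBIDDEN Errors Found" table.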

Present to user:

## Required Permissions Analysis

**FORBIDDEN Errors Found:**
| Resource | Verb | API Group | Namespace |
|----------|------|-----------|-----------|
| pods | list | "" (core) | [namespace] |
| pods | get | "" (core) | [namespace] |
| [other resources from logs] | [verb] | [group] | [namespace] |

**Readiness Probe Requires:**
| Permission | Currently Granted? |
|------------|-------------------|
| list pods in [namespace] | NO — probe returns "no" |

**Application Function Requires:**
| Permission | Evidence |
|------------|----------|
| get pods in [namespace] | Container main loop calls `kubectl get pods` |
| [other] | [from log analysis] |

**Minimum Role Needed:**
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: <sa-name>-role
  namespace: <namespace>
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
```

Continue to check existing RoleBindings? (yes/no)

**WAIT for user confirmation before proceeding.**
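A minimal Role like the one in Step 3 can be assembled from the denied requests by grouping verbs per API group and resource. This is a sketch under assumptions: `build_minimal_role` is a hypothetical helper, its input is a list of `(api_group, resource, verb)` tuples, and it includes only verbs actually observed in the logs, so a verb such as `watch` would have to be added deliberately.

```python
def build_minimal_role(name, namespace, permissions):
    """Build a Role manifest (as a dict) from (api_group, resource, verb) tuples."""
    verbs_by_target = {}
    for api_group, resource, verb in permissions:
        verbs_by_target.setdefault((api_group, resource), set()).add(verb)

    rules = [
        {"apiGroups": [api_group], "resources": [resource], "verbs": sorted(verbs)}
        for (api_group, resource), verbs in sorted(verbs_by_target.items())
    ]
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": rules,
    }
```

The resulting dictionary can be serialized to YAML for the preview that must be shown to the user before anything is created.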

### Step 4: Analyze ServiceAccount and RoleBindings

**MCP Tool**: `resources_get` (from openshift)

**Parameters**:
- `kind`: "ServiceAccount" (resource type)
- `name`: "<sa-name>" (from Deployment `.spec.template.spec.serviceAccountName`, default: `default`)
- `namespace`: "<namespace>"

**MCP Tool**: `resources_list` (from openshift)

**Parameters**:
- `kind`: "RoleBinding" (resource type)
- `namespace`: "<namespace>"

Optionally, if permissions allow:

**MCP Tool**: `resources_list` (from openshift)

**Parameters**:
- `kind`: "ClusterRoleBinding" (cluster-scoped)

**Expected Output**: ServiceAccount details and all RoleBindings/ClusterRoleBindings, checked for whether any grant the required permissions to the target ServiceAccount.

**Error Handling**:
- If listing RoleBindings is forbidden: note the limitation, infer from FORBIDDEN errors
- If ServiceAccount not found: report as a finding — SA may need to be created
- If multiple bindings exist: check each for matching subjects and sufficient verbs
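Checking each binding for a matching subject can be sketched as a filter over the listed RoleBindings. The sketch assumes the `resources_list` output has been parsed into dictionaries shaped like RoleBinding objects (`subjects`, `roleRef`); the function name `bindings_for_service_account` is hypothetical. It matches only direct ServiceAccount subjects and does not cover grants made through groups such as `system:serviceaccounts`.

```python
def bindings_for_service_account(bindings, sa_name, sa_namespace):
    """Return (binding name, roleRef kind, roleRef name) for bindings naming the SA."""
    matches = []
    for binding in bindings:
        for subject in binding.get("subjects") or []:
            if (
                subject.get("kind") == "ServiceAccount"
                and subject.get("name") == sa_name
                and subject.get("namespace") == sa_namespace
            ):
                role_ref = binding.get("roleRef", {})
                matches.append(
                    (binding["metadata"]["name"], role_ref.get("kind"), role_ref.get("name"))
                )
                break  # one match per binding is enough
    return matches
```

An empty result means no binding names the ServiceAccount directly. For each match, the referenced Role or ClusterRole still has to be inspected to confirm it grants the required verbs.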

Present to user:

```markdown
## ServiceAccount & RoleBinding Analysis

**ServiceAccount:** [sa-name] (namespace: [namespace])
| Field | Value |
|-------|-------|
| Exists | Yes |
| Created | [timestamp] |
| Secrets | [count] |
| Image Pull Secrets | [count] |

**RoleBindings in [namespace]:**
| RoleBinding | Role | Subjects | Grants Access? |
|-------------|------|----------|----------------|
| [binding-1] | [role-name] | [sa-1, sa-2] | [Yes/No — wrong SA] |
| [binding-2] | [role-name] | [sa-name] | [Missing — binding not found] |

**ClusterRoleBindings (if accessible):**
| ClusterRoleBinding | ClusterRole | Subjects | Grants Access? |
|--------------------|-------------|----------|----------------|
| [binding] | [role] | [subjects] | [Yes/No] |

[If listing RoleBindings is forbidden:]
**Note:** Agent lacks permission to list RoleBindings directly. Absence of the required binding is inferred from the FORBIDDEN errors in pod logs.

**Assessment:**
[e.g., "No RoleBinding grants the metrics-collector ServiceAccount 'list pods' in demo-rbac. The binding was either never created, or was deleted."]

Continue to diagnosis summary? (yes/no)
```

**WAIT for user confirmation before proceeding.**

### Step 5: Present Diagnosis Summary

Synthesize all findings into a structured summary with actionable remediation options.

**Expected Output**: Root cause summary, causal chain, remediation commands, and regression warnings if applicable.

**Error Handling**:
- If insufficient data from earlier steps: note the gaps and recommend manual investigation
- If regression pattern detected (repeated remediation/deletion cycles): highlight it prominently

Present to user:

## RBAC Diagnosis Summary: [deployment-name]

### Root Cause

**Primary Issue:** [e.g., "Missing RoleBinding for ServiceAccount 'metrics-collector' — cannot list pods in namespace 'demo-rbac'"]

| Category | Status | Details |
|----------|--------|---------|
| Pod Running | OK | Pod is scheduled and container is running |
| Pod Ready | FAIL | Readiness probe fails — API access denied |
| ServiceAccount | EXISTS | [sa-name] in [namespace] |
| RoleBinding | MISSING | No binding grants required permissions |
| API Access | DENIED | 403 FORBIDDEN on [verbs] [resources] |

### Causal Chain (Five Whys)

1. **Signal**: Deployment [name] has 0 available replicas (MinimumReplicasUnavailable)
2. **Why?** Pod readiness probe (`kubectl auth can-i list pods`) returns "no"
3. **Why?** ServiceAccount [sa-name] lacks a RoleBinding granting `list` on `pods`
4. **Why?** The required Role/RoleBinding is absent or was deleted
5. **Root Cause**: [e.g., "Missing RBAC resources for this ServiceAccount — the binding was never created or was removed by a cleanup process/GitOps drift"]

### Recommended Actions

**Option A: Create the missing Role and RoleBinding (recommended)**

```bash
# Create the Role
oc create role <sa-name>-pod-reader \
  --verb=get,list,watch \
  --resource=pods \
  -n <namespace>

# Create the RoleBinding
oc create rolebinding <sa-name>-pod-reader-binding \
  --role=<sa-name>-pod-reader \
  --serviceaccount=<namespace>:<sa-name> \
  -n <namespace>
```

**Option B: Use an existing ClusterRole**

If a suitable ClusterRole already exists (e.g., `view`):

```bash
oc create rolebinding <sa-name>-view \
  --clusterrole=view \
  --serviceaccount=<namespace>:<sa-name> \
  -n <namespace>
```

⚠️ **Note:** The `view` ClusterRole grants read access to most resources in the namespace. Use a custom Role (Option A) for least-privilege access.

After applying the fix, verify:

```bash
# Check if the SA now has permission
oc auth can-i list pods -n <namespace> --as=system:serviceaccount:<namespace>:<sa-name>

# Check pod readiness
oc get pods -n <namespace> -l app=<app-label> -o wide
```
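The permission check above can also be scripted. The sketch below builds the same `oc auth can-i` command line and, optionally, runs it; the helper names `can_i_command` and `check_access` are hypothetical, and `check_access` assumes `oc` is installed and logged in to the target cluster.

```python
import subprocess


def can_i_command(verb, resource, namespace, sa_name):
    """Build the argument list for `oc auth can-i`, impersonating the ServiceAccount."""
    return [
        "oc", "auth", "can-i", verb, resource,
        "-n", namespace,
        f"--as=system:serviceaccount:{namespace}:{sa_name}",
    ]


def check_access(verb, resource, namespace, sa_name):
    """Run the check and return True only when the command prints 'yes'."""
    result = subprocess.run(
        can_i_command(verb, resource, namespace, sa_name),
        capture_output=True,
        text=True,
        check=False,  # a "no" answer should not raise
    )
    return result.stdout.strip() == "yes"
```

Passing the arguments as a list avoids shell interpolation of user-supplied names.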

### Regression Warning

[If regression detected from remediation history:]

⚠️ **Regression detected:** [N] prior remediation attempts applied the same fix but it was subsequently undone. Investigate whether a GitOps controller, security audit script, or namespace policy is removing the RoleBinding. Ensure the binding is added to the authoritative source of truth (Helm chart, Kustomize overlay, ArgoCD Application) rather than applied ad hoc.
