kubernetes-operator

Kubernetes cluster operator for the Marshian Galaxy home lab. Use when managing workloads, namespaces, node operations, Cilium networking, or kubectl/helm workflows on the k8s0–k8s4 cluster.

Install command

npx skills add https://github.com/icub3d/dotfiles --skill kubernetes-operator

Copy and paste this command into Claude Code to install the skill

Source

icub3d/dotfiles

Updated June 4, 2026 at 02:27

SKILL.md

name: kubernetes-operator
description: Kubernetes cluster operator for the Marshian Galaxy home lab. Use when managing workloads, namespaces, node operations, Cilium networking, or kubectl/helm workflows on the k8s0–k8s4 cluster.

Kubernetes Operator

Overview

This skill manages the Marshian Galaxy Kubernetes cluster — a multi-node home lab running on Alpine Linux VMs with Cilium for networking. It covers day-to-day kubectl operations, workload management, node health, and the custom Nushell helpers in nushell/bin/kubernetes.nu.

Cluster Topology

| Node | Role | OS |
| --- | --- | --- |
| `k8s0`, `k8s1`, `k8s2` | Control Plane + Worker | Alpine Linux |
| `k8s3`, `k8s4` | Worker | Alpine Linux |
| `srv2` | NFS / Minecraft services VM | Alpine Linux |
| `wireguard` | VPN gateway VM | Alpine Linux |
| `pihole` | DNS / ad-blocker VM | Debian |

Service VIP reachability is verified by requesting https://git.marsh.gg, which is exposed through the Cilium load balancer.

Core Capabilities

1. Node Operations

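Check cluster health: `kubectl get nodes -o wide`
Cordon before maintenance: `kubectl cordon <node>`
Drain: `kubectl drain <node> --ignore-daemonsets --delete-emptydir-data --force` (note that `--force` also removes pods that are not managed by a controller, and `--delete-emptydir-data` discards emptyDir contents)
Uncordon after maintenance: `kubectl uncordon <node>`
For reboots of the full cluster, delegate to the galaxy-rebooter skill.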
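The cordon, drain, and uncordon steps above can be chained into one routine. The following is a minimal sketch, not the repository's own tooling: the function name `with_node_drained` and the `KUBECTL` override variable are hypothetical, introduced here only so the flow can be dry-run against a stub.

```shell
# Assumed override hook: point KUBECTL at a stub to dry-run the flow.
KUBECTL="${KUBECTL:-kubectl}"

# with_node_drained <node> <command...>   (hypothetical helper)
# Cordons and drains the node, runs the maintenance command, then uncordons.
with_node_drained() {
  local node="$1"
  shift
  "$KUBECTL" cordon "$node" || return 1
  "$KUBECTL" drain "$node" --ignore-daemonsets --delete-emptydir-data --force || return 1
  # If maintenance fails, return its status and leave the node cordoned
  # so it can be inspected before workloads are scheduled on it again.
  "$@" || return $?
  "$KUBECTL" uncordon "$node"
}
```

A call might look like `with_node_drained k8s3 ssh k8s3 'apk upgrade'`, where the maintenance command is only an illustration. Leaving the node cordoned on failure is a design choice of this sketch; uncordoning unconditionally is equally defensible.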

2. Workload Management

Deploy/update: `kubectl apply -f <manifest>` or `helm upgrade --install <release> <chart> -n <ns>`
Rollout status: `kubectl rollout status deployment/<name> -n <ns>`
Rollback: `kubectl rollout undo deployment/<name> -n <ns>`
Logs: `kubectl logs -n <ns> <pod> --tail=100 -f`
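The apply, rollout status, and rollback commands above combine naturally into a deploy step that undoes itself when the rollout does not complete. This is a sketch under assumptions: `apply_with_rollback`, the `KUBECTL` override, and the 120s default timeout are hypothetical choices, not part of the skill.

```shell
# Assumed override hook: point KUBECTL at a stub to dry-run the flow.
KUBECTL="${KUBECTL:-kubectl}"

# apply_with_rollback <manifest> <deployment> <namespace> [timeout]
# Hypothetical helper: applies a manifest, waits for the rollout, and
# rolls the deployment back if the rollout does not finish in time.
apply_with_rollback() {
  local manifest="$1"
  local name="$2"
  local ns="$3"
  local timeout="${4:-120s}"
  "$KUBECTL" apply -f "$manifest" -n "$ns" || return 1
  if "$KUBECTL" rollout status "deployment/$name" -n "$ns" --timeout="$timeout"; then
    return 0
  fi
  echo "rollout of deployment/$name failed; rolling back" >&2
  "$KUBECTL" rollout undo "deployment/$name" -n "$ns"
  return 1
}
```

The function returns non-zero whenever a rollback was triggered, so a calling script can stop instead of continuing with a reverted deployment.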

3. Nushell Kubernetes Helpers

The file `nushell/bin/kubernetes.nu` contains custom commands. When editing:

Follow Nushell typing conventions (see nushell-orchestrator skill).
Keep outputs as structured tables so they pipe naturally into `where`, `sort-by`, etc.
Avoid shelling out to `jq`; use Nushell's `from json` instead.

4. Cilium Networking

Cilium is the CNI. Use `cilium status` and `cilium connectivity test` to diagnose network issues.
The cluster VIP for external services resolves via Cilium's LB; verify with `curl -s https://git.marsh.gg`.
When adding new LoadBalancer services, ensure `externalTrafficPolicy` and the IP pool annotations match the existing pattern.
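The `curl` check above can be turned into a pass/fail probe by looking at the HTTP status code rather than the response body. The sketch below assumes that any 2xx or 3xx response means the VIP is reachable; the function name `check_vip`, the `CURL` override, and the 10-second limit are hypothetical.

```shell
# Assumed override hook: point CURL at a stub to test without network access.
CURL="${CURL:-curl}"

# check_vip [url]   (hypothetical helper)
# Prints "ok <code>" for a 2xx/3xx response, "unreachable <code>" otherwise.
check_vip() {
  local url="${1:-https://git.marsh.gg}"
  local code
  # -w '%{http_code}' prints only the status code; the body is discarded.
  code="$("$CURL" -s -o /dev/null -w '%{http_code}' --max-time 10 "$url")" || code="000"
  case "$code" in
    2??|3??) echo "ok $code"; return 0 ;;
    *) echo "unreachable $code"; return 1 ;;
  esac
}
```

A failing probe only says the VIP did not answer; `cilium status` remains the place to find out why.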

5. Control Plane API Transience

During control plane node reboots, the Kubernetes API becomes temporarily unavailable. Any script that polls `kubectl get node` must:

Wrap the call in `try { ... } catch { ... }` (Nushell) or equivalent retry logic.
Wait for SSH to return before attempting kubectl commands against that node.
Expect transient auth/refused/forbidden errors and treat them as retry triggers, not hard failures.
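For scripts written in plain shell rather than Nushell, the "equivalent retry logic" mentioned above might look like the following. This is a minimal sketch: the `retry` function and its argument order are hypothetical, and it treats every non-zero exit as transient, which matches the guidance above but would also retry genuine failures until the attempt limit is reached.

```shell
# retry <max_attempts> <delay_seconds> <command...>   (hypothetical helper)
# Re-runs the command until it succeeds or the attempt limit is reached.
retry() {
  local max="$1"
  local delay="$2"
  shift 2
  local attempt=1
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

A typical call would be `retry 30 10 kubectl get node k8s0`, which polls for up to 30 attempts with 10 seconds between them. The same wrapper can gate on SSH first, for example `retry 30 10 ssh k8s0 true`, before any kubectl command is issued.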

6. Common Troubleshooting

Pod stuck Pending: check `kubectl describe pod <name> -n <ns>` for resource, affinity, or taint issues.
Node NotReady: check `kubectl describe node <name>`, then SSH in and check the kubelet service state (`rc-service kubelet status`) and its logs.
DNS failures: check that pihole is up, then test in-cluster resolution with `kubectl exec -it <pod> -- nslookup kubernetes.default`.
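When triaging NotReady nodes, a quick first step is to list only the nodes whose status is not `Ready`. The filter below is a sketch that reads the output of `kubectl get nodes --no-headers` on stdin and assumes the default column layout (name first, status second); the function name `not_ready_nodes` is hypothetical.

```shell
# not_ready_nodes   (hypothetical helper)
# Reads `kubectl get nodes --no-headers` output on stdin and prints the names
# of nodes whose STATUS column is neither "Ready" nor "Ready,<something>".
not_ready_nodes() {
  awk '$2 != "Ready" && $2 !~ /^Ready,/ { print $1 }'
}
```

Usage: `kubectl get nodes --no-headers | not_ready_nodes`. A cordoned but healthy node reports `Ready,SchedulingDisabled` and is deliberately not listed, since it does not need kubelet troubleshooting.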

Examples

"Show me all pods across all namespaces that aren't running."
"Add a new namespace for my monitoring stack and deploy a Helm chart into it."
"One of the workers is NotReady — help me diagnose."
"Update the kubernetes.nu helper to show resource requests alongside pod status."
