youtube-thumbnail-generate

Name: Youtube Thumbnail Generate
Author: naveedharri

Generate on-brand YouTube thumbnails for Ben van Sprundel using Higgsfield in one shot. Use when the user says create a thumbnail, make a YT thumbnail, thumbnail for video, generate ben thumbnail, variation of last thumbnail, or shares a video concept and asks for a thumbnail. Auto-infers mode from the inputs (variation, new-with-ben, ben-plus-other, no-face), defaults to 3 variants, and asks at most one question. Uses reference images as the identity anchor (a current photo of Ben from refs/, plus optional past thumbnails or style anchors). No Soul training is required. Reads the locked style spec at Context/youtube-thumbnail-style.md if present. Real logos are never rendered. The thumbnail fills its frame edge-to-edge; the user composites the actual logo on top in post.


stars:28

forks:14

updated:May 14, 2026 at 13:06



package.json

"author": "naveedharri"

"repository": "naveedharri/benai-skills"



YouTube Thumbnail Generate

One-shot thumbnail generation. Takes a concept (plus optional reference image and count), infers everything else, ships 3 variants and a manifest.

Inputs

Two things, ideally both in the user's first message:

Concept: what the thumbnail should show. A sentence or phrase. ("Claude Code Skills, why it changes everything for solo founders")
Reference image(s) (optional but recommended, may be multiple): a past thumbnail, a photo of Ben, a second subject, a style anchor, a real logo PNG, or any combination. The references determine the mode, and ALL of them must be passed into the generation (up to nano_banana_2's 4-ref limit). Never silently drop a user-supplied reference; if they gave you one, it has to end up in medias[]. If they gave you a LIST, read EVERY image in the list, then pass every relevant one to medias[] (don't pre-select just one). Every reference image MUST also be visually read (via the Read tool on the file path) before the prompt is built, so the prompt captures each reference's actual style, texture, palette, composition, and recurring motifs rather than relying only on the style spec defaults.

If the user mentions a reference image but does not provide a path or attachment, ASK for the path BEFORE doing anything else. Examples that require asking: "use my previous thumbnail as a ref" (which file?), "include the Anthropic logo" (where is the PNG?), "match this style" (which image?). Ask in one short line: "Got it, what's the file path for the reference image?" Do not guess, do not proceed, do not generate without seeing the path.

Optional third: variant count. Defaults to 3. Max 4.

If the concept is missing or genuinely unclear, ask ONE combined question:

"What should the thumbnail show, and how many variations do you want? (default 3)"

Do not split into multiple questions. Do not ask about mode, model, or palette; all of that is inferred or read from the style spec.

Identity Anchor (no Soul required)

Every new-with-ben thumbnail uses a reference photo of Ben as the identity anchor, passed as medias[0]. The photo lives at Projects/youtube/thumbnails/refs/ben_reference_{YYYY-QQ}.jpg.

If the user did NOT attach a photo of Ben and there is no ben_reference_*.jpg in refs/, ask once for a photo. Without it, new-with-ben cannot produce a faithful Ben rendering.

If multiple ben_reference_*.jpg files exist, pick the most recent (highest YYYY-QQ suffix). Note which one was used in the manifest.
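
The "most recent reference" rule can be sketched as a small filename sort. This is a minimal sketch, not the skill's actual code: the function name is hypothetical, and the exact rendering of the `{YYYY-QQ}` suffix is an assumption (the pattern below accepts `2026-Q2`, `2026-02`, or `2026-2`).

```python
import re
from typing import Optional

# Assumption: the {YYYY-QQ} suffix is a year plus a quarter number 1-4,
# written as 2026-Q2, 2026-02, or 2026-2. The skill does not spell this out.
_REF_PATTERN = re.compile(r"^ben_reference_(\d{4})-Q?0?([1-4])\.jpg$")


def latest_ben_reference(filenames: list[str]) -> Optional[str]:
    """Return the reference photo with the highest (year, quarter), or None."""
    best_name: Optional[str] = None
    best_key: Optional[tuple[int, int]] = None
    for name in filenames:
        match = _REF_PATTERN.match(name)
        if match is None:
            continue  # ignore files that are not Ben reference photos
        key = (int(match.group(1)), int(match.group(2)))
        if best_key is None or key > best_key:
            best_name, best_key = name, key
    return best_name
```

A `None` result corresponds to the "ask once for a photo" branch; a non-`None` result is the filename to record in the manifest.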

Mode Auto-Inference

| User attached | Concept hints | Mode | Model | Reference flow |
| --- | --- | --- | --- | --- |
| 16:9 image that looks like a past thumbnail | "vary this", "tweak this", "redo with X" | `variation` | `nano_banana_2` | past thumbnail as `medias[0]` |
| A portrait of Ben | Ben centered in the concept | `new-with-ben` | `nano_banana_2` | user-supplied photo as `medias[0]` |
| Two images (Ben + something) | "Ben plus X" | `ben-plus-other` | `nano_banana_2` (multi-ref) | Ben as `medias[0]`, second subject as `medias[1]` |
| No image | Object or abstract concept, no Ben | `no-face` | `nano_banana_2` (or `gpt_image_2` if concept centers on rendered text) | optional style anchor as `medias[0]` |
| No image | Concept mentions Ben | `new-with-ben` | `nano_banana_2` | most recent `ben_reference_*.jpg` from `refs/` as `medias[0]`. If none, ask. |

nano_banana_2 is the default model for every mode. Switch to gpt_image_2 only when text rendering is the hero element of a no-face thumbnail.
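
The inference rules above can be read as a short decision function. The sketch below is illustrative only: the attachment labels (`past_thumbnail`, `ben_portrait`, `other`) and the boolean flags are hypothetical stand-ins for whatever the agent concludes after looking at the message and images.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    mode: str
    model: str


def infer_mode(
    attached: list[str],
    mentions_ben: bool,
    wants_variation: bool = False,
    text_is_hero: bool = False,
) -> Plan:
    """Map attachments and concept hints to a mode and model.

    `attached` holds hypothetical labels, one per attached image:
    "past_thumbnail", "ben_portrait", or "other".
    """
    default_model = "nano_banana_2"
    if "past_thumbnail" in attached and wants_variation:
        return Plan("variation", default_model)
    if "ben_portrait" in attached and len(attached) >= 2:
        return Plan("ben-plus-other", default_model)
    if "ben_portrait" in attached or mentions_ben:
        return Plan("new-with-ben", default_model)
    # No Ben anywhere: gpt_image_2 only when rendered text is the hero element.
    return Plan("no-face", "gpt_image_2" if text_is_hero else default_model)
```

Note that `gpt_image_2` is reachable only from the no-face branch, which matches the rule that every other mode stays on the default model.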

Style Spec Handling (silent)

Read Context/youtube-thumbnail-style.md if it exists. Pull:

Palette, framing library, expression library, lighting, prohibited list, anchor refs

If the spec is missing or has [FILL] blocks: don't refuse. Fall back to the locked Ben AI thumbnail visual language (see references/visual-language.md for the full catalog). Built-in defaults:

Palette: deep charcoal #1F1F1F background with subtle dot-grid texture, signature coral #E97B5D accent on folders / app icons / asterisk marks, white #FFFFFF for primary text and hand-drawn arrows, near-black #0A0A0A for text on light backgrounds
Layout: Ben on the right third, text and supporting visuals on the left third, one hand-drawn white curved arrow from text toward the visual
Framing: chest-up, Ben on right third
Expression: slight smile (default), focused neutral for analytical topics
Wardrobe: plain black t-shirt or hoodie
Lighting: soft front-left key, gentle rim from behind, no hard shadows
Banned colors: navy blue, royal blue, sky blue, pure red, magenta, purple (except Obsidian purple for Obsidian topics), green, neon variants
Prohibited: real logos rendered by the model, empty rectangles or reserved gaps in the composition, centered composition when Ben is in frame, multiple arrows, cartoon/illustrated rendering of Ben, em dashes
Logos: never reserved as a gap in the render. The thumbnail fills edge-to-edge. Logos are composited on top in Figma or Canva on the winner.

Surface a one-line note at the end (not before generation): "Style spec missing or incomplete; used the locked Ben AI thumbnail visual language." The locked defaults above ARE the source of truth; no separate setup skill is required.
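
The fallback behaviour can be sketched as follows. This is a minimal sketch under assumptions: the function and constant names are hypothetical, the spec is passed in as already-read text (or `None` when the file is absent), and only the four hex values listed above are encoded.

```python
from typing import Optional

# Hex values copied from the built-in defaults listed above.
LOCKED_PALETTE = {
    "background": "#1F1F1F",
    "accent": "#E97B5D",
    "primary_text": "#FFFFFF",
    "text_on_light": "#0A0A0A",
}

FALLBACK_NOTE = (
    "Style spec missing or incomplete; "
    "used the locked Ben AI thumbnail visual language."
)


def choose_style_source(spec_text: Optional[str]) -> tuple[str, Optional[str]]:
    """Return (source, end_of_run_note).

    source is "spec" when the spec exists and has no [FILL] blocks,
    otherwise "locked-defaults" together with the note to show at the end.
    """
    if spec_text is None or not spec_text.strip() or "[FILL]" in spec_text:
        return "locked-defaults", FALLBACK_NOTE
    return "spec", None
```

The note is returned rather than printed so the caller can hold it until after generation, as the section requires.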

UX Rules

One question max per run, and only if a required input is missing. Default everything else.
No raw IDs in chat. Save job_ids to manifest. Show the user file paths and a one-line summary.
No internal jargon. Don't narrate "inferring mode...", "loading style spec...", "calling generate_image...". Just do it.
Detect language and respond in it. Technical args (hex codes, model names) stay English.
Don't preview the prompt unless the user asks. The 4-block prompt is internal.
Don't suggest mode switches unless generation fails. Trust the inference.

Flow

The whole loop is one chat turn. No intermediate confirmations.

1. Parse user message: extract concept, attached images, count.
2. Infer mode. For new-with-ben without an attached photo, pull the most recent ben_reference_*.jpg from refs/. If missing, ask once.
3. READ every reference image visually using the Read tool on each file path. This is MANDATORY — do not skip even when you think you already know what the image looks like. Read each one. If a path was mentioned but not provided, STOP here and ask for it. Extract per ref: dominant colors with rough hex equivalents, composition pattern, lighting mood, render style (photoreal vs flat-stylized), texture (grain, dot-grid, smooth), recurring motifs visible. State observations briefly in chat (one sentence per ref) so the user can verify you actually read them. Then identify the SHARED style signals across all refs ("all 5 use the dot-grid background and coral folder; 3 use a hand-drawn arrow") — these shared signals are the strongest brand cues to lock into every variant. Pass every relevant ref to medias[] (up to nano_banana_2's 4-ref limit; if more, pick the 4 most representative and tell the user).
4. Load style spec (or fall back to locked defaults).
5. IDEATE N distinct creative angles for the concept (one per requested variant). Each angle should have a different headline, different supporting visual, and a different framing of why-this-topic-matters. Brainstorm in chat in a compact one-line-per-variant list, then proceed. Do NOT generate N near-identical variants of one composition; that's wasted credits.
6. Build N prompts internally (4-block template from references/prompt-builder.md per variant). Across all N prompts, blocks 2 (Subject — Ben's face, expression, framing, wardrobe, camera angle) and 4 (Negatives) stay IDENTICAL. Only blocks 1 (Scene) and 3 (Style — the supporting visual and any motif specifics) change between variants. Fold the visual observations from step 3 into every variant's blocks. Never surface prompts unless asked.
7. Generate N times: one `generate_image` call per variant with `count: 1`, `aspect_ratio: "16:9"`, `resolution: "2k"`. EVERY call passes the same Ben reference photo (or the same primary ref) as `medias[0]` so the face stays locked. Sequential calls so each lands in the manifest in order.
8. Save outputs to Projects/Youtube/thumbnails/generated/{YYYY-MM-DD}-{topic-slug}/ with manifest.md. Manifest lists each variant's distinct angle in plain English.
9. Deliver paths and the one-line logo-composite reminder.
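
Step 3's cap on references can be sketched as a stable priority sort. The `kind` labels are hypothetical, and this ordering only approximates "most representative": it puts the Ben photo first, then style refs, then logos, and keeps the user's original order within each group.

```python
from dataclasses import dataclass

MAX_REFS = 4  # nano_banana_2's reference limit, as stated in step 3


@dataclass(frozen=True)
class Ref:
    path: str
    kind: str  # hypothetical labels: "ben", "style", "logo"


_PRIORITY = {"ben": 0, "style": 1, "logo": 2}


def order_and_cap(refs: list[Ref]) -> tuple[list[Ref], list[Ref]]:
    """Return (used, dropped): used becomes medias[], strongest anchor first."""
    # sorted() is stable, so refs of the same kind keep the user's order.
    ranked = sorted(refs, key=lambda r: _PRIORITY.get(r.kind, len(_PRIORITY)))
    return ranked[:MAX_REFS], ranked[MAX_REFS:]
```

A non-empty `dropped` list is the cue to tell the user, in one line, which references were used and why.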

Cost Preflight

Only preflight when:

count > 3, OR
model is gpt_image_2 (highest per-variant cost), OR
estimate exceeds 50 credits

Otherwise generate directly. When preflighting, first run with params.get_cost: true, show the credit cost, and generate only after the user confirms.
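
The three preflight conditions reduce to a single predicate. A minimal sketch, assuming a credit estimate may or may not be known before the call; the function and parameter names are hypothetical.

```python
from typing import Optional


def needs_preflight(
    count: int,
    model: str,
    estimated_credits: Optional[float] = None,
) -> bool:
    """True when the cost should be shown and confirmed before generating."""
    if count > 3:
        return True
    if model == "gpt_image_2":  # highest per-variant cost
        return True
    # The estimate is optional; an unknown estimate does not trigger preflight.
    return estimated_credits is not None and estimated_credits > 50
```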

Output

After saving, deliver in this shape:

3 thumbnails ready for "claude-code-skills":
- Projects/Youtube/thumbnails/generated/2026-05-14-claude-code-skills/v1.png
- Projects/Youtube/thumbnails/generated/2026-05-14-claude-code-skills/v2.png
- Projects/Youtube/thumbnails/generated/2026-05-14-claude-code-skills/v3.png

Pick the winner, composite the logo from Projects/Youtube/thumbnails/logos/ in Figma or Canva.

No mode label, no model name, no credit count unless the user asks.
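
The output paths in the example above follow a date-plus-slug pattern, which can be sketched like this. The slug rule (lowercase, runs of non-alphanumerics collapsed to one hyphen) is an assumption inferred from the `claude-code-skills` example, and the helper names are hypothetical.

```python
import re
from datetime import date

GENERATED_ROOT = "Projects/Youtube/thumbnails/generated"


def topic_slug(topic: str) -> str:
    """Lowercase the topic and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")


def batch_dir(topic: str, day: date) -> str:
    """Folder for one batch: {YYYY-MM-DD}-{topic-slug} under the root."""
    return f"{GENERATED_ROOT}/{day.isoformat()}-{topic_slug(topic)}"


def variant_paths(topic: str, day: date, count: int) -> list[str]:
    """File paths v1.png .. vN.png inside the batch folder."""
    folder = batch_dir(topic, day)
    return [f"{folder}/v{i}.png" for i in range(1, count + 1)]
```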

Manifest (saved silently)

Every batch writes manifest.md in the output folder with:

---
type: thumbnail-batch
date: {YYYY-MM-DD}
topic: {topic}
mode: {inferred mode}
model: {model_id used}
ben_reference: {filename of ben_reference_*.jpg used, or null if no-face}
variants: {count}
tags: [thumbnail, youtube, {topic-slug}]
status: candidates
---

## Concept Angles (one per variant)

- v1: {headline} - {one-line description of the supporting visual and hook}
- v2: {headline} - {one-line description}
- v3: {headline} — {one-line description}
- ...

## Locked Subject Block (identical across all variants)
{block 2 wording — Ben's face, expression, framing, wardrobe, camera angle}

## Variant Prompts

### v1
{full 4-block prompt sent to Higgsfield for v1}

### v2
{full 4-block prompt sent to Higgsfield for v2}

...

## References
- {description + media_id or job_id of each ref passed to medias[]}

## Job IDs
- v1: {job_id}
- v2: {job_id}
- ...

## Notes
{anything worth remembering for future variations}

This is non-negotiable. Future "vary v2" calls depend on stored job_ids.

Variation Shortcut

If the user says "vary v2 of the last one" or "redo number 3":

1. Find the most recent manifest.md in generated/.
2. Read the matching job_id.
3. Pass that job_id as medias[0] to nano_banana_2.
4. Build a short edit prompt from the user's change description.
5. Save to a new dated folder with its own manifest.

No extra questions; the past run is the reference.
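
The lookup half of the shortcut can be sketched as two small helpers. This is an illustration under assumptions: batch folders are named `{YYYY-MM-DD}-{topic-slug}` (so the lexicographically largest name is the newest), the manifest keeps the `## Job IDs` section shown earlier, and the function names are hypothetical.

```python
import re
from typing import Optional

_DATED_FOLDER = re.compile(r"^\d{4}-\d{2}-\d{2}-")
_JOB_LINE = re.compile(r"^-\s*v(\d+):\s*(\S+)")


def latest_batch(folder_names: list[str]) -> Optional[str]:
    """Newest batch folder; ISO dates sort correctly as plain strings."""
    dated = [name for name in folder_names if _DATED_FOLDER.match(name)]
    return max(dated) if dated else None


def job_id_for_variant(manifest_text: str, variant: int) -> Optional[str]:
    """Find the job_id for vN inside the '## Job IDs' section only."""
    in_job_section = False
    for raw_line in manifest_text.splitlines():
        line = raw_line.strip()
        if line.startswith("## "):
            in_job_section = line == "## Job IDs"
            continue
        if in_job_section:
            match = _JOB_LINE.match(line)
            if match and int(match.group(1)) == variant:
                return match.group(2)
    return None
```

Restricting the search to the `## Job IDs` section matters because the `## Concept Angles` section uses the same `- v1: ...` line shape.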

Core Rules

Non-negotiable. Numbered for cross-reference.

1. Prefer real logos passed as reference images. Never let the model hallucinate a brand mark from text alone, and never instruct it to leave an empty rectangle for a logo. When a topic involves a brand mark or logo: first check Projects/youtube/thumbnails/logos/ for a matching real logo PNG. If one exists, pass it as a medias[] entry and instruct the prompt to render that exact mark from the reference. If no logo reference is available, describe the element generically in the prompt (don't name the brand) and tell the user they can composite the real logo in post. Hallucinated brand marks (model inventing the Anthropic asterisk, Claude wordmark, OpenAI logo from text alone) stay banned; real PNG references are encouraged.
2. Reference images are first-class inputs. The order is strict: READ then UNDERSTAND then BUILD then PASS. This is non-negotiable; the skill has been caught skipping it. When the user provides one or more references: (a) READ each image visually with the Read tool; this is MANDATORY, not optional, even when you "know what it probably looks like." If you don't read it, you don't see it; if you don't see it, the prompt is wrong. (b) UNDERSTAND it: extract dominant colors with rough hex, composition pattern, lighting mood, render style, texture, recurring motifs, AND identify any specific brand marks or icons present. State the observations in chat in one short line per ref so the user can verify you actually read them. (c) BUILD the prompt with those observations folded into blocks 1, 2, and 3. (d) PASS every relevant reference into medias[] of the generate_image call. Don't skip steps; don't generate without reading first. A user-supplied reference is a stronger signal than the style spec; let what you see in the reference override defaults when they conflict. If the user mentions a reference image but doesn't provide a path or attachment, STOP and ask for the path before proceeding. Do not guess, do not proceed without it.
3. If MULTIPLE references are provided, read ALL of them and pass every relevant one to medias[]. Never pre-select just one when more were given. Order matters: strongest anchor first (medias[0]), then supporting refs. If the user gave more references than nano_banana_2 accepts (4 max), pick the 4 most representative (Ben photo first if present, then refs covering distinct style cues, then any logo PNGs) and tell the user in one line which were used and why.
4. For new-with-ben mode, always pass a current photo of Ben as the identity anchor. If the user didn't attach one, pull the most recent ben_reference_*.jpg from refs/. If none exists, ask once before generating.
5. Across all variations of a single batch, Ben's face / expression / framing / wardrobe / camera angle MUST stay identical. Lock block 2 (Subject) of the prompt; reuse the same wording for every variant. The reference photo for Ben is passed as medias[0] to every call so the identity stays consistent. Variation lives in the SCENE and SUPPORTING VISUALS only.
6. For batches with count > 1, brainstorm N distinct creative angles before generating. Each variant should be a different conceptual hook (different headline, different metaphor, different supporting visual), not a reshuffle of the same composition. Make N separate generate_image calls with count: 1 rather than one call with count: N; the latter produces near-duplicates and wastes credits. See references/variation-ideation.md for the ideation pattern and a worked example.
7. Keep on-screen text minimal. 2 to 4 words per line, 2 lines maximum. The visual carries the message; text is the punchline, not the explanation. If a concept needs more than 2 lines of text, it's the wrong concept for a thumbnail; tighten the hook. Long text also renders worse: fewer words mean cleaner kerning, sharper letterforms, no garbled spillover.
8. Default model is nano_banana_2 for every mode. Only switch to gpt_image_2 when text rendering is the hero element. Never invent a model ID; stick to references/models.md.
9. Always save the manifest with each variant's distinct creative angle in plain English. Future runs depend on it.
10. Never use em dashes. Per CLAUDE.md project rule.
11. Aspect ratio is always 16:9. No exceptions.
12. Ask at most one question per run. Default everything else.
13. Fall back gracefully when style spec is incomplete. Don't refuse; use locked visual-language defaults and flag at the end.
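
The on-screen text limits and the em-dash ban are mechanical enough to check before a prompt is sent. A minimal sketch with a hypothetical function name; it counts whitespace-separated words and treats each non-blank line as one line of overlay text.

```python
MAX_LINES = 2
MIN_WORDS, MAX_WORDS = 2, 4
EM_DASH = "\u2014"


def check_overlay_text(text: str) -> list[str]:
    """Return a list of rule violations; an empty list means the text passes."""
    problems: list[str] = []
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) > MAX_LINES:
        problems.append(f"{len(lines)} lines; maximum is {MAX_LINES}")
    for number, line in enumerate(lines, start=1):
        words = len(line.split())
        if not MIN_WORDS <= words <= MAX_WORDS:
            problems.append(
                f"line {number} has {words} words; expected {MIN_WORDS} to {MAX_WORDS}"
            )
    if EM_DASH in text:
        problems.append("contains an em dash")
    return problems
```

A failing check is a signal to tighten the hook, not to shrink the font.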

When Things Break

See references/troubleshooting.md for the full catalog. The skill should fail fast and surface a one-line cause to the user, not loop or retry silently.

Common triage:

nsfw or ip_detected rejection → remove brand names or public figures from the concept; retry once.
Model rendered a fake logo or empty rectangle → tighten negatives (no real or rendered logos of any kind, no empty rectangles or reserved gaps); retry.
Ben looks off → confirm the latest ben_reference_*.jpg is in refs/, is recent (under 90 days), and matches the wardrobe rules. Pass it as medias[0]. If still off, ask the user for a better photo.
Style drift → compare to anchor refs in Projects/Youtube/thumbnails/refs/; re-read the style spec.
User-provided reference was ignored → confirm the ref is in medias[] of the call. If the model dropped it, increase its prominence in the prompt (use the provided reference as the visual anchor) and retry.

Progressive Updates

When the user corrects something during a run ("never put me in a suit", "always lean yellow on tutorial topics"), append a dated entry to references/skill-rules.md. After 3 confirmations of the same rule, promote it into Core Rules above.
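
The promotion threshold can be sketched as a simple tally. The entry format of references/skill-rules.md is not specified, so this sketch assumes each confirmation has already been extracted as a plain rule string; the function name and the case-insensitive matching are assumptions.

```python
from collections import Counter

PROMOTE_AFTER = 3  # confirmations needed before a rule moves into Core Rules


def rules_to_promote(confirmations: list[str]) -> list[str]:
    """Return normalized rules confirmed at least PROMOTE_AFTER times.

    Rules are compared case-insensitively after trimming whitespace;
    results keep the order in which each rule was first seen.
    """
    counts = Counter(rule.strip().lower() for rule in confirmations)
    return [rule for rule, seen in counts.items() if seen >= PROMOTE_AFTER]
```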
