capy-video-gen-skill

Multi-shot AI video generation pipeline with face identity consistency. Converts scripts or ideas into complete videos using character extraction, storyboarding, frame generation, and video assembly. 300 experiments validated, 70% face distance improvement. Use when the user asks to create a video from a script, story, idea, or wants multi-shot video with consistent characters.

Install command

npx skills add https://github.com/happycapy-ai/Happycapy-skills --skill capy-video-gen-skill

Copy and paste this command into Claude Code to install the skill.

Source

happycapy-ai/Happycapy-skills

Stars: 123

Forks: 23

Last updated: March 20, 2026 at 12:55

SKILL.md

name	capy-video-gen-skill
description	Multi-shot AI video generation pipeline with face identity consistency. Converts scripts or ideas into complete videos using character extraction, storyboarding, frame generation, and video assembly. 300 experiments validated, 70% face distance improvement. Use when the user asks to create a video from a script, story, idea, or wants multi-shot video with consistent characters.
allowed-tools	Bash, Read, Write, Edit

Capy Video Gen Skill - Script-to-Video Pipeline

Generate complete multi-shot videos from scripts or ideas with consistent character faces across all scenes. Built for HappyCapy AI Gateway. 300 experiments validated, 70% face distance improvement.

Overview

ViMax, the pipeline this skill runs, converts a text script into a full video in eight automated steps:

1. Extract characters from the script, with detailed physical features
2. Generate front/side/back character portraits
3. Design a shot-by-shot storyboard
4. Decompose each shot into first_frame, last_frame, and motion descriptions
5. Build a camera tree for shot relationships
6. Generate frames with reference image selection (face identity as top priority)
7. Generate video clips from the frames
8. Concatenate the clips into the final video

Installation Location

The ViMax pipeline code is at: /home/node/a0/workspace/527fb591-1439-4b5b-ad5d-90f972773f95/workspace/tmp/ViMax/

Run every command from this directory, using the venv interpreter at .venv/bin/python:

cd /home/node/a0/workspace/527fb591-1439-4b5b-ad5d-90f972773f95/workspace/tmp/ViMax

Prerequisites

AI_GATEWAY_API_KEY environment variable (auto-configured in HappyCapy)
Python venv at .venv/ (already set up)
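
Both prerequisites can be checked with a short preflight script before a run. This is a minimal sketch and not part of ViMax: the function name check_prerequisites is hypothetical, and it only tests that the variable is non-empty and that the interpreter file exists at .venv/bin/python.

```python
import os
from pathlib import Path


def check_prerequisites(root, env=None):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper, not part of ViMax.
    """
    env = os.environ if env is None else env
    problems = []
    # The gateway key must be present and non-empty.
    if not env.get("AI_GATEWAY_API_KEY"):
        problems.append("AI_GATEWAY_API_KEY is not set")
    # The commands in this document call the interpreter at .venv/bin/python.
    python = Path(root) / ".venv" / "bin" / "python"
    if not python.is_file():
        problems.append(f"missing venv interpreter: {python}")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites(Path.cwd()):
        print("PROBLEM:", problem)
```

Run it from the ViMax directory; no output means both checks passed.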

Quick Start

Script-to-Video

Edit the script, requirements, and style in the entry script (main_happycapy_script2video.py), then run:

cd /home/node/a0/workspace/527fb591-1439-4b5b-ad5d-90f972773f95/workspace/tmp/ViMax
.venv/bin/python main_happycapy_script2video.py

Idea-to-Video

To generate a video from a brief idea (the pipeline writes the script first), run:

cd /home/node/a0/workspace/527fb591-1439-4b5b-ad5d-90f972773f95/workspace/tmp/ViMax
.venv/bin/python main_happycapy_idea2video.py

Programmatic Usage

import asyncio
from langchain.chat_models import init_chat_model
from tools.render_backend import RenderBackend
from utils.config_loader import load_config
from pipelines.script2video_pipeline import Script2VideoPipeline

# Load the HappyCapy config, then build the chat model and generation backends from it
config = load_config("configs/happycapy_script2video.yaml")
chat_model = init_chat_model(**config["chat_model"]["init_args"])
backend = RenderBackend.from_config(config)

# Wire the models into the pipeline; outputs are written under working_dir
pipeline = Script2VideoPipeline(
    chat_model=chat_model,
    image_generator=backend.image_generator,
    video_generator=backend.video_generator,
    working_dir=config["working_dir"],
)

# Calling the pipeline returns a coroutine, so drive it with asyncio.run
asyncio.run(pipeline(
    script="Your script here...",
    user_requirement="No more than 8 shots total.",
    style="Cinematic, warm lighting",
))

Pipelines

Script2VideoPipeline

Input: A formatted screenplay/script with character dialogue and scene descriptions
Output: Concatenated video at {working_dir}/final_video.mp4
Config: configs/happycapy_script2video.yaml

Idea2VideoPipeline

Input: A brief idea/concept (1-3 paragraphs)
Output: Auto-generates a script, then produces video
Config: configs/happycapy_idea2video.yaml

Configuration

The HappyCapy config for the script-to-video pipeline is at configs/happycapy_script2video.yaml:

chat_model:
  init_args:
    model: gpt-4.1
    model_provider: openai
    api_key: ${AI_GATEWAY_API_KEY}
    base_url: https://ai-gateway.happycapy.ai/api/v1/openai/v1

image_generator:
  class_path: tools.ImageGeneratorHappyCapyAPI
  init_args:
    api_key: ${AI_GATEWAY_API_KEY}
    model: google/gemini-3.1-flash-image-preview

video_generator:
  class_path: tools.VideoGeneratorHappyCapyAPI
  init_args:
    api_key: ${AI_GATEWAY_API_KEY}
    model: google/veo-3.1-generate-preview

working_dir: .working_dir/script2video
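
The ${AI_GATEWAY_API_KEY} placeholders suggest that environment variables are substituted when the config is loaded. How load_config does this is not shown here, so the following is only a sketch of one common approach. The name expand_env is hypothetical, and raising an error on unset variables is a choice made for this example.

```python
import os
import re

# Matches ${VAR_NAME} placeholders.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Recursively replace ${VAR} in every string inside dicts and lists."""
    env = os.environ if env is None else env
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    if isinstance(value, str):
        def substitute(match):
            name = match.group(1)
            if name not in env:
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return _PLACEHOLDER.sub(substitute, value)
    # Numbers, booleans, and None pass through unchanged.
    return value


config = {
    "chat_model": {"init_args": {"model": "gpt-4.1", "api_key": "${AI_GATEWAY_API_KEY}"}},
    "working_dir": ".working_dir/script2video",
}
expanded = expand_env(config, env={"AI_GATEWAY_API_KEY": "example-key"})
print(expanded["chat_model"]["init_args"]["api_key"])  # example-key
```

Failing loudly on a missing variable surfaces a misconfigured key at load time instead of as an authentication error in the middle of a run.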

Key Components

Agents (AI Processing)

| Agent | File | Purpose |
| --- | --- | --- |
| CharacterExtractor | `agents/character_extractor.py` | Extract characters with static/dynamic features from script |
| CharacterPortraitsGenerator | `agents/character_portraits_generator.py` | Generate front/side/back portraits for each character |
| StoryboardArtist | `agents/storyboard_artist.py` | Design shot-by-shot storyboard with first/last frames and motion |
| ReferenceImageSelector | `agents/reference_image_selector.py` | Select best reference images for each frame (face identity #1 priority) |
| CameraImageGenerator | `agents/camera_image_generator.py` | Build camera trees and generate transition videos |
| BestImageSelector | `agents/best_image_selector.py` | Select best generated image from candidates |
| Screenwriter | `agents/screenwriter.py` | Generate scripts from ideas |

Tools (Generation Backends)

| Tool | File | Purpose |
| --- | --- | --- |
| ImageGeneratorHappyCapyAPI | `tools/image_generator_happycapy_api.py` | Image generation via HappyCapy Gateway (Gemini) |
| VideoGeneratorHappyCapyAPI | `tools/video_generator_happycapy_api.py` | Video generation via HappyCapy Gateway (Veo) |
| RenderBackend | `tools/render_backend.py` | Factory for instantiating generators from config |

Interfaces (Data Models)

CharacterInScene - Character with identifier, static_features, dynamic_features
ShotDescription - Shot with ff_desc, lf_desc, motion_desc, variation_type
Camera - Camera with parent-child relationships
Frame - Frame with shot_idx, frame_type, visible characters
ImageOutput / VideoOutput - Generation outputs with save methods
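
To make these shapes concrete, here is a sketch of two of the models as plain dataclasses. Only the field names come from the list above. The field types, the meanings given in the comments, and the use of dataclasses are assumptions, and the real classes in ViMax may be defined differently.

```python
from dataclasses import dataclass, asdict


@dataclass
class CharacterInScene:
    identifier: str        # name used to refer to the character
    static_features: str   # assumed: traits that stay fixed, such as face and build
    dynamic_features: str  # assumed: traits that can change, such as clothing


@dataclass
class ShotDescription:
    ff_desc: str           # first-frame description
    lf_desc: str           # last-frame description
    motion_desc: str       # motion between the two frames
    variation_type: str    # e.g. "medium" or "large" (values assumed)


shot = ShotDescription(
    ff_desc="Alice stands at the window.",
    lf_desc="Alice turns toward the door.",
    motion_desc="Slow turn, camera static.",
    variation_type="medium",
)
print(asdict(shot)["variation_type"])  # medium
```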

Face Identity Consistency (CRITICAL)

This pipeline includes face identity improvements validated through 257 experiments. They reduced face distance from 0.74 to 0.22, a drop of about 70% ((0.74 - 0.22) / 0.74 ≈ 0.70).

Built-In Protections

Reference Image Selector: Face identity is the #1 priority when selecting reference images. The front-view portrait is always included when a character's face is visible.
Character Portraits: Enhanced prompts generate identity-critical details (exact nose shape, eye spacing, jawline, distinguishing marks) for cross-scene recognition.
Video Prompt Face Lock: Every video generation prompt is prepended with a face identity instruction requiring the character's face to remain identical to the starting frame throughout the clip.
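
The face lock in the last item amounts to prefixing every video prompt with a fixed instruction. A minimal sketch of that idea follows. The instruction wording and the names FACE_LOCK and with_face_lock are illustrative assumptions, not the text or code ViMax actually uses.

```python
# Illustrative wording only; the real instruction text is not shown in this document.
FACE_LOCK = (
    "The character's face must remain identical to the starting frame "
    "throughout the clip. "
)


def with_face_lock(prompt):
    """Prepend the face identity instruction unless it is already present."""
    if prompt.startswith(FACE_LOCK):
        return prompt
    return FACE_LOCK + prompt


print(with_face_lock("Alice walks to the door, slow dolly-in."))
```

Putting the instruction first keeps it from being buried behind long scene descriptions, and the startswith check makes the helper safe to apply twice.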

Best Practices When Using ViMax

Hyper-detailed character descriptions: Include ethnicity, age, hair texture/style/color, eye shape, facial hair, glasses, skin tone, build, and distinguishing marks in your script's character introductions
Extreme close-up shots: Include at least one extreme close-up per character to anchor identity
Consistent lighting: Specify similar lighting across scenes to prevent face drift
User-provided reference photos: Place photos in the working directory and pass them as character_portraits_registry to skip AI portrait generation

What Does NOT Work

Complex prompt engineering (viseme morphing, phoneme anchoring) does not improve face identity
Simple, direct prompts with detailed physical descriptions outperform clever prompts
Lip-sync to external audio is NOT possible (Veo generates its own internal audio)

See FACE_IDENTITY_GUIDE.md in the ViMax directory for full details.

Output Structure

After a run, the working directory contains:

.working_dir/script2video/
  characters.json                      # Extracted characters
  character_portraits_registry.json    # Portrait paths registry
  character_portraits/                 # Generated portraits
    0_CharacterName/
      front.png
      side.png
      back.png
  storyboard.json                     # Shot descriptions
  camera_tree.json                    # Camera relationships
  shots/
    0/
      shot_description.json
      first_frame.png
      last_frame.png                   # Only for medium/large variation
      video.mp4
    1/
      ...
  final_video.mp4                     # Final concatenated output
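
After a long run it can help to check which shots actually produced a clip. The sketch below relies only on the layout shown above; the function name summarize_run and the shape of its return value are hypothetical.

```python
from pathlib import Path


def summarize_run(working_dir):
    """Report which shots have a rendered clip and whether the final video exists."""
    root = Path(working_dir)
    shots_dir = root / "shots"
    shot_dirs = []
    if shots_dir.is_dir():
        # Shot folders are named by index: 0, 1, 2, ...
        shot_dirs = sorted(
            (p for p in shots_dir.iterdir() if p.is_dir() and p.name.isdecimal()),
            key=lambda p: int(p.name),
        )
    missing = [int(p.name) for p in shot_dirs if not (p / "video.mp4").is_file()]
    return {
        "shots": len(shot_dirs),
        "shots_missing_video": missing,
        "final_video": (root / "final_video.mp4").is_file(),
    }


if __name__ == "__main__":
    print(summarize_run(".working_dir/script2video"))
```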

Customization

Using Your Own Reference Photos

To use real photos instead of AI-generated portraits:

import asyncio

# Build a portrait registry pointing to your photos
character_portraits_registry = {
    "Alice": {
        "front": {"path": "/path/to/alice_front.png", "description": "Front view of Alice"},
        "side": {"path": "/path/to/alice_side.png", "description": "Side view of Alice"},
        "back": {"path": "/path/to/alice_back.png", "description": "Back view of Alice"},
    }
}

# Pass the registry to the pipeline (skips portrait generation).
# pipeline, script, user_requirement, and style are set up as in Programmatic Usage.
asyncio.run(pipeline(
    script=script,
    user_requirement=user_requirement,
    style=style,
    character_portraits_registry=character_portraits_registry,
))
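
With several characters, writing the registry by hand gets tedious. The helper below is a sketch that builds the same structure from a folder of photos. The name build_portrait_registry, the folder layout (one subfolder per character holding front.png, side.png, and back.png), and the requirement that all three views exist are assumptions made for this example.

```python
from pathlib import Path

VIEWS = ("front", "side", "back")


def build_portrait_registry(photo_root):
    """Build a registry from folders laid out as <photo_root>/<Name>/<view>.png.

    Hypothetical helper. Raises FileNotFoundError if any view is missing.
    """
    registry = {}
    for character_dir in sorted(Path(photo_root).iterdir()):
        if not character_dir.is_dir():
            continue
        name = character_dir.name
        views = {}
        for view in VIEWS:
            photo = character_dir / f"{view}.png"
            if not photo.is_file():
                raise FileNotFoundError(f"{name}: missing {photo}")
            views[view] = {
                "path": str(photo),
                "description": f"{view.capitalize()} view of {name}",
            }
        registry[name] = views
    return registry
```

The folder name becomes the registry key, so it should match the character's name as it appears in the script.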

Changing Models

Edit the YAML config to use different models:

Image: google/gemini-3.1-flash-image-preview (recommended for face identity)
Video: google/veo-3.1-generate-preview (recommended) or openai/sora-2
Chat: gpt-4.1 (recommended) or any OpenAI-compatible model
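
The same change can be made in code after the config is loaded, which is handy for comparing models without editing the file. This sketch assumes load_config returns a nested dict mirroring the YAML, as the indexing in Programmatic Usage suggests; with_models is a hypothetical helper name.

```python
import copy


def with_models(config, image=None, video=None, chat=None):
    """Return a copy of the config with the given model names replaced."""
    updated = copy.deepcopy(config)
    overrides = {"image_generator": image, "video_generator": video, "chat_model": chat}
    for section, model in overrides.items():
        if model is not None:
            updated[section]["init_args"]["model"] = model
    return updated


# Stand-in for the dict that load_config would return.
config = {
    "chat_model": {"init_args": {"model": "gpt-4.1"}},
    "image_generator": {"init_args": {"model": "google/gemini-3.1-flash-image-preview"}},
    "video_generator": {"init_args": {"model": "google/veo-3.1-generate-preview"}},
}
sora = with_models(config, video="openai/sora-2")
print(sora["video_generator"]["init_args"]["model"])  # openai/sora-2
```

The deep copy leaves the original config untouched, so two variants can be built from one loaded file.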

Troubleshooting

"No module named 'tools'" or similar import errors

Run from the ViMax root directory:

cd /home/node/a0/workspace/527fb591-1439-4b5b-ad5d-90f972773f95/workspace/tmp/ViMax
.venv/bin/python main_happycapy_script2video.py

API rate limit errors

Reduce max_requests_per_minute in the YAML config.

Face identity drift in generated videos

Add more physical detail to character descriptions in your script
Use user-provided reference photos instead of AI-generated portraits
Include extreme close-up shots for important characters
Keep lighting consistent across scenes