| name | rodin3d-bang-skill |
| description | Generate segmented 3D part files from a user-provided image or text prompt using Hyper3D Rodin Gen-2 plus the BANG API. Use when Codex needs to turn an image or prompt into downloaded BANG parts, inspect BANG task status, integrate Hyper3D image-to-parts automation, or create a controllable local web visualization from generated parts. Text prompts are supported only as source input for BANG parts, not as ordinary unsplit 3D model generation. |
Hyper3D BANG Parts
Use this skill to produce split 3D part files from one input image or, when needed, a text prompt.
Workflow
Run the bundled script and always provide an output directory:
python <skill_dir>/scripts/generate_banged_parts.py \
--image path/to/input.png \
--output ./output \
--strength 5 \
--geometry-file-format glb \
--material Shaded \
--resolution Basic \
--api-key "$HYPER3D_API_KEY"
Before starting, tell the user that Rodin Gen-2 generation usually takes about 5 minutes and the BANG split usually takes roughly another 5 minutes. The whole request may take around 10 minutes before files are available.
The script performs the required asynchronous sequence:
- Submit a Rodin Gen-2 image or prompt task to create the source asset.
- Poll https://api.hyper3d.com/api/v2/status until the source asset is Done.
- Submit https://api.hyper3d.com/api/v2/bang with the Rodin Gen-2 task UUID as asset_id.
- Poll the BANG subscription key until all BANG jobs are Done.
- Call https://api.hyper3d.com/api/v2/download with the BANG task UUID and immediately download the returned files.
BANG download URLs expire quickly, so do not return only URLs to the user. Download the files and report the local output directory.
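The two polling steps in the sequence share the same shape, so they can be sketched as one helper. This is a minimal illustration, not the bundled script's code: `wait_until_done`, `fetch_statuses`, the interval and timeout defaults, and the `"Failed"` status name are all assumptions. Only the `Done` status comes from this document.

```python
import time
from typing import Callable, Iterable, Tuple


def wait_until_done(
    fetch_statuses: Callable[[], Iterable[str]],
    interval_s: float = 5.0,
    timeout_s: float = 900.0,
    failed_states: Tuple[str, ...] = ("Failed",),  # assumed failure status name
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll until every job reports "Done"; raise on failure or timeout.

    fetch_statuses is a hypothetical callable that performs one status
    request and returns the status string of each job.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        statuses = list(fetch_statuses())
        # An empty list counts as "not ready yet", not as success.
        if statuses and all(status == "Done" for status in statuses):
            return
        if any(status in failed_states for status in statuses):
            raise RuntimeError(f"job failed: {statuses}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"jobs not Done after {timeout_s} s: {statuses}")
        sleep(interval_s)
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays. The 900-second default is only a guess that leaves headroom over the roughly 5 minutes each stage usually takes.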
Inputs
--image: Preferred. One image file (.jpg, .jpeg, .png, or .webp).
--prompt: Fallback source input when no image is available.
--output: Required in normal use. The script writes files to <output>/<bang_task_uuid>/.
--api-key: Optional. Defaults to HYPER3D_API_KEY; if absent, the script uses the public demo key vibecoding.
Use exactly one of --image or --prompt.
If neither --api-key nor HYPER3D_API_KEY is provided, proceed with vibecoding as the default key. If vibecoding does not work because of quota, authorization, or API errors, ask the user to provide their own Hyper3D API key or set it in their environment. Guide them to visit https://hyper3d.ai/api-dashboard, copy their API key, and either pass it with --api-key <key> or set it before running:
export HYPER3D_API_KEY="<key>"
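The key precedence described above (explicit flag, then environment variable, then the demo key) could be expressed roughly as follows. The function name `resolve_api_key` and its signature are hypothetical; whether the bundled script treats an empty string as "absent" is an assumption made here.

```python
import os
from typing import Mapping, Optional

DEMO_KEY = "vibecoding"  # public demo key named in this document


def resolve_api_key(
    cli_value: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return --api-key if given, else HYPER3D_API_KEY, else the demo key.

    Empty strings are treated as absent (an assumption), so passing
    --api-key "$HYPER3D_API_KEY" with the variable unset still falls back.
    """
    return cli_value or env.get("HYPER3D_API_KEY") or DEMO_KEY
```

For example, `resolve_api_key(None, {})` falls through to the demo key, while `resolve_api_key("my-key", {})` returns the explicit value unchanged.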
BANG Options
--strength: Split strength from 2 to 12; higher values tend to produce more parts. Default: 5.
--geometry-file-format: glb, usdz, fbx, obj, or stl. Default: glb.
--material: PBR, Shaded, All, or None. Default: Shaded.
--resolution: Basic or High. Default: Basic.
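As a rough illustration of how the input and BANG options listed above constrain each other, here is an `argparse` sketch. It is not the bundled script's actual parser: `build_parser` and `strength` are hypothetical names, and making `--output` strictly required is an assumption based on "required in normal use".

```python
import argparse


def strength(value: str) -> int:
    """Parse --strength and enforce the documented 2 to 12 range."""
    number = int(value)
    if not 2 <= number <= 12:
        raise argparse.ArgumentTypeError("strength must be between 2 and 12")
    return number


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sketch of the BANG parts CLI")
    # Exactly one of --image or --prompt must be supplied.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--image")
    source.add_argument("--prompt")
    parser.add_argument("--output", required=True)  # assumed strictly required
    parser.add_argument("--strength", type=strength, default=5)
    parser.add_argument(
        "--geometry-file-format",
        choices=["glb", "usdz", "fbx", "obj", "stl"],
        default="glb",
    )
    parser.add_argument(
        "--material", choices=["PBR", "Shaded", "All", "None"], default="Shaded"
    )
    parser.add_argument("--resolution", choices=["Basic", "High"], default="Basic")
    return parser
```

With this sketch, passing both `--image` and `--prompt`, or a strength outside the range, makes `argparse` exit with a usage error before any API call is made.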
Internal Source Asset Options
These only affect the temporary Rodin Gen-2 asset used by BANG:
--source-quality: high, medium, low, or extra-low. Default: medium.
--source-mesh-mode: Quad or Raw. Default: Raw.
--use-original-alpha: Preserve input alpha if needed.
--seed: Optional integer seed.
Text Prompt Handling
When the user asks for text-to-3D, keep the final deliverable as BANG parts. Do not stop at ordinary unsplit Rodin output.
If the skill consumer is Codex, first try to use an available image generation capability to create a clean single-object reference image from the user's prompt. Prefer a neutral background, full object visibility, and a viewpoint that reveals the parts likely to matter. Use that generated image as --image for this script.
Only use this image-generation-first path when the caller is Codex and image generation exists in the current environment. If image generation is unavailable, fails, or does not provide a usable local image file, fall back to the original text-source approach:
python <skill_dir>/scripts/generate_banged_parts.py \
--prompt "a concise description of the object" \
--output ./output \
--strength 5 \
--geometry-file-format glb \
--material Shaded \
--resolution Basic \
--api-key "$HYPER3D_API_KEY"
For non-Codex consumers, use --prompt directly when the user provides text instead of an image.
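The choice between the two source inputs described in this section can be summarized in a small helper. This is only a sketch: `source_args` and its parameters are hypothetical, and "usable local image file" is approximated here as "the path exists and is a regular file".

```python
import os
from typing import List, Optional


def source_args(
    is_codex: bool,
    generated_image: Optional[str],
    prompt: str,
) -> List[str]:
    """Pick the source flag for generate_banged_parts.py.

    generated_image is the path produced by an image-generation step,
    or None if that step was unavailable or failed.
    """
    if is_codex and generated_image and os.path.isfile(generated_image):
        return ["--image", generated_image]
    # Non-Codex callers, or any failure of the image-first path.
    return ["--prompt", prompt]
```

The returned list can be spliced into the command line ahead of `--output` and the BANG options.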
Expected Output
The script prints the Rodin source UUID, the BANG task UUID, and the output directory. It also writes manifest.json beside the downloaded BANG files with:
- source_task_uuid
- bang_task_uuid
- downloaded_files
- the original API download_list
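A later step, such as the visualization, might read the manifest back like this. The helper `load_manifest` is hypothetical, and the sketch only checks that the four documented keys are present; it assumes `download_list` is the key name used for the original API list and makes no assumption about the shape of the values.

```python
import json
from pathlib import Path

REQUIRED_KEYS = {
    "source_task_uuid",
    "bang_task_uuid",
    "downloaded_files",
    "download_list",
}


def load_manifest(bang_dir: Path) -> dict:
    """Load <output>/<bang_task_uuid>/manifest.json and check its keys."""
    text = (bang_dir / "manifest.json").read_text(encoding="utf-8")
    manifest = json.loads(text)
    missing = REQUIRED_KEYS - set(manifest)
    if missing:
        raise KeyError(f"manifest.json is missing keys: {sorted(missing)}")
    return manifest
```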
Visualize And Animate
After BANG parts are downloaded, create a small local web visualization for the user unless they explicitly ask for files only. Use the downloaded model parts as the source assets and choose an interaction model that fits the subject in the input image and generated parts.
Build the visualization as an actual usable interface, not a static preview. Prefer Three.js for loading and rendering .glb/.gltf outputs. If the output format is not directly web-friendly, convert or adapt the available part files when practical, or create a lightweight proxy visualization that still reflects the generated parts and preserves the user's ability to control the result.
Decide the animation and controls from the object:
- For vehicles, add simple pivots/joints for wheels, steering, doors, turrets, sails, rudders, or other obvious movable parts. Let the user control movement with the keyboard, such as arrow keys or WASD.
- For humans, animals, characters, or creatures, infer a minimal rig from the separated parts and add controls for posing, walking, looking around, or triggering simple idle/action animations.
- For mechanical objects, furniture, tools, ships, buildings, or abstract forms, choose meaningful part-based motion such as opening, folding, orbiting, exploding/reassembling, rotating components, or guided inspection.
- For ambiguous objects, provide a tasteful interactive exploded view plus orbit controls and a few keyboard-controlled part motions.
When adding joints or motion, do not assume the model is already rigged. Use the generated part boundaries, bounding boxes, names, and spatial layout to infer pivot points and parent-child relationships. Keep the implementation robust if some parts are merged or unnamed. Add visible controls or concise on-screen labels only when they are needed for interaction.
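One simple way to derive motion from bounding boxes alone is the exploded view: push each part away from the assembly centroid along the line through its own center. The sketch below shows only the arithmetic, in Python for consistency with the bundled script; the same computation would be done on the part meshes in the Three.js page. The names `box_center`, `exploded_offsets`, and `factor` are hypothetical.

```python
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]
Box = Tuple[Vec3, Vec3]  # (min corner, max corner) of a part's bounding box


def box_center(box: Box) -> Vec3:
    """Midpoint of an axis-aligned bounding box."""
    lo, hi = box
    return tuple((a + b) / 2 for a, b in zip(lo, hi))


def exploded_offsets(boxes: Dict[str, Box], factor: float = 0.5) -> Dict[str, Vec3]:
    """Offset per part: factor * (part center - mean of all part centers)."""
    if not boxes:
        return {}
    centers = {name: box_center(box) for name, box in boxes.items()}
    count = len(centers)
    centroid = tuple(sum(c[i] for c in centers.values()) / count for i in range(3))
    return {
        name: tuple(factor * (c[i] - centroid[i]) for i in range(3))
        for name, c in centers.items()
    }
```

Because the offsets depend only on bounding boxes, this keeps working when parts are unnamed or merged; a part whose center coincides with the centroid simply stays in place. Animating `factor` from 0 upward and back gives the explode and reassemble motion.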
At the end of the task, report both the downloaded BANG parts directory and the local web visualization entry file or dev server URL.
API Notes
- When the user supplies only an image, BANG still requires a prior Rodin Gen-2 asset, because the BANG endpoint accepts an asset_id for Rodin-generated assets.
- Use the uuid field from the Rodin and BANG submission responses for subsequent download calls, not jobs.uuids.
- Use jobs.subscription_key from each submission response for status polling.
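To make the distinction between these fields concrete, here is a small sketch that pulls the two values out of a parsed submission response. `TaskHandle` and `task_handle` are hypothetical names, and the nesting of `subscription_key` under a `jobs` object is inferred from the dotted field names above; any other fields in the response are ignored.

```python
from typing import Any, Mapping, NamedTuple


class TaskHandle(NamedTuple):
    task_uuid: str          # top-level uuid: use for bang asset_id and download
    subscription_key: str   # jobs.subscription_key: use for status polling


def task_handle(response: Mapping[str, Any]) -> TaskHandle:
    """Extract the identifiers needed for follow-up calls.

    Deliberately ignores jobs.uuids, which is not the value to pass
    to download calls.
    """
    return TaskHandle(
        task_uuid=response["uuid"],
        subscription_key=response["jobs"]["subscription_key"],
    )
```

Keeping both values together in one small record makes it harder to pass a job UUID where the task UUID is expected.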