robolab-scenegen
// Generate USD scene files for RoboLab from natural language descriptions. Use this skill when a user wants to create a new scene with objects on a table, or asks to arrange objects for a robot manipulation task.
| name | robolab-scenegen |
| description | Generate USD scene files for RoboLab from natural language descriptions. Use this skill when a user wants to create a new scene with objects on a table, or asks to arrange objects for a robot manipulation task. |
| license | CC-BY-NC-4.0 |
| compatibility | Requires a RoboLab project with assets/scenes/base_empty.usda and assets/objects/object_catalog.json. |
| metadata | {"author":"nvidia","version":"1.0.0"} |
Generate USD scene files (.usda) from natural language descriptions of tabletop arrangements.
A scene is a USDA file that places objects on the table in base_empty.usda. Scenes are robot-agnostic — they define only what objects exist and where they are positioned.
The references/ directory contains detailed documentation loaded on-demand:
- references/scene_format.md: USD scene format, coordinate system, placement rules, and examples
- references/object_catalog_guide.md: Object catalog structure, classes, and naming conventions
- references/predicates.md: Predicate types, JSON format, and common patterns for the solver pipeline

Before starting, verify that:
- assets/scenes/base_empty.usda exists (the base scene template)
- assets/objects/object_catalog.json exists (the object catalog)

When the user invokes this skill, display the following message verbatim:
I'll help you generate a USD scene file. I need a few things:
- A scene filename ending in .usda (e.g., fruits_bowl.usda, kitchen_sorting.usda). Use snake_case.
- An output directory: assets/scenes/ (standard location) or assets/scenes/generated/ (for generated scenes)

Objects I'll pick from:
I use the object catalog at assets/objects/object_catalog.json (312 objects) built from these directories:
| Directory | Count | Examples |
|---|---|---|
| assets/objects/vomp/ | 196 | containers, bins, totes, crates, kitchenware |
| assets/objects/hope/ | 27 | condiments, canned goods, dairy, pantry items |
| assets/objects/hot3d/ | 25 | mugs, bowls, electronics, toys |
| assets/objects/ycb/ | 22 | fruits, bowls, cans, tools |
| assets/objects/handal/ | 19 | hammers, spoons, utensils, ladles |
| assets/objects/objaverse/ | 10 | bagels, bread, dumplings |
| assets/objects/fruits_veggies/ | 9 | avocado, lemon, lime, orange, onion |
| assets/objects/basic/ | 4 | colored blocks (red, blue, green, yellow) |
If the objects you need aren't in the catalog, you can:
- [...] pxr/IsaacSim installed)

After receiving the user's input:
- Check that the filename ends in .usda. If not, append .usda and confirm with the user.
- Read assets/objects/object_catalog.json.
- If the user supplies a custom object directory, scan it for .usd/.usda files and include them as additional objects. For each custom object, extract the prim name from the filename and compute dimensions if possible (or ask the user).
- If the catalog needs to be regenerated, run <ISAACSIM_PYTHON> assets/objects/_utils/generate_catalog.py, then re-read the catalog.
- Tell the user which object set is in use (e.g., "[...] assets/objects/object_catalog.json" or "Using 312 default objects + 5 custom objects from /path/to/custom/").
- Read references/scene_format.md for the exact USDA format, coordinate system, and placement rules.
- Refer to objects by their name values from the catalog.
- Convert usd_path from the catalog (e.g., assets/objects/ycb/banana.usd) to a path relative to the scene's output directory (e.g., ../objects/ycb/banana.usd for assets/scenes/, or ../../objects/ycb/banana.usd for assets/scenes/generated/).

The skill uses the same predicate-based solver from robolab/scene_gen/llm_scene_gen/ to compute collision-free placements. This is a 4-step pipeline that does not require IsaacSim; it runs with pure Python + numpy + scipy.
Based on the user's description, generate a JSON structure with object selections and predicates. This is the same format the LLM agent on xuning/scene_task_gen uses:
```json
{
  "objects": [
    {"name": "bowl"},
    {"name": "banana"},
    {"name": "orange_01"}
  ],
  "predicates": [
    {"type": "place-on-base", "object": "bowl", "x": 0.55, "y": 0.0},
    {"type": "random-rot", "object": "bowl"},
    {"type": "place-on-base", "object": "banana", "x": 0.40, "y": -0.20},
    {"type": "random-rot", "object": "banana"},
    {"type": "place-on-base", "object": "orange_01", "x": 0.70, "y": 0.15},
    {"type": "random-rot", "object": "orange_01"}
  ]
}
```
Spatial (2D placement on table):
- place-on-base: Place on table at explicit (x, y). Params: object, x, y
- left-of / right-of / front-of / back-of: Relative to another object. Params: object, reference, distance
- random-rot: Random yaw rotation. Params: object
- facing-left / facing-right / facing-front / facing-back: Oriented direction. Params: object
- align-left / align-right / align-center-lr / align-center-fb: Alignment. Params: object, reference

Physical (3D placement):
- place-on: Stack on top of another object. Params: object, support
- place-in: Place inside a container. Params: objects (list), container

Every object needs at least place-on-base + random-rot (2 predicates).
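The minimum-predicate rule above can be checked before the JSON is handed to the solver. The sketch below is not part of RoboLab: `missing_required_predicates` is a hypothetical helper, it takes the rule literally, and it only inspects predicates that carry their target under the `object` key (so `place-in`, which uses an `objects` list, is ignored).

```python
def missing_required_predicates(scene: dict) -> dict:
    """Map each object name to the required predicate types it lacks."""
    required = {"place-on-base", "random-rot"}
    seen = {obj["name"]: set() for obj in scene["objects"]}
    for pred in scene["predicates"]:
        target = pred.get("object")  # single-object predicates only
        if target in seen:
            seen[target].add(pred["type"])
    return {
        name: sorted(required - types)
        for name, types in seen.items()
        if required - types
    }


example = {
    "objects": [{"name": "bowl"}, {"name": "banana"}],
    "predicates": [
        {"type": "place-on-base", "object": "bowl", "x": 0.55, "y": 0.0},
        {"type": "random-rot", "object": "bowl"},
        {"type": "place-on-base", "object": "banana", "x": 0.40, "y": -0.20},
    ],
}
print(missing_required_predicates(example))  # {'banana': ['random-rot']}
```

An empty result means every listed object has both predicates.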
Run a Python script via Bash that:
- Parses the predicates into ObjectState objects using parse_predicates_from_dict

Use this exact script pattern:
```python
import json, sys
from robolab.scene_gen.llm_scene_gen import (
    ObjectState, PlaceOnBasePredicate, parse_predicates_from_dict,
    SpatialSolver, PhysicalSolver, FeedbackSystem
)

catalog = json.load(open('assets/objects/object_catalog.json'))
llm_result = json.loads(sys.argv[1])  # Pass predicates JSON as argument

# Parse objects
object_states = {}
object_info = {}
for obj_data in llm_result['objects']:
    name = obj_data['name']
    entry = next((o for o in catalog if o['name'] == name), None)
    if entry:
        object_info[name] = entry
        object_states[name] = ObjectState(name=name)

# Parse predicates
for pred_data in llm_result['predicates']:
    pred = parse_predicates_from_dict(pred_data)
    if pred.target_object in object_states:
        object_states[pred.target_object].predicates.append(pred)

# Add default placement for objects without predicates
for name, state in object_states.items():
    if not state.predicates:
        state.predicates.append(PlaceOnBasePredicate(name))

# Solve spatial constraints
solver = SpatialSolver(table_bounds=(0.25, 0.85, -0.45, 0.45))
dims = {n: tuple(info['dims']) for n, info in object_info.items()}
success, msg = solver.solve(object_states, dims)
if not success:
    print(json.dumps({"error": f"Spatial solver failed: {msg}"}))
    sys.exit(1)

# Solve physical constraints
phys = PhysicalSolver()
phys_success, phys_msg = phys.solve(
    object_states, dims,
    {n: info['usd_path'] for n, info in object_info.items()},
    'assets/scenes/base_empty.usda'
)

# Check grammar
feedback = FeedbackSystem.generate_grammar_feedback(object_states)
if feedback:
    print(json.dumps({"error": f"Grammar issues: {feedback}"}))
    sys.exit(1)

# Output results
results = {}
for name, state in object_states.items():
    results[name] = {
        "x": state.x, "y": state.y,
        "z": object_info[name]['dims'][2] / 2 + 0.002,
        "yaw": state.yaw or 0.0,
        "usd_path": object_info[name]['usd_path']
    }
print(json.dumps(results, indent=2))
```
If the spatial solver fails (returns an error), this means objects can't fit without colliding. Options:
If the solver succeeds, proceed to generate the USDA file using the solved positions.
Read the entire assets/scenes/base_empty.usda file. This is your template.
For each object with solved positions from the predicate solver, create a prim block:
```usda
def "<object_name>" (
    prepend payload = @<relative_path_to_usd>@
)
{
    quatf xformOp:orient = (1, 0, 0, 0)
    float3 xformOp:scale = (1, 1, 1)
    double3 xformOp:translate = (<x>, <y>, <z>)
    uniform token[] xformOpOrder = ["xformOp:translate", "xformOp:orient", "xformOp:scale"]
}
```
Where:
- <object_name> = the name field from the catalog (e.g., banana, bowl)
- <relative_path_to_usd> = the usd_path converted to be relative from the scene's directory
- <x>, <y> = solved positions from the spatial solver
- <z> = dims[2] / 2 + 0.002 (from the solver output)

base_empty.usda lives at assets/scenes/base_empty.usda and uses relative payloads like @../fixtures/table_oak.usd@ and @../fixtures/franka_table.usd@. Those paths resolve correctly only when the target scene also sits at assets/scenes/ (depth 1 from assets/).
If the target scene is one directory deeper (e.g. assets/scenes/generated/foo.usda or assets/scenes/wip480/foo.usda, depth 2), you must prepend one extra ../ to every inherited payload before inserting objects — otherwise the table and franka_table won't load, and physics will let objects fall through to the ground plane during settle. This is silent: the scene still opens, but the table is invisible and objects end up on the floor at z ≈ -0.67.
General rule: for a scene at depth N from assets/, prepend (N − 1) extra ../ segments to each @../...@ payload inherited from base_empty.
Example — for a scene in assets/scenes/wip480/:
- @../fixtures/table_oak.usd@ → @../../fixtures/table_oak.usd@
- @../fixtures/franka_table.usd@ → @../../fixtures/franka_table.usd@

A simple regex substitution over the base content before insertion is sufficient:
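```python
import re

# base: text of base_empty.usda; scene_depth: depth of the scene directory from assets/
if scene_depth > 1:
    prefix = "../" * (scene_depth - 1)
    # Prepend the extra ../ segments to every payload that starts with @../
    base = re.sub(r"@(\.\./)", lambda m: "@" + prefix + m.group(1), base)
```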
Take the (possibly path-rewritten) base_empty.usda content and insert the object prim blocks just before the final } that closes the def Xform "world" block.
Write the complete USDA content to the output path using the Write tool.
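The prim template and the insertion step can be sketched in a few lines of Python. This is an illustration, not the skill's own implementation: `render_prim` and `insert_prims` are hypothetical helpers, the sketch assumes the last `}` in the file is the one that closes the "world" block, and the optional `yaw_rad` argument assumes the solver's yaw is a rotation about Z in radians (the template above always writes the identity orientation, which is what `yaw_rad=0.0` reproduces).

```python
import math

# Mirrors the prim template; doubled braces are literal braces for str.format.
PRIM_TEMPLATE = '''    def "{name}" (
        prepend payload = @{payload}@
    )
    {{
        quatf xformOp:orient = ({w:.6f}, 0, 0, {qz:.6f})
        float3 xformOp:scale = (1, 1, 1)
        double3 xformOp:translate = ({x}, {y}, {z})
        uniform token[] xformOpOrder = ["xformOp:translate", "xformOp:orient", "xformOp:scale"]
    }}
'''


def render_prim(name, payload, x, y, z, yaw_rad=0.0):
    """Fill the template; quaternion is (w, x, y, z) for a rotation about Z."""
    half = yaw_rad / 2.0
    return PRIM_TEMPLATE.format(
        name=name, payload=payload, x=x, y=y, z=z,
        w=math.cos(half), qz=math.sin(half),
    )


def insert_prims(base_text, prim_blocks):
    """Insert the blocks just before the final closing brace of the file."""
    idx = base_text.rindex("}")
    return base_text[:idx] + "".join(prim_blocks) + base_text[idx:]


demo_base = '#usda 1.0\n(\n    defaultPrim = "world"\n)\n\ndef Xform "world"\n{\n}\n'
demo_scene = insert_prims(
    demo_base,
    [render_prim("banana", "../objects/ycb/banana.usd", 0.4, -0.2, 0.02)],
)
print(demo_scene)
```

If base_empty.usda ends with anything after the "world" block, locate the closing brace explicitly instead of relying on `rindex`.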
If the scene needs multiple objects of the same type (e.g., two bananas), append a suffix:
- banana
- banana_1
- banana_2

Use the same payload path for all instances.
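The suffix scheme is easy to generate mechanically. A minimal sketch, assuming the first instance keeps the bare catalog name and later instances get `_1`, `_2`, and so on; `unique_prim_names` is a hypothetical helper name.

```python
from collections import Counter


def unique_prim_names(names):
    """Return prim names with _1, _2, ... appended to repeated entries."""
    counts = Counter()
    result = []
    for name in names:
        index = counts[name]
        result.append(name if index == 0 else f"{name}_{index}")
        counts[name] += 1
    return result


print(unique_prim_names(["banana", "bowl", "banana", "banana"]))
# ['banana', 'bowl', 'banana_1', 'banana_2']
```

The sketch does not guard against a catalog entry that is already named like a suffixed instance (for example a real object called banana_1), so check for such clashes if the catalog contains them.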
After writing the USDA file, automatically run settle + screenshot:
The settle and screenshot scripts require IsaacSim (isaacsim package) and pxr (Pixar USD bindings), which are typically installed in a specific conda environment — not the system Python or the project .venv.
To find the right interpreter, search for a conda env that has isaacsim:
```bash
for env_dir in ~/miniforge3/envs/*/bin ~/miniconda3/envs/*/bin ~/anaconda3/envs/*/bin; do
    if [ -f "$env_dir/python" ]; then
        "$env_dir/python" -c "import isaacsim" 2>/dev/null && echo "Found: $env_dir/python" && break
    fi
done
```
If no conda env is found, fall back to checking .venv/bin/python or ask the user which Python has IsaacSim installed.
Cache the discovered interpreter path for the rest of the session.
Run the settle script to let objects fall into stable resting positions. Use --replace so the settled positions are written back into the same file, and --screenshot to capture a rendered image:
```bash
<ISAACSIM_PYTHON> assets/scenes/_utils/settle_scenes.py \
    --scene <scene_path> \
    --replace \
    --screenshot \
    --screenshot-dir <output_dir>/_images
```
Where <ISAACSIM_PYTHON> is the interpreter found in Step 0.
- The --replace flag overwrites the original USDA with settled positions.
- The --screenshot flag renders a 640x480 image and saves it as <scene_name>.png in the screenshot directory.
- Always pass --replace when settling scenes that live in a subdirectory. Without it, settle_scenes.py writes the settled file to SCENE_DIR/<name>.usda (i.e. assets/scenes/<name>.usda), not back into the subdirectory, which loses the depth-corrected payload paths and makes screenshots render empty.

Sanity-check after settle: open one settled .usda and verify at least one object's xformOp:translate has z close to the expected dims[2] / 2 (e.g. 0.02–0.10). If objects show z ≈ -0.67, they fell through to the ground plane, almost always because the table payload path was wrong during settle (see Step 3 above). Fix the payload paths in the source USDAs, then re-settle; re-running settle on already-fallen objects will not move them back onto the table.
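The sanity check can be scripted without IsaacSim by scanning the settled file as text. The sketch below is an assumption-laden illustration: `fallen_objects` is a hypothetical helper, it expects each `xformOp:translate` to sit on one line in the form used by the prim template, and the `floor_z` threshold of -0.5 is an arbitrary cutoff chosen to sit between the expected resting heights and the z ≈ -0.67 failure case. Fixtures inherited from base_empty.usda may also be reported if they sit below the threshold.

```python
import re

TRANSLATE_RE = re.compile(
    r"(?:double3|float3)\s+xformOp:translate\s*=\s*"
    r"\(\s*([-+\d.eE]+)\s*,\s*([-+\d.eE]+)\s*,\s*([-+\d.eE]+)\s*\)"
)
PRIM_RE = re.compile(r'def\s+(?:\w+\s+)?"([^"]+)"')


def fallen_objects(usda_text, floor_z=-0.5):
    """Return names of prims whose translate z lies below floor_z."""
    fallen = []
    current = None
    for line in usda_text.splitlines():
        prim = PRIM_RE.search(line)
        if prim:
            current = prim.group(1)  # most recently opened prim
        translate = TRANSLATE_RE.search(line)
        if translate and float(translate.group(3)) < floor_z:
            fallen.append(current)
    return fallen


sample = '''
def Xform "world"
{
    def "banana" (
        prepend payload = @../objects/ycb/banana.usd@
    )
    {
        double3 xformOp:translate = (0.4, -0.2, 0.021)
    }
    def "bowl" (
        prepend payload = @../objects/ycb/bowl.usd@
    )
    {
        double3 xformOp:translate = (0.55, 0.0, -0.67)
    }
}
'''
print(fallen_objects(sample))  # ['bowl']
```

A non-empty result is a signal to fix the payload paths in the source USDA and re-settle from there.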
After the settle script completes, read the generated screenshot image using the Read tool and display it to the user:
Read: <output_dir>/_images/<scene_name>.png
After showing the screenshot, display the following message to the user:
Scene settled and screenshot generated!
- Scene: <scene_path> (objects now in stable resting positions)
- Screenshot: <output_dir>/_images/<scene_name>.png

Next steps:
1. Create a task — use /robolab-taskgen to create a task file that references this scene.
2. Verify in simulation — run the empty demo (no policy) once you have a task:
python examples/run_empty.py --task <TaskClassName>
Scene filenames should be descriptive and use snake_case:
- banana_bowl.usda: banana and bowl
- fruits_plate_sorting.usda: fruits on a plate for sorting
- blocks_3_stacking.usda: 3 blocks for stacking

Before writing the file, verify:
- Payload paths are relative to the scene's directory (../objects/ for assets/scenes/, or ../../objects/ for assets/scenes/generated/)
- The file starts with #usda 1.0 and has defaultPrim = "world"

The payload path in the USDA must be relative from the scene file's directory to the object USD. The usd_path in the catalog is relative to the repo root.
- Scene in assets/scenes/: payload = @../<usd_path minus "assets/">@ (e.g., @../objects/ycb/bowl.usd@)
- Scene in assets/scenes/generated/: payload = @../../<usd_path minus "assets/">@ (e.g., @../../objects/ycb/bowl.usd@)
- Scene in assets/scenes/foo/bar/: count directory depth from assets/ and prepend the right number of ../

General formula: strip assets/ prefix from usd_path, then prepend ../ repeated N times where N = depth of scene directory relative to assets/.
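The general formula translates directly into code. A minimal sketch, assuming both the catalog's `usd_path` and the scene directory are given relative to the repo root and live under assets/; `payload_path` is a hypothetical helper name.

```python
from pathlib import PurePosixPath


def payload_path(usd_path: str, scene_dir: str) -> str:
    """Strip the assets/ prefix and prepend one ../ per level of scene depth."""
    relative_object = PurePosixPath(usd_path).relative_to("assets")
    depth = len(PurePosixPath(scene_dir).relative_to("assets").parts)
    return "../" * depth + relative_object.as_posix()


print(payload_path("assets/objects/ycb/bowl.usd", "assets/scenes/"))
# ../objects/ycb/bowl.usd
print(payload_path("assets/objects/ycb/bowl.usd", "assets/scenes/generated/"))
# ../../objects/ycb/bowl.usd
```

`relative_to` raises `ValueError` when either path is not under assets/, which surfaces a misplaced custom object or output directory early rather than producing a payload that silently fails to load.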