rerun-data-model

Updated June 17, 2026, 23:42

How raw multimodal robot data maps onto the Rerun data model. Read FIRST, before modeling or converting a dataset. Resolves the entity-vs-component, property-vs-component-vs-layer, and static-vs-temporal decisions, then points at the mechanism skills (rerun-chunk-processing and the importer skills rerun-mcap, rerun-urdf, rerun-parquet, rerun-lerobot) for the how.

Source: rerun-io/trossen-oss

SKILL.md

name	rerun-data-model
description	How raw multimodal robot data maps onto the Rerun data model. Read FIRST, before modeling or converting a dataset. Resolves the entity-vs-component, property-vs-component-vs-layer, and static-vs-temporal decisions, then points at the mechanism skills (rerun-chunk-processing and the importer skills rerun-mcap, rerun-urdf, rerun-parquet, rerun-lerobot) for the how.
user_invocable	true
allowed-tools	Read, Grep, Bash, WebFetch

Rerun Data Model

The hard part of ingesting a dataset is the modeling decision, not the API call. Get the model right and any mechanism works; get it wrong and queries, views, and training all break. This skill covers only the decisions. For mechanism details see rerun-chunk-processing (pipeline mechanics) and the importer skills it routes to: rerun-mcap, rerun-urdf, rerun-parquet, rerun-lerobot. For exact signatures, see the docs at rerun.io/docs/concepts/logging-and-ingestion.

Before writing conversion code, fill in the mapping table below. It is the design, and a human can review it in seconds.

The model

Dataset → Segment → Layer → Recording → Entity → Component → Chunk

Entity = a thing, named by a path (/robot/arm/camera). Component = one typed field on it (positions, image, a scalar). Archetype = a builder for a standard set of components the viewer knows how to render.
Every datum sits on two axes: where (entity path + component) and when (which timeline, or static = all time).
Segment = one episode. On registration recording_id becomes the segment_id. Layer = an extra .rrd on top of a segment; it attaches only by sharing the segment's recording_id. This one rule drives all layering.
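The attachment rule can be sketched without the SDK. This is a minimal illustration, not the catalog's implementation: RrdFile, its layer field, and group_into_segments are hypothetical names invented here, and the only behavior taken from the text above is that files are matched by recording_id.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RrdFile:
    """Hypothetical stand-in for one .rrd file (not a Rerun type)."""
    path: str
    recording_id: str
    layer: str  # "base" or an illustrative layer name such as "fk"


def group_into_segments(files):
    """Group files by recording_id; each group is one segment plus its layers."""
    segments = defaultdict(dict)
    for f in files:
        # The recording_id plays the role of the segment_id.
        segments[f.recording_id][f.layer] = f.path
    return dict(segments)


files = [
    RrdFile("ep0/base.rrd", "episode-0", "base"),
    RrdFile("ep0/fk.rrd", "episode-0", "fk"),
    RrdFile("ep1/base.rrd", "episode-1", "base"),
    RrdFile("ep1/fk.rrd", "episode-01", "fk"),  # mismatched id: does not attach
]
segments = group_into_segments(files)
```

In this sketch the fourth file ends up in a segment of its own ("episode-01") instead of layering onto "episode-1", which is the failure mode the rule warns about.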

The decisions

Entity vs component on an existing entity

New entity: it has its own spatial frame (Transform3D/Pinhole), its own annotation context, or should be shown/cleared/shared independently (each robot link, each camera, each sensor).
Same entity + extra component: auxiliary data at the same instances (per-point confidence).
Use AnyValues for non-standard fields.

Property vs component vs layer (the most common confusion)

Component: per-timestamp signal on the timeline (joint angle, image).
Segment property: one-per-episode metadata for catalog filter/search (operator, robot, site, task, date). Not per timestamp.
Layer: a whole derived .rrd over a base segment (FK transforms, point clouds, gripper state, labels, quality scores). Queryable as if part of base.

Static vs temporal

Static belongs to all timelines and shadows any temporal value of the same component on the same entity. Use it for invariants (calibration, coordinate frames, robot meshes, annotation context, a video asset). Never make per-frame data static.
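A minimal sketch of the shadowing rule for a single (entity, component) pair. The resolve function is hypothetical and models only the behavior stated above: if a static value exists it wins at every query time, regardless of what was logged temporally.

```python
def resolve(static_value, temporal_samples, query_time):
    """Return the value visible at query_time for one (entity, component).

    temporal_samples is a list of (time, value) pairs sorted by time.
    """
    if static_value is not None:
        # Static shadows every temporal sample, at every time.
        return static_value
    visible = None
    for t, v in temporal_samples:
        if t > query_time:
            break
        visible = v
    return visible


samples = [(0, "frame0"), (10, "frame1")]
shadowed = resolve("calibration", samples, 10)   # static value wins
unshadowed = resolve(None, samples, 5)           # last sample at or before t=5
```

This is why per-frame data logged as static appears to collapse into one value: the temporal samples are still there but are never returned.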

Which timeline

Use a timestamp timeline (ns since epoch) for cross-sensor clock alignment and a sequence timeline for frame/ordinal alignment; stamp on both when useful. Do not resample to a common rate. Latest-at reconciles multi-rate streams at query time by holding each component's last sample (no interpolation).
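The latest-at behavior can be illustrated with a sorted-list lookup. This is a sketch of the semantics only, not Rerun's query engine; latest_at and the sample streams are made up for the example.

```python
from bisect import bisect_right


def latest_at(times, values, query_time):
    """Hold the last sample at or before query_time; no interpolation."""
    i = bisect_right(times, query_time)
    return values[i - 1] if i > 0 else None


# Two streams at different rates, timestamps in nanoseconds.
cam_t = [0, 33_000_000, 66_000_000]
cam_v = ["frame0", "frame1", "frame2"]
joint_t = [0, 10_000_000, 20_000_000, 30_000_000, 40_000_000]
joint_v = [0.00, 0.01, 0.02, 0.03, 0.04]

query = 40_000_000
row = {
    "image": latest_at(cam_t, cam_v, query),
    "joint": latest_at(joint_t, joint_v, query),
}
```

At 40 ms the row pairs the camera frame from 33 ms with the joint sample from 40 ms. Neither stream was resampled, which is the point of leaving rates untouched at ingest.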

Base vs layer

Base = faithful conversion of the raw streams, nothing computed. Layer = anything derived (FK from URDF + joint states, clouds from depth + intrinsics). Keep them separate .rrds.

The mapping table (produce before coding)

Source (topic/column/key)	Entity path	Archetype	Component(s)	Timeline	Static/temporal	Base/layer	Property?
mcap `/joint_states`	`/robot/<joint>`	`Scalars`	`scalars`	`sensor_time`	temporal	base	no
`cam0/color.mp4`	`/camera/cam0/video`	`AssetVideo`+`VideoFrameReference`	asset+refs	`video_time`	asset static, refs temporal	base	no
`calibration.json`	`/camera/cam0`	`Pinhole` (+`Transform3D` extrinsics)	`image_from_camera`	—	static	base	no
URDF + joints (computed)	`/robot/<link>`	`Transform3D`	translation/rotation	`sensor_time`	temporal	layer	no
`episode.json` operator	segment	—	—	—	—	—	yes
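One way to make the table reviewable by a machine as well as a human is to hold it as data and lint it before any conversion code runs. The sketch below is an assumption-laden illustration: MappingRow and check_row are hypothetical, the rules are a simplified reading of the decisions above, and mixed rows such as the video asset (static asset, temporal refs) would need to be split into two rows.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MappingRow:
    """Hypothetical record for one simplified row of the mapping table."""
    source: str
    entity_path: Optional[str]
    archetype: Optional[str]
    timeline: Optional[str]
    static: bool
    layer: bool
    is_property: bool


def check_row(row):
    """Return a list of problems; an empty list means the row is consistent."""
    problems = []
    if row.is_property:
        # Segment properties are per-episode metadata, not entity data.
        if row.entity_path or row.timeline:
            problems.append("segment property must not have an entity path or timeline")
        return problems
    if not row.entity_path or not row.entity_path.startswith("/"):
        # Assumption: paths are written with a leading slash, as in the table.
        problems.append("entity path must be given and start with '/'")
    if row.static and row.timeline:
        problems.append("static data must not name a timeline")
    if not row.static and not row.timeline:
        problems.append("temporal data needs a timeline")
    return problems


joint = MappingRow("mcap /joint_states", "/robot/<joint>", "Scalars",
                   "sensor_time", static=False, layer=False, is_property=False)
calib = MappingRow("calibration.json", "/camera/cam0", "Pinhole",
                   None, static=True, layer=False, is_property=False)
operator = MappingRow("episode.json operator", None, None,
                      None, static=False, layer=False, is_property=True)
bad = MappingRow("depth frames", "/camera/cam0/depth", "DepthImage",
                 "sensor_time", static=True, layer=False, is_property=False)

report = {r.source: check_row(r) for r in (joint, calib, operator, bad)}
```

Only the last row is flagged here, because it marks per-frame data as static while also naming a timeline.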

Patterns worth knowing

Transforms / FK trees: log a Transform3D per link entity; it relates to the parent path and composes down the tree. (Named CoordinateFrame + child_frame/parent_frame only when topology must be a flexible graph.)
Cameras: extrinsics (Transform3D) + intrinsics (Pinhole) on the camera entity, image/depth as children so they inherit the projection. DepthImage needs DepthMeter (units-per-value).
Video: AssetVideo (MP4, log static) + VideoFrameReference for per-frame timing, or VideoStream for raw H.264/H.265 samples.
Columnar ingest: for existing datasets use send_columns, not a per-row log loop. It adds no automatic timelines, so pass every timeline you want.
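To illustrate what "composes down the tree" means for per-link transforms, here is a translation-only sketch. It deliberately omits rotation and does not call the Rerun SDK; ancestors, world_translation, and the local offsets are hypothetical and exist only to show that each link's transform is relative to its parent path.

```python
def ancestors(path):
    """'/robot/arm/wrist' -> ['/robot', '/robot/arm', '/robot/arm/wrist']"""
    parts = [p for p in path.split("/") if p]
    return ["/" + "/".join(parts[: i + 1]) for i in range(len(parts))]


def world_translation(path, local):
    """Sum parent-relative translations from the root down to `path`.

    Translation-only simplification: a real transform also carries rotation,
    so composition would be a matrix product rather than a sum.
    """
    x = y = z = 0.0
    for p in ancestors(path):
        dx, dy, dz = local.get(p, (0.0, 0.0, 0.0))
        x, y, z = x + dx, y + dy, z + dz
    return (x, y, z)


# One parent-relative translation per link entity (illustrative values).
local = {
    "/robot": (1.0, 0.0, 0.0),
    "/robot/arm": (0.0, 0.5, 0.0),
    "/robot/arm/wrist": (0.0, 0.0, 0.25),
}
wrist = world_translation("/robot/arm/wrist", local)
```

Moving "/robot" moves every descendant with it, which is why each link gets its own entity and logs only its parent-relative transform.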

Gotchas that cause real failures

Component columns come back as ListArray in queries: unwrap them by indexing with [0] or [0][0] (indexing is 0-based in the DataFrame API, 1-based in SQL). See rerun-catalog-queries.
A layer must share the segment's recording_id, or it won't attach. application_id is discarded on registration.
send_columns/send_chunks add no log_time/log_tick; only the timelines you pass exist.
Static shadows all temporal data of that component for all time; static is overwritten in the viewer but every write stays on disk until rerun rrd optimize.
One Transform3D/Pinhole relation per frame pair; logging the same relation on a second entity is rejected.
Entity paths are not file paths (.. is meaningless, __ is reserved).
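A small pre-flight check for the entity-path gotcha might look like the sketch below. entity_path_problems is a hypothetical helper, not part of the SDK, and its reading of "reserved" (any path part that starts with "__") is an assumption; confirm the exact rule against the Rerun docs before relying on it.

```python
def entity_path_problems(path):
    """Flag path parts that behave differently from file-system paths."""
    problems = []
    for part in path.strip("/").split("/"):
        if part in (".", ".."):
            # Entity paths are never resolved relative to a parent.
            problems.append(f"'{part}' has no meaning in an entity path")
        elif part.startswith("__"):
            # Assumption: parts starting with a double underscore are reserved.
            problems.append(f"'{part}' uses the reserved '__' prefix")
    return problems


ok = entity_path_problems("/robot/arm/camera")
relative = entity_path_problems("/robot/../camera")
reserved = entity_path_problems("/__properties/operator")
```

Running this over the Entity path column of the mapping table catches both mistakes before any data is written.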

Verify before relying on these (not in the concept docs)

DynamicArchetype.columns(...) and inplace_compaction() are not in the concept docs; confirm against the installed package (this skill set was verified against rerun-sdk 0.34.0a1) or the API reference. The parquet loader is rerun.experimental.ParquetReader (see rerun-parquet).

More from this repository (rerun-io/trossen-oss)

rerun-blueprint

Design a Rerun blueprint from the data, then iterate on it from headless screenshots. Read this when laying out a recording or dataset in the viewer, designing a default blueprint, or deciding which views show which entities. Covers archetype-to-view mapping, layout reasoning, the rrb construction API, the contents grammar, and the screenshot loop.

2026-06-17

rerun-catalog-queries

Performance patterns and gotchas for querying a Rerun catalog from Python. Reach for this when a CatalogClient/dataset query is unexpectedly slow, or when shaping a per-segment / per-episode pipeline that hits the catalog from many places.

2026-06-17

rerun-chunk-processing

Core mechanics of the Rerun Chunk Processing API (rerun.experimental): LazyChunkStream pipelines, Chunk, lenses (MutateLens/DeriveLens/Selector), RrdReader, writing optimized RRDs. Read when building or reviewing any ingestion, conversion, or RRD preprocessing pipeline. Source-specific knowledge lives in the importer skills (rerun-mcap, rerun-urdf, rerun-parquet, rerun-lerobot); read rerun-data-model first to decide what the data should become.

2026-06-17

rerun-lerobot

Ingest a LeRobot (HuggingFace) dataset into Rerun. Read when converting a LeRobot dataset to RRDs, splitting it into per-episode segments, or registering it on a Rerun catalog. Covers the built-in directory importer (log_file_from_path), the RrdReader + send_chunks per-episode split, and when to drop to ParquetReader for custom control.

2026-06-17

rerun-mcap

Ingest MCAP files into Rerun chunk streams with rerun.experimental.McapReader. Read when converting an MCAP recording, selecting topics or decoders, fixing protobuf schemas that ship without compiled descriptors, or when an MCAP-derived stream comes out empty. Builds on rerun-chunk-processing (stream mechanics) and rerun-data-model (what the topics should become).

2026-06-17

rerun-parquet

Ingest tabular Parquet files into Rerun chunk streams with rerun.experimental.ParquetReader. Read when converting trajectory or sensor tables (LeRobot-style parquet, exported logs) into entities and components: column grouping, timeline/index columns, static columns, and ColumnRules that assemble typed components (Transform3D, Scalars) from flat columns. Builds on rerun-chunk-processing and rerun-data-model.

2026-06-17
