| name | marimo |
| description | Create reactive Python notebooks for IMSA racing data analysis using marimo. Use for building interactive filtering UIs (seasons, classes, events), connecting to DuckDB databases, creating reactive visualizations, and performing data analysis with automatic cell re-execution. Includes templates, patterns, and IMSA-specific workflows. |
Marimo Notebooks for IMSA Data Analysis
Purpose
Create interactive, reactive marimo notebooks for analyzing IMSA racing data with:
- Interactive filtering UIs for seasons, classes, events, and sessions
- Reactive programming where cells auto-update when filters change
- DuckDB integration for SQL-based data analysis
- Visualization dashboards with Altair charts and tables
- Git-friendly notebooks stored as pure Python files
Key Concepts
Reactive Execution
marimo automatically runs dependent cells when variables change. When you interact with a UI element (e.g., change a dropdown), all cells referencing that element automatically re-run with the new value.
season = mo.ui.dropdown(options=[2023, 2024, 2025], value=2025)
data = load_season_data(season.value)
chart = create_visualization(data)
Critical Rules
- Assign UI elements to global variables - marimo can only synchronize elements assigned to global variables
- Reference the
.value attribute - Access UI element values via element.value
- No hidden state - Delete a cell and marimo deletes its variables
- Pure Python files - Notebooks are
.py files, not JSON
Quick Start
Directory
Work in ./outputs/reports folder, create it if necissary
Installation
cd ./output/reports && uv init
cd ./output/reports && uv add marimo openai pandas ...
Create New Notebook
cd ./output/reports && uv run marimo edit imsa_analysis.py
Use Template
Copy the template from assets/imsa_analysis_template.py as a starting point for IMSA analysis.
Building IMSA Analysis Notebooks
Step 1: Import and Connect
import marimo as mo
@app.cell
def __():
import marimo as mo
import duckdb
import pandas as pd
import altair as alt
return alt, duckdb, mo, pd
@app.cell
def __(duckdb):
conn = duckdb.connect("../imsa.duckdb", read_only=True)
return conn,
Step 2: Load Filter Options
@app.cell
def __(conn):
seasons_df = conn.execute("""
SELECT DISTINCT season FROM seasons
WHERE session = 'race'
ORDER BY season DESC
""").df()
events_df = conn.execute("""
SELECT DISTINCT event FROM seasons
WHERE session = 'race'
ORDER BY event
""").df()
classes_df = conn.execute("""
SELECT DISTINCT class FROM laps
WHERE class IS NOT NULL
ORDER BY class
""").df()
return seasons_df, events_df, classes_df
Step 3: Create Interactive Filters
@app.cell
def __(mo, seasons_df, events_df, classes_df):
season_filter = mo.ui.dropdown(
options=seasons_df['season'].tolist(),
value=seasons_df['season'].iloc[0],
label="Season"
)
event_filter = mo.ui.dropdown(
options=events_df['event'].tolist(),
value=events_df['event'].iloc[0],
label="Event"
)
class_filter = mo.ui.dropdown(
options=classes_df['class'].tolist(),
value=classes_df['class'].iloc[0],
label="Class"
)
mo.hstack([
mo.vstack([mo.md("**Season**"), season_filter]),
mo.vstack([mo.md("**Event**"), event_filter]),
mo.vstack([mo.md("**Class**"), class_filter])
], justify="start")
return season_filter, event_filter, class_filter
Step 4: Get Session ID (Critical for IMSA)
@app.cell
def __(conn, season_filter, event_filter, mo):
session_query = f"""
SELECT session_id, start_date, session
FROM seasons
WHERE season = {season_filter.value}
AND event = '{event_filter.value}'
AND session = 'race'
LIMIT 1
"""
session_result = conn.execute(session_query).df()
if len(session_result) > 0:
session_id = session_result['session_id'].iloc[0]
mo.md(f"**Session ID**: {session_id}")
else:
mo.md("⚠️ No race session found")
session_id = None
return session_id, session_result
Step 5: Query and Analyze Data
@app.cell
def __(conn, session_id, class_filter, mo):
if session_id:
query = f"""
SELECT
driver_name,
car_number,
lap_time,
lap_number,
bpillar_quartile
FROM laps
WHERE session_id = {session_id}
AND class = '{class_filter.value}'
AND bpillar_quartile IN (1, 2) -- Top 50% clean laps
AND flags = 'GF' -- Green flag only
ORDER BY lap_time
LIMIT 100
"""
laps_df = conn.execute(query).df()
if len(laps_df) > 0:
mo.md(f"Loaded {len(laps_df)} laps")
else:
mo.md("⚠️ No data found")
else:
laps_df = None
return laps_df,
Step 6: Visualize Results
@app.cell
def __(laps_df, alt, mo):
if laps_df is not None and len(laps_df) > 0:
chart = alt.Chart(laps_df).mark_boxplot().encode(
x=alt.X('driver_name:N', title='Driver', sort='-y'),
y=alt.Y('lap_time:Q', title='Lap Time (seconds)'),
color='driver_name:N'
).properties(
width=800,
height=400,
title='Lap Time Distribution'
)
chart
else:
mo.md("No data to visualize")
return chart,
IMSA-Specific Requirements
Always Use session_id
WHERE session_id = {session_id}
AND class = '{class_filter.value}'
WHERE season = 2025
Always Default to Race Sessions
WHERE session = 'race'
WHERE session = 'practice'
Always Use BPillar Filtering for Races
WHERE bpillar_quartile IN (1, 2) -- Top 50% of clean laps
AND flags = 'GF' -- Green flag
-- No quartile filtering
Never Compare Across Classes
WHERE session_id = {session_id} AND class = 'GTP'
WHERE session_id = {session_id} -- Missing class filter
Common UI Patterns
Multi-Select for Classes
class_selector = mo.ui.multiselect(
options=["GTP", "LMP2", "GTD"],
value=["GTP"],
label="Select Classes"
)
classes_str = "', '".join(class_selector.value)
query = f"WHERE class IN ('{classes_str}')"
Slider for Top N Selection
top_n = mo.ui.slider(
start=5,
stop=50,
step=5,
value=10,
label="Top N Drivers",
show_value=True
)
query = f"... ORDER BY lap_time LIMIT {top_n.value}"
Form for Expensive Operations
analysis_params = mo.md("""
### Configure Analysis
- Minimum laps: {min_laps}
- Include practice: {include_practice}
""").batch(
min_laps=mo.ui.slider(5, 50, value=10),
include_practice=mo.ui.checkbox(value=False)
).form()
if analysis_params.value:
results = run_expensive_analysis(**analysis_params.value)
Conditional Display with mo.stop
mo.stop(
session_id is None,
mo.md("⚠️ Please select a valid session")
)
mo.stop(
laps_df is None or len(laps_df) == 0,
mo.md("⚠️ No data available for analysis")
)
Performance Optimization
Filter in SQL, Not Python
query = f"""
SELECT * FROM laps
WHERE session_id = {session_id}
AND class = '{class}'
AND bpillar_quartile IN (1, 2)
LIMIT 1000
"""
df = conn.execute(query).df()
df = conn.execute("SELECT * FROM laps").df()
df = df[df['session_id'] == session_id]
Limit Data for Visualization
query = f"""
SELECT * FROM laps
WHERE session_id = {session_id}
ORDER BY RANDOM()
LIMIT 1000 -- Sample for performance
"""
Visualization Patterns
Interactive Altair Charts
selection = alt.selection_point(fields=['driver_name'])
chart = alt.Chart(laps_df).mark_circle().encode(
x='lap_number:Q',
y='lap_time:Q',
color=alt.condition(selection, 'driver_name:N', alt.value('lightgray')),
tooltip=['driver_name', 'lap_time', 'lap_number']
).add_params(selection).properties(
width=800,
height=400
)
Interactive Tables
table = mo.ui.table(
laps_df,
selection='multi',
page_size=20
)
table
selected_rows = table.value
Dashboard Layout
tabs = mo.ui.tabs({
"Lap Times": lap_time_analysis,
"Consistency": consistency_metrics,
"Pit Strategy": pit_analysis,
"Weather Impact": weather_correlation
})
tabs
Reference Materials
Detailed Patterns
See references/marimo_patterns.md for comprehensive patterns including:
- Advanced UI element configurations
- DuckDB integration techniques
- Reactive design patterns
- Data visualization examples
- IMSA-specific query patterns
- Performance optimization tips
Deeper docs
See references/agent-docs.md are officially written docs for running a marimo agent. It's excellent and you should read it.
Template Notebook
See assets/imsa_analysis_template.py for a complete working example that demonstrates:
- Database connection setup
- Interactive filter creation
- Session ID retrieval
- Proper IMSA data filtering
- Visualization and analysis
- Error handling
Running Notebooks
Development Mode
cd ./output/reports && uv run marimo edit notebook.py
App Mode
cd ./output/reports && uv run marimo run notebook.py
cd ./output/reports && uv run marimo run --include-code notebook.py
cd ./output/reports && uv run marimo run --port 8080 notebook.py
Script Mode
cd ./output/reports && uv run python notebook.py
cd ./output/reports && uv run python notebook.py --season 2025 --event Sebring
Best Practices
1. One Cell, One Purpose
Each cell should have a single, clear purpose:
- Cell 1: Imports
- Cell 2: Database connection
- Cell 3: Load filter options
- Cell 4: Create UI filters
- Cell 5: Get session_id
- Cell 6: Query data
- Cell 7: Visualize
2. Validate Inputs
Always check if data exists before processing:
if session_id and laps_df is not None and len(laps_df) > 0:
else:
mo.md("⚠️ No data available")
3. Use Descriptive Variable Names
season_filter = mo.ui.dropdown(...)
laps_df = conn.execute(query).df()
s = mo.ui.dropdown(...)
df = conn.execute(query).df()
4. Add User Feedback
with mo.status.spinner(title="Loading data..."):
df = load_data()
mo.md(f"✅ Loaded {len(df)} laps")
if len(df) < 100:
mo.md("⚠️ Small sample size")
5. Document Complex Queries
query = f"""
SELECT * FROM laps
WHERE session_id = {session_id} -- Single session for comparison
AND class = '{class}' -- Each class analyzed separately
AND bpillar_quartile IN (1, 2) -- Top 50% clean laps (excludes pit stops)
AND flags = 'GF' -- Green flag laps only
"""
Troubleshooting
UI Element Not Updating Other Cells
- Cause: Element not assigned to global variable
- Fix: Ensure
element = mo.ui.dropdown(...) at module level
Cell Not Re-Running on Change
- Cause: Cell doesn't reference the changed variable
- Fix: Check that cell reads
element.value somewhere
"No module named marimo"
- Cause: marimo not installed in current environment
- Fix:
pip install "marimo[sql]"
Database Connection Error
- Cause: Wrong path or database doesn't exist
- Fix: Check path in
duckdb.connect("path/to/imsa.duckdb")
Empty Results
- Cause: Filters too restrictive or no data for selection
- Fix: Check if session exists, validate filter values