| name | 3dgs-mcp-renderer |
| description | MCP protocol integration with 3DGS rendering pipeline: Agent-controlled Three.js/WebGPU rendering, voice-driven scene reconstruction, real-time parameter manipulation. Prototype for Agent↔3DGS interaction. |
| when_to_use | MCP rendering, agent-controlled 3DGS, voice-driven reconstruction, real-time 3DGS editing, Three.js 3DGS, WebGPU Gaussian splatting, interactive rendering control, speech-to-3D |
| version | 0.1.0 |
| author | jaccen |
| tags | ["mcp","3dgs","gaussian-splatting","rendering","three.js","webgpu","voice","agent","interactive"] |
| disable-model-invocation | true |
| user-invocable | true |
3DGS MCP Renderer — Agent-3DGS Interaction via MCP Protocol
Prototype specification for integrating MCP (Model Context Protocol) with 3DGS rendering pipelines, enabling AI Agents to directly manipulate Three.js/3DGS rendering parameters and achieve voice-driven 3D scene reconstruction.
Architecture
┌─────────────┐     ┌─────────────┐     ┌──────────────────┐     ┌────────────┐
│ Voice/Text  │────▶│   Agent     │────▶│   MCP Server     │────▶│   3DGS     │
│ (Whisper/   │     │ (Claude/    │     │   (Node.js/      │     │  Renderer  │
│  Prompt)    │     │  TeleClaw)  │     │    Python)       │     │ (Three.js  │
│             │◀────│             │◀────│                  │◀────│  /WebGPU)  │
└─────────────┘     └─────────────┘     └──────────────────┘     └────────────┘
                           │                     │                     │
                           │ Tool calls          │ WebSocket/HTTP      │ WebGL
                           │ (MCP protocol)      │ transport           │ Rendering
MCP Tools Specification
Tool 1: import_scene
{
  "name": "import_scene",
  "description": "Load a 3DGS scene (PLY, SPLAT, SPZ, or KSPLAT) from a file path or URL into the renderer",
  "inputSchema": {
    "type": "object",
    "properties": {
      "source": { "type": "string", "description": "File path or URL of the scene file" },
      "format": { "enum": ["ply", "splat", "spz", "ksplat"], "description": "File format" }
    },
    "required": ["source"]
  },
  "outputSchema": {
    "type": "object",
    "properties": {
      "scene_id": { "type": "string" },
      "gaussian_count": { "type": "number" },
      "bbox": { "type": "object" }
    }
  }
}
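Because `format` is optional in the schema above, a server has to decide what to do when it is omitted. The following is a minimal sketch, assuming the server falls back to the file extension of `source`; the helper name `resolve_format` is hypothetical and not part of the specification.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Formats listed in the import_scene input schema.
SUPPORTED_FORMATS = ("ply", "splat", "spz", "ksplat")


def resolve_format(source: str, fmt: Optional[str] = None) -> str:
    """Return the scene format, preferring the explicit argument over the extension.

    Hypothetical helper: the fallback to the file extension is an assumption.
    """
    if fmt is not None:
        if fmt not in SUPPORTED_FORMATS:
            raise ValueError(f"unsupported format: {fmt!r}")
        return fmt
    # urlparse keeps a plain relative path in .path, so URLs and paths both work.
    suffix = PurePosixPath(urlparse(source).path).suffix.lower().lstrip(".")
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"cannot infer format from {source!r}; pass 'format' explicitly")
    return suffix
```

Rejecting an unknown extension early keeps a failed load from leaving a half-initialized scene in the renderer.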
Tool 2: set_camera
{
  "name": "set_camera",
  "description": "Set camera position, target, and field of view",
  "inputSchema": {
    "type": "object",
    "properties": {
      "position": { "type": "array", "items": {"type": "number"}, "description": "[x, y, z]" },
      "target": { "type": "array", "items": {"type": "number"}, "description": "[x, y, z] look-at point" },
      "fov": { "type": "number", "description": "Field of view in degrees" },
      "up": { "type": "array", "items": {"type": "number"}, "description": "[x, y, z] up vector" }
    },
    "required": ["position", "target"]
  }
}
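Camera arguments can be checked before they reach the renderer. The sketch below is one possible validation, assuming a default up vector of `[0, 1, 0]` and a default `fov` of 60 degrees (neither is stated in the schema); `validate_camera` is a hypothetical name. It rejects a camera whose up vector is parallel to the view direction, which is what happens when looking straight down the y axis with the default up.

```python
import math


def _cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def _unit(v, what, eps):
    """Normalize v, raising ValueError if it is (nearly) zero length."""
    length = math.sqrt(sum(c * c for c in v))
    if length < eps:
        raise ValueError(f"{what} has zero length")
    return [c / length for c in v]


def validate_camera(position, target, up=(0.0, 1.0, 0.0), fov=60.0, eps=1e-6):
    """Return forward/right vectors for a look-at camera, or raise ValueError.

    The defaults for `up` and `fov` are assumptions made for this sketch.
    """
    for name, vec in (("position", position), ("target", target), ("up", up)):
        if len(vec) != 3 or not all(math.isfinite(c) for c in vec):
            raise ValueError(f"{name} must be three finite numbers")
    if not 0.0 < fov < 180.0:
        raise ValueError("fov must be between 0 and 180 degrees")
    forward = _unit([target[i] - position[i] for i in range(3)], "view direction", eps)
    right = _cross(forward, _unit(up, "up vector", eps))
    if math.sqrt(sum(c * c for c in right)) < eps:
        raise ValueError("up vector is parallel to the view direction")
    return {"forward": forward, "right": right, "fov": fov}
```

For example, `validate_camera([0, 10, 0], [0, 0, 0])` raises, while passing `up=[0, 0, -1]` succeeds.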
Tool 3: modify_gaussians
{
  "name": "modify_gaussians",
  "description": "Modify properties of Gaussians by selection criteria",
  "inputSchema": {
    "type": "object",
    "properties": {
      "select": {
        "type": "object",
        "properties": {
          "ids": { "type": "array", "items": {"type": "integer"}, "description": "Specific Gaussian IDs" },
          "region": {
            "type": "object",
            "properties": {
              "center": { "type": "array", "items": {"type": "number"}, "description": "[x, y, z]" },
              "radius": { "type": "number" }
            },
            "description": "Sphere selection"
          },
          "label": { "type": "string", "description": "Semantic label from segmentation" }
        }
      },
      "operations": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "property": { "enum": ["opacity", "color", "position", "scale", "rotation"] },
            "action": { "enum": ["set", "add", "multiply"] },
            "value": {}
          }
        }
      }
    },
    "required": ["select", "operations"]
  }
}
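To illustrate how a `region` selection and an `operations` entry could be applied in memory, here is a minimal sketch for a scalar property. It assumes Gaussian centers are held as a list of `[x, y, z]` lists and that opacity is stored as an activated value in [0, 1]; both are assumptions, and the function names are hypothetical.

```python
import math


def select_in_sphere(positions, center, radius):
    """Return the indices of Gaussians whose centers lie inside the sphere."""
    return [i for i, p in enumerate(positions) if math.dist(p, center) <= radius]


def apply_operation(values, indices, action, value):
    """Apply a set/add/multiply action to scalar values at the given indices, in place."""
    actions = {
        "set": lambda old: value,
        "add": lambda old: old + value,
        "multiply": lambda old: old * value,
    }
    if action not in actions:
        raise ValueError(f"unknown action: {action!r}")
    for i in indices:
        values[i] = actions[action](values[i])


# Usage: fade everything within 1.5 units of the origin to 20% of its opacity.
positions = [[0, 0, 0], [1, 0, 0], [5, 0, 0]]
opacity = [1.0, 0.5, 0.8]
selected = select_in_sphere(positions, [0, 0, 0], 1.5)
apply_operation(opacity, selected, "multiply", 0.2)
```

Vector properties such as `color` or `position` would need a per-component variant of the same loop.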
Tool 4: render_frame
{
  "name": "render_frame",
  "description": "Render the current scene from the current camera and return it as an image",
  "inputSchema": {
    "type": "object",
    "properties": {
      "width": { "type": "integer", "default": 1920 },
      "height": { "type": "integer", "default": 1080 },
      "format": { "enum": ["png", "jpeg", "webp"], "default": "png" },
      "background": { "type": "string", "default": "#000000" }
    }
  },
  "outputSchema": {
    "type": "object",
    "properties": {
      "image": { "type": "string", "description": "Base64-encoded image data" },
      "render_time_ms": { "type": "number" }
    }
  }
}
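The output of `render_frame` pairs a base64 image with a timing value. The sketch below shows one way to assemble that result, assuming the actual rendering is delegated to a callable that returns encoded image bytes; `render_fn` and `render_frame_response` are hypothetical names, not part of any renderer's API.

```python
import base64
import time

ALLOWED_FORMATS = ("png", "jpeg", "webp")


def render_frame_response(render_fn, width=1920, height=1080, fmt="png",
                          background="#000000"):
    """Call a renderer and wrap its bytes in the render_frame output shape.

    `render_fn(width, height, fmt, background) -> bytes` is a stand-in for
    whatever the real renderer exposes.
    """
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported image format: {fmt!r}")
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    start = time.perf_counter()
    image_bytes = render_fn(width, height, fmt, background)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "render_time_ms": round(elapsed_ms, 3),
    }
```

Timing only the renderer call keeps `render_time_ms` free of base64 encoding overhead, which can be noticeable for large frames.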
Tool 5: query_scene
{
  "name": "query_scene",
  "description": "Query scene information: statistics, geometry, semantics",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query_type": { "enum": ["stats", "bbox", "gaussian_at_point", "segmentation", "materials"] },
      "point": { "type": "array", "items": {"type": "number"}, "description": "[x, y, z] for point queries" }
    },
    "required": ["query_type"]
  }
}
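Three of the query types can be answered from Gaussian centers alone. The following sketch assumes centers are stored as a list of `[x, y, z]` lists and interprets `gaussian_at_point` as "nearest center to the point", which is an assumption; the result field names are also illustrative. `segmentation` and `materials` depend on data this sketch does not model.

```python
import math


def query_scene(positions, query_type, point=None):
    """Answer stats, bbox, and gaussian_at_point queries over Gaussian centers."""
    if query_type == "stats":
        return {"gaussian_count": len(positions)}
    if not positions:
        raise ValueError("scene is empty")
    if query_type == "bbox":
        # Axis-aligned bounds of the centers (ignores each Gaussian's extent).
        return {
            "min": [min(p[axis] for p in positions) for axis in range(3)],
            "max": [max(p[axis] for p in positions) for axis in range(3)],
        }
    if query_type == "gaussian_at_point":
        if point is None:
            raise ValueError("'point' is required for gaussian_at_point")
        nearest = min(range(len(positions)), key=lambda i: math.dist(positions[i], point))
        return {"id": nearest, "distance": math.dist(positions[nearest], point)}
    raise NotImplementedError(f"query_type {query_type!r} is not covered by this sketch")
```

A linear scan is adequate for a sketch; scenes with millions of Gaussians would likely need a spatial index for point queries.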
Voice-Driven Reconstruction Flow
User: "Show me the scene from above"
│
▼
Whisper STT ──▶ Text: "Show me the scene from above"
│
▼
Agent (Claude/TeleClaw) interprets:
- Intent: Change camera to bird's-eye view
- Parameters: position=[0, 10, 0], target=[0, 0, 0], up=[0, 0, -1]
│
▼
MCP tool call: set_camera(position=[0, 10, 0], target=[0, 0, 0], up=[0, 0, -1])
│
▼
MCP tool call: render_frame(width=1920, height=1080)
│
▼
Agent receives base64 image, verifies, reports to user
User: "Make the left wall transparent"
│
▼
Agent:
1. query_scene(query_type="segmentation") → find "left wall" label
2. modify_gaussians(select={label: "left wall"}, operations=[{property: "opacity", action: "multiply", value: 0.2}])
3. render_frame() → verify visual result
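On the wire, each step in these flows is a JSON-RPC 2.0 request with the method `tools/call`, as defined by MCP. The sketch below builds the two requests for the bird's-eye example; the `tool_call` helper and the sequential id scheme are illustrative assumptions, and the transport is left out.

```python
import json
from itertools import count

# Sequential request ids; any unique id scheme would do.
_request_ids = count(1)


def tool_call(name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "Show me the scene from above": move the camera, then render to verify.
requests = [
    tool_call("set_camera", {"position": [0, 10, 0], "target": [0, 0, 0], "up": [0, 0, -1]}),
    tool_call("render_frame", {"width": 1920, "height": 1080}),
]

for request in requests:
    print(json.dumps(request))
```

The `up` vector is passed explicitly because a top-down view along the y axis is degenerate with a y-up default.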
Implementation Stack
| Component | Technology | Status |
|---|---|---|
| MCP Server | Node.js + @modelcontextprotocol/sdk | Prototype |
| 3DGS Renderer | Three.js + gaussian-splat-3d / gsplat.js | Available |
| WebGPU backend | WebGPU + WGSL compute shaders | Experimental |
| Transport | WebSocket (localhost) | Working |
| Voice STT | Whisper API / Web Speech API | Available |
| Agent integration | Claude Code / TeleClaw MCP client | Pending |
Current Renderer Compatibility
| Renderer | Format | WebGPU | MCP-Ready | Stars |
|---|---|---|---|---|
| gsplat.js | .ply/.splat | Yes | Needs adapter | — |
| GaussianSplats3D | .ply | WebGL | Needs adapter | — |
| viser/nerfstudio | .ply | WebGL | Partial | — |
| PlayCanvas | .ply | Yes | Needs adapter | — |
| brush (Rust/WebGPU) | .ply | Yes | Closest | 4.3k |
Known Limitations
- Latency: Large scenes (>1M Gaussians) require progressive loading; a render_frame call may take 100-500 ms
- Selection precision: Sphere- and label-based Gaussian selection may miss thin structures; ray-picking is needed for those cases
- State management: The MCP server must maintain scene state across tool calls; there is no built-in undo
- GPU memory: WebGL and WebGPU contexts share GPU memory with the rest of the browser; scenes larger than 2 GB cannot be loaded on most devices
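To put the Gaussian-count and memory figures above in relation, here is a rough size estimate. It assumes the uncompressed float32 layout commonly written to 3DGS PLY files (position, normals, spherical-harmonic color coefficients, opacity, scale, rotation quaternion); actual renderers usually repack or compress this data, so the numbers are an upper-bound sketch rather than a measurement.

```python
def floats_per_gaussian(sh_degree=3, with_normals=True):
    """Count float attributes per Gaussian in the assumed uncompressed layout."""
    sh_coefficients = 3 * (sh_degree + 1) ** 2  # RGB coefficients, DC term included
    normals = 3 if with_normals else 0
    # position (3) + normals + SH color + opacity (1) + scale (3) + rotation (4)
    return 3 + normals + sh_coefficients + 1 + 3 + 4


def estimate_bytes(gaussian_count, sh_degree=3, bytes_per_float=4):
    """Estimate the raw attribute size of a scene in bytes."""
    return gaussian_count * floats_per_gaussian(sh_degree) * bytes_per_float


def fits_in_budget(gaussian_count, budget_bytes=2 * 1024**3, sh_degree=3):
    """Check the estimate against a memory budget (default: 2 GiB)."""
    return estimate_bytes(gaussian_count, sh_degree) <= budget_bytes
```

Under these assumptions one million Gaussians occupy about 248 MB, so a 2 GiB budget is reached at roughly 8.6 million Gaussians.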
Roadmap
Rules
- Never modify original PLY files: All operations are in-memory only; export requires explicit user command
- Validate before render: Always verify camera parameters and Gaussian bounds before rendering
- Respect GPU limits: Check available VRAM before loading large scenes; provide downsampling option
- Report rendering time: Always include render_time_ms in render_frame output for performance monitoring
- Safety gate: Operations affecting >10% of Gaussians require explicit user confirmation
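The safety gate in the last rule reduces to a ratio check before `modify_gaussians` is applied. A minimal sketch, with `requires_confirmation` as a hypothetical helper name and the 10% threshold taken from the rule:

```python
def requires_confirmation(selected_count, total_count, threshold=0.10):
    """Return True when an edit touches more than `threshold` of all Gaussians."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    if not 0 <= selected_count <= total_count:
        raise ValueError("selected_count must be between 0 and total_count")
    return selected_count / total_count > threshold
```

For a scene of 1,000,000 Gaussians, selecting 100,000 passes without a prompt, while selecting 100,001 requires confirmation.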
Part of Awesome-Gaussian-Skills