| name | abotclaw-progress-critic |
| description | Use a deployed VLAC-style vision-language-action critic service to evaluate task progress, compare current observations against a reference image, and judge task completion from robot camera frames. Use when the agent needs external progress supervision, completion verification, failure detection, or image-based task-state comparison for Piper, Unitree G1, or Unitree Go2. |
AbotClaw Progress Critic
Use this skill when a robot task needs an external judge instead of relying only on hand-written heuristics.
This skill is about using an already deployed critic service, not deploying the service.
What This Service Does
Given three inputs:
- a current frame
- a reference frame
- a task description
the VLAC critic compares the current frame with the reference frame in light of the task and returns a progress or completion judgment.
This is useful for:
- task progress estimation
- task completion verification
- detecting failed or unchanged task state
- deciding whether the robot should continue, retry, or stop
Service Contract
For the FastAPI service in this stack:
- endpoint: POST /critic
- required inputs: image, reference_image, task_description
Important:
- image and reference_image must be sent in the same request
- there is no separate reference-image cache/upload endpoint
Minimal Request Shape
{
  "image": "<base64_or_url_or_path>",
  "reference_image": "<base64_or_url_or_path>",
  "task_description": "..."
}
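A minimal Python sketch of assembling this request body. The helper names build_critic_payload and encode_image_file are hypothetical. Whether the service expects a bare base64 string or some other encoding is not stated here, so treat the encoding step as an assumption to verify against service.md.

```python
import base64
import json
from pathlib import Path


def encode_image_file(path):
    """Read an image file and return its bytes as a base64 ASCII string.

    Assumption: the service accepts plain base64 text for image fields.
    """
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


def build_critic_payload(image, reference_image, task_description):
    """Assemble the three required fields into a single request body."""
    fields = {
        "image": image,
        "reference_image": reference_image,
        "task_description": task_description,
    }
    # All three inputs travel in the same request, so reject any empty one early.
    missing = [name for name, value in fields.items() if not str(value or "").strip()]
    if missing:
        raise ValueError("missing required critic inputs: " + ", ".join(missing))
    return fields


payload = build_critic_payload(
    "<current_frame>",
    "<reference_frame>",
    "Place the bottle upright on the tray.",
)
print(json.dumps(payload, indent=2))
```

Validating before sending keeps a missing reference image from turning into a confusing service-side error.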
Responsibility Boundary
This skill owns:
- when to use the critic
- how to call /critic
- how to prepare current image + reference image + task description
- how to use critic results to decide continue / retry / stop
This skill does not own robot SDK discovery. Use abotclaw-sdk-discovery to learn how each robot provides camera frames.
When to Use It
Use the critic when:
- the robot needs a visual completion check
- hand-authored success conditions are unreliable
- the user asks "is it done?" or "did that succeed?"
- a task needs step-wise supervision from images
- you want to compare current state against a known target state
Standard Workflow
- Use abotclaw-sdk-discovery to learn how the target robot exposes camera frames.
- Capture a current frame from Piper, G1, or Go2.
- Obtain a reference image that represents the desired or comparison state.
- Write a task description that matches the intended goal.
- Call the critic service with all three inputs in one request.
- Interpret the response as supervision for task control.
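The workflow above can be sketched as a bounded check loop. This is an illustration only: capture_frame, call_critic, and interpret are hypothetical callables that the caller supplies (a camera accessor learned through abotclaw-sdk-discovery, an HTTP client for /critic, and a policy that maps a critic response to an action string). None of these names come from the service itself.

```python
def supervise(capture_frame, reference_image, task_description,
              call_critic, interpret, max_checks=5):
    """Repeat capture -> critic -> interpret until the action is not "continue".

    All callables are injected by the caller (hypothetical interfaces):
      capture_frame()      -> current image (base64, URL, or path)
      call_critic(payload) -> decoded critic response
      interpret(result)    -> "continue", "retry", or "stop"
    Returns the list of (action, result) pairs, one per check.
    """
    history = []
    for _ in range(max_checks):
        # Every request carries all three inputs together.
        payload = {
            "image": capture_frame(),
            "reference_image": reference_image,
            "task_description": task_description,
        }
        result = call_critic(payload)
        action = interpret(result)
        history.append((action, result))
        if action != "continue":
            break
    return history
```

Capping the loop with max_checks keeps a stuck task from polling the critic forever. If the last recorded action is still "continue", the caller should treat that as a timeout and replan or ask for help.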
Current Image Sources
The current observation can come from any robot in the fleet:
- Piper camera
- G1 camera
- Go2 camera
Choose the camera that best reflects task progress.
Examples:
- Piper wrist or workcell camera for tabletop manipulation
- G1 head or chest camera for humanoid interaction tasks
- Go2 forward camera for inspection or navigation-adjacent tasks
Reference Image Sources
A reference image may come from:
- a successful prior run
- a user-provided target image
- a recorded frame from a known-good final state
- a memory/evidence image from abotclaw-memory
Example Request
Get the correct host and port from service.md.
curl -s -X POST <VLAC_BASE_URL>/critic \
  -H 'Content-Type: application/json' \
  -d '{
    "image": "<current_frame>",
    "reference_image": "<reference_frame>",
    "task_description": "Put the bowl back into the white storage box."
  }'
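The same request can be issued from Python with only the standard library. This is a sketch under assumptions: the function names make_critic_request and call_critic are hypothetical, and the response is assumed to be a JSON body, which this document does not specify.

```python
import json
import urllib.request


def make_critic_request(base_url, payload):
    """Build (but do not send) a POST request for the /critic endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/critic",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_critic(base_url, payload, timeout_s=10.0):
    """Send the request and decode the reply.

    Assumption: the service answers with a JSON body.
    """
    request = make_critic_request(base_url, payload)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))
```

Pass the base URL built from the host and port in service.md as base_url. The timeout is there so that a slow or unreachable critic does not stall the robot's control loop.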
Input Preparation Rules
- image: use the robot's current frame.
- reference_image: use an image that represents the expected target state or a meaningful comparison state.
- task_description: keep it concrete and visual.
Better:
- "Put the bowl back into the white storage box."
- "Place the bottle upright on the tray."
Worse:
- "Do the task correctly."
- "Finish it."
How to Use the Result
Treat the critic output as task supervision, not absolute truth.
Possible uses:
- if the critic indicates strong completion -> stop or hand back success
- if the critic indicates partial progress -> continue
- if the critic indicates failure or no change -> retry, replan, or ask for help
- if the critic disagrees with sensor heuristics -> inspect evidence before acting further
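One way to encode these rules is a small decision function. The response schema is not documented here, so this sketch assumes the caller has already reduced the critic output to a single progress score in the range 0 to 1. The function name next_action, the score itself, and both thresholds are hypothetical and should be tuned per task.

```python
def next_action(progress, previous_progress, heuristic_done=None,
                done_threshold=0.9, min_gain=0.02):
    """Map a critic progress score to "stop", "continue", "retry", or "inspect".

    progress / previous_progress: assumed scores in [0, 1] from the current
    and the previous critic check (hypothetical; not a documented field).
    heuristic_done: optional verdict from sensor heuristics, or None.
    """
    critic_done = progress >= done_threshold
    # Critic and sensor heuristics disagree: look at the evidence first.
    if heuristic_done is not None and heuristic_done != critic_done:
        return "inspect"
    if critic_done:
        return "stop"
    # Measurable improvement since the last check counts as partial progress.
    if progress - previous_progress >= min_gain:
        return "continue"
    # No change or regression: retry, replan, or ask for help.
    return "retry"


print(next_action(0.95, 0.50))
print(next_action(0.50, 0.30))
print(next_action(0.30, 0.30))
print(next_action(0.95, 0.50, heuristic_done=False))
```

The four calls exercise the stop, continue, retry, and inspect branches in that order. The disagreement check runs first so that a confident critic cannot override conflicting sensor evidence without a review.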
Multi-Robot Usage Pattern
The current image does not have to come from the robot that originally planned the task.
Examples:
- Go2 scouts a scene, then the critic evaluates whether the inspected target matches the expected state
- Piper manipulates an object, and its current frame is checked against a reference finish state
- G1 performs a human-environment interaction task, and the critic judges completion from G1's current view
Integration with Memory
This skill works well with abotclaw-memory:
- use memory to retrieve a prior successful evidence image as reference_image
- use the critic to compare the current scene against the remembered success state
- use critic output plus memory result to decide whether to navigate, manipulate, or stop
Behavioral Rule
Use the critic to reduce ambiguity in real-world execution, especially when success is easier to see than to hand-code.
Do not pretend the critic is a robot controller. It is a supervisor and evaluator.