
update-rubric

Name: Update Rubric
Author: harbor-framework

// Propose new rubric criteria based on review findings — creates a branch and PR


stars:208

forks:260

updated:April 10, 2026 at 18:38

SKILL.md


name	update-rubric
description	Propose new rubric criteria based on review findings — creates a branch and PR
allowed-tools	Bash, Read, Write, Edit, Agent
argument-hint	<criteria-description>

Propose new criteria for the task implementation rubric. This is typically invoked after /review-task identifies general patterns that should be caught for all future tasks.

Run this from within your local clone of the benchmark repo.

Step 1: Read Current Rubric

Detect the repo root and read the current rubric to understand the format:

REPO_ROOT=$(git rev-parse --show-toplevel)
cat "$REPO_ROOT/rubrics/task-implementation.toml"

Each criterion follows this TOML format:

[[criteria]]
name = "criterion_name"
description = "One-line description"
guidance = """
Multi-line guidance explaining what to check.

PASS if ... FAIL if ..."""
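To see which criteria already exist before drafting new ones, the names can be pulled out of the rubric file. This is a minimal sketch, not part of the skill itself: `list_criteria_names` is a hypothetical helper, and it assumes every entry carries a `name = "..."` line at the start of a line, as in the format above. A line of guidance text that happens to match the same pattern would also be printed.

```shell
# Hypothetical helper: print the name of every [[criteria]] entry in a rubric file.
# Assumes each entry has a `name = "..."` line at the start of a line.
list_criteria_names() {
  sed -n 's/^name = "\(.*\)"$/\1/p' "$1"
}
```

It could be called as `list_criteria_names "$REPO_ROOT/rubrics/task-implementation.toml"`.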

Step 2: Accept Criteria Descriptions

Parse $ARGUMENTS for the criteria to add. If invoked from /review-task, the rubric improvement candidates will be described in the review summary. If invoked standalone, the user provides a description of what to add.

For each candidate, draft a new [[criteria]] entry matching the existing format:

name: snake_case, concise
description: one-line summary
guidance: detailed explanation with PASS/FAIL conditions

Present the drafted criteria to the user for review before proceeding.
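Before presenting a draft, it may help to check that the proposed name is snake_case and does not collide with an existing criterion. The sketch below is one possible check, with both function names invented for illustration; the skill does not prescribe it.

```shell
# Hypothetical helper: succeed only if $1 is lowercase snake_case
# (starts with a letter, no leading, trailing, or doubled underscores).
is_snake_case() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z][a-z0-9]*(_[a-z0-9]+)*$'
}

# Hypothetical helper: succeed if rubric file $1 already has a criterion named $2.
criterion_exists() {
  grep -Fxq "name = \"$2\"" "$1"
}
```

A draft named `my_new_rule` (an illustrative name) would pass `is_snake_case my_new_rule`, and `criterion_exists "$REPO_ROOT/rubrics/task-implementation.toml" my_new_rule` would report whether it is already taken.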

Step 3: Create Branch and Apply Changes

Detect the GitHub remote and create a branch:

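# Resolve the repo root so the commands work from any subdirectory.
REPO_ROOT=$(git rev-parse --show-toplevel)
# Reduce the origin URL (SSH or HTTPS, with or without .git) to owner/repo.
GITHUB_REPO=$(git remote get-url origin | sed 's|.*github.com[:/]\(.*\)\.git$|\1|; s|.*github.com[:/]\(.*\)$|\1|')
# Date-stamped branch name (YYYYMMDD).
BRANCH="rubric/add-criteria-$(date +%Y%m%d)"

cd "$REPO_ROOT"
git checkout -b "$BRANCH"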
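To illustrate what the sed expression in Step 3 does, the sketch below wraps it in a function and feeds it the two common remote URL shapes. `github_slug` and the `example-org/example-repo` URLs are placeholders for illustration, not names from the repo.

```shell
# Same sed expression as in Step 3, wrapped in a hypothetical function.
# The first substitution handles URLs ending in .git; the second handles
# URLs without that suffix. After either one matches, the other is a no-op.
github_slug() {
  printf '%s\n' "$1" | sed 's|.*github.com[:/]\(.*\)\.git$|\1|; s|.*github.com[:/]\(.*\)$|\1|'
}

github_slug "git@github.com:example-org/example-repo.git"   # prints example-org/example-repo
github_slug "https://github.com/example-org/example-repo"   # prints example-org/example-repo
```

Note that the branch name is date-based, so a second run on the same day would make `git checkout -b` fail because the branch already exists.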

Append the new [[criteria]] entries to rubrics/task-implementation.toml.
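One way to do the append from the shell is a here-document, sketched below. `append_criterion` and its argument order are assumptions made for illustration. The values are written verbatim, so any double quotes or backslashes in them would need TOML escaping first.

```shell
# Hypothetical helper: append one [[criteria]] entry to a rubric file.
# Arguments: $1 file, $2 name, $3 description, $4 guidance text.
append_criterion() {
  cat >> "$1" <<EOF

[[criteria]]
name = "$2"
description = "$3"
guidance = """
$4"""
EOF
}
```

For example, `append_criterion "$REPO_ROOT/rubrics/task-implementation.toml" "my_new_rule" "One-line description" "PASS if ... FAIL if ..."` would add a single entry in the format shown in Step 1.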

Step 4: Commit, Push, and Create PR

cd "$REPO_ROOT"
git add rubrics/task-implementation.toml
git commit -m "Add rubric criteria: <brief description>"
git push -u origin "$BRANCH"
gh pr create --repo "$GITHUB_REPO" \
  --title "Add rubric criteria: <brief description>" \
  --body "$(cat <<'EOF'
## Summary
- Adds new criteria to `rubrics/task-implementation.toml` based on patterns observed during task review

## New Criteria
<list each new criterion name and one-line description>

## Motivation
<which review or PR prompted this, what pattern was observed>

## Test plan
- [ ] Run `harbor check` against existing tasks to verify new criteria don't cause false positives
- [ ] Verify TOML syntax is valid

Generated with [Claude Code](https://claude.ai/claude-code)
EOF
)"

Open the PR URL in the browser and report it to the user.
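The test plan item "Verify TOML syntax is valid" can be checked locally before the commit. The sketch below assumes `python3` version 3.11 or later is on the PATH, since it relies on the standard-library `tomllib` module. `validate_toml` is a hypothetical helper, not something the skill or the `harbor` CLI provides.

```shell
# Hypothetical helper: exit non-zero if the file is not valid TOML.
# Assumes python3 >= 3.11 (tomllib is in the standard library from 3.11).
validate_toml() {
  python3 -c 'import sys, tomllib; tomllib.load(open(sys.argv[1], "rb"))' "$1"
}
```

Running `validate_toml rubrics/task-implementation.toml && echo "TOML OK"` from the repo root prints the confirmation only when the file parses.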

Step 5: Propagate to Downstream Repos (remind user)

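After the PR is merged, remind the user to propagate the change to each downstream repo. These commands run inside the downstream clone and assume it has a remote named template: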

git fetch template
git merge template/main
git push origin main

Then, in the local clone where the branch was created, switch back to the main branch and delete the feature branch:

git checkout main
git branch -d "$BRANCH"

related-skills.json

same repository

convert-separate-verifier.md

from "harbor-framework/terminal-bench-3"

Convert a TB3 task from Harbor's shared verifier mode (default) to separate verifier mode. Use when the user asks to "convert this task to separate verifier", "make the verifier run in its own container", or asks about Harbor's separate-verifier environment for a specific task.

2026-05-30 stars:208

trial-costs.md

from "harbor-framework/terminal-bench-3"

Sum agent-trial spend ($) from TB3 PR comments and Modal compute spend. Reports totals by kind (/run vs /cheat) and provider (anthropic, openai, ...), top-spender PRs, per-task latest trial cost, and daily Modal billing. Use when the user asks how much was spent on agent trials, cost breakdown by provider, PR-level spend, or Modal usage.

2026-05-16 stars:208

deep-review-task.md

from "harbor-framework/terminal-bench-3"

Run the sandboxed deep-review tool against a benchmark task PR — launches Claude Code in Docker with pre-fetched PR artifacts and writes review-summary.md / issues-found.md

2026-05-16 stars:208

tb3-status.md

from "harbor-framework/terminal-bench-3"

Generate TB3 review status report and open in browser

2026-05-13 stars:208

package.json

"author": "harbor-framework"

"repository": "harbor-framework/terminal-bench-3"
