export
Export a benchmark pipeline as a zip file for sharing or archiving. Excludes cache and large snapshots.
| name | export |
| description | Export a benchmark pipeline as a zip file for sharing or archiving. Excludes cache and large snapshots. |
| argument-hint | [project-directory] [-o output.zip] [-r runId] |
| disable-model-invocation | true |
| allowed-tools | Bash(agentic-usability *) |
Export the pipeline project as a zip archive for sharing or archiving.
echo "Arguments: $ARGUMENTS"
- -o, --output <path>: Output zip file path (default: <pipeline-name>-export.zip)
- -r, --run <runId>: Export only a specific run instead of the entire project

The zip includes:
- config.json: pipeline configuration
- suite.json: test suite
- results/: all run results (judge scores, solutions, logs)

The zip excludes:
- cache/**: git repo clones (can be re-fetched)
- **/*.tar.gz: workspace snapshots (large binary files)

<project>/
  config.json
  suite.json
  results/<runId>/
    run.json                # Run manifest
    pipeline-state.json     # Pipeline state
    report.json             # Scorecard
    <target>/<testId>/      # Per-test results
Run agentic-usability export -p $ARGUMENTS.
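As a rough illustration of how the arguments end up on the command line, the sketch below assembles the invocation from a project directory plus the optional `-o` and `-r` values. `export_cmd` is a hypothetical helper written for this example, not part of agentic-usability, and the directory, zip name, and run ID are placeholders. It only prints the command so it can be reviewed before running.

```shell
# Sketch only: export_cmd is a hypothetical helper, not part of agentic-usability.
# It prints the command line instead of executing it.
export_cmd() {
  project_dir="$1"   # project directory passed to -p
  output_zip="$2"    # optional: output path for -o (empty to use the default)
  run_id="$3"        # optional: run ID for -r (empty to export the whole project)

  # Rebuild the positional parameters as the command to print.
  set -- agentic-usability export -p "$project_dir"
  if [ -n "$output_zip" ]; then set -- "$@" -o "$output_zip"; fi
  if [ -n "$run_id" ]; then set -- "$@" -r "$run_id"; fi

  printf '%s\n' "$*"
}

# Placeholder values for illustration.
export_cmd ./my-pipeline my-pipeline-run.zip run-001
# prints: agentic-usability export -p ./my-pipeline -o my-pipeline-run.zip -r run-001
```

Leaving the second and third arguments empty yields the plain `agentic-usability export -p <dir>` form, which exports the entire project to the default zip name.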
For the full file inventory, see pipeline-guide.md.
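To confirm that an export leaves out the cache and snapshot files, one option is to scan the archive's entry names for paths that match the documented exclusions. The filter below is a minimal sketch: `find_excluded_entries` is a hypothetical name, and the extra `*/cache/*` pattern assumes entries might be stored under a top-level project folder, which the source does not specify.

```shell
# Sketch only: reads entry names (one per line) on stdin and prints any that
# fall under the documented exclusions: cache/** and **/*.tar.gz.
find_excluded_entries() {
  while IFS= read -r entry; do
    case "$entry" in
      # cache/* covers a root-level cache directory; */cache/* is an assumption
      # for archives that nest everything under a project folder.
      cache/*|*/cache/*|*.tar.gz)
        printf '%s\n' "$entry"
        ;;
    esac
  done
}

# Example with a hand-written listing (placeholder paths).
printf '%s\n' \
  'config.json' \
  'cache/repo/.git/HEAD' \
  'results/run-001/report.json' \
  'results/run-001/workspace.tar.gz' | find_excluded_entries
```

With Info-ZIP's `unzip` installed, a real archive could be checked with `unzip -Z1 <archive>.zip | find_excluded_entries`. Empty output would suggest that no cache or snapshot entries were packed.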