sandbox

Updated: May 14, 2026, 09:19

Launch an interactive shell inside a microsandbox for debugging. Supports bare mode, executor setup, or judge setup with optional test case scaffolding.

Source: PSPDFKit-labs/agentic-usability

Debug Sandbox

Launch an interactive shell inside a microsandbox identical to what the pipeline uses. Useful for debugging agent auth, inspecting environment variables, testing commands, and reproducing sandbox issues.

echo "Arguments: $ARGUMENTS"

Modes

By default the sandbox boots with just the target image, secrets, and env vars — no agent install or workspace setup.

Bare (no flags)

agentic-usability sandbox -p <project>

Boots a sandbox with the configured secrets and env vars. Nothing else is installed or scaffolded.

Executor mode

agentic-usability sandbox -p <project> --mode executor
agentic-usability sandbox -p <project> --mode executor --test TC-001

Installs the executor agent CLI. With --test, also scaffolds the workspace, uploads PROBLEM.md, and uploads public sources — mirroring the execute stage setup.
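To confirm that the `--test` scaffolding produced what the execute stage expects, a small check can be run from the sandbox shell. This is a minimal sketch, not part of the tool: the function name `verify_workspace` and its optional root argument are hypothetical, and the `/workspace/PROBLEM.md` and `/workspace/sources/` paths are the ones this document uses in its debugging tasks.

```shell
# Sketch: check that an executor workspace was scaffolded as described.
# The root defaults to /workspace; the optional argument exists only so the
# function can be tried outside a sandbox (an assumption for illustration).
verify_workspace() {
  local root="${1:-/workspace}"
  local status=0

  # The problem statement should exist and be non-empty.
  if [ -s "$root/PROBLEM.md" ]; then
    echo "ok: PROBLEM.md present"
  else
    echo "missing: PROBLEM.md"
    status=1
  fi

  # The sources directory should exist and contain at least one entry.
  if [ -d "$root/sources" ] && [ -n "$(ls -A "$root/sources" 2>/dev/null)" ]; then
    echo "ok: sources uploaded"
  else
    echo "missing: sources"
    status=1
  fi

  return "$status"
}
```

Calling `verify_workspace` with no arguments checks `/workspace`; a non-zero exit status means at least one expected item is absent.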

Judge mode

agentic-usability sandbox -p <project> --mode judge --test TC-001
agentic-usability sandbox -p <project> --mode judge --test TC-001 --run <runId>

Installs the judge agent CLI. With --test, restores the workspace snapshot from a previous run (or uploads solution files) and uploads all sources (private and public), mirroring the judge stage setup.

Options

| Flag | Default | Description |
| --- | --- | --- |
| `--target <name>` | first in config | Which target image to use |
| `--mode <mode>` | (none) | `executor` or `judge`; installs the agent CLI and optionally sets up the workspace |
| `--test <id>` | (none) | Test case to scaffold (requires `--mode`) |
| `--run <runId>` | latest | Run to load the workspace snapshot from (judge mode) |
| `--output <dir>` | `results/sandbox-debug-<timestamp>/` | Directory to save debug artifacts |
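The flag dependencies in the table (`--test` requires `--mode`, and `--run` is described for judge mode) can be checked before a sandbox boots. The wrapper below is a sketch under assumptions: `launch_sandbox` and the `DRY_RUN` variable are hypothetical names, and the validation rules reflect one reading of the table rather than what the CLI itself enforces.

```shell
# Sketch: validate flag combinations before launching the sandbox.
# Usage: launch_sandbox <project> [mode] [test-id] [run-id]
# Set DRY_RUN=1 to print the command instead of running it (hypothetical switch).
launch_sandbox() {
  local project="${1:-}" mode="${2:-}" test_id="${3:-}" run_id="${4:-}"

  if [ -z "$project" ]; then
    echo "error: project directory is required" >&2
    return 2
  fi
  case "$mode" in
    ""|executor|judge) ;;
    *) echo "error: --mode must be executor or judge" >&2; return 2 ;;
  esac
  if [ -n "$test_id" ] && [ -z "$mode" ]; then
    echo "error: --test requires --mode" >&2
    return 2
  fi
  if [ -n "$run_id" ] && [ "$mode" != "judge" ]; then
    echo "error: --run only applies in judge mode" >&2
    return 2
  fi

  # Rebuild the positional parameters as the final argument list.
  set -- sandbox -p "$project"
  if [ -n "$mode" ]; then set -- "$@" --mode "$mode"; fi
  if [ -n "$test_id" ]; then set -- "$@" --test "$test_id"; fi
  if [ -n "$run_id" ]; then set -- "$@" --run "$run_id"; fi

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "agentic-usability $*"
  else
    agentic-usability "$@"
  fi
}
```

For example, `DRY_RUN=1 launch_sandbox ./my-project judge TC-001` prints the command that would be run, while `launch_sandbox ./my-project "" TC-001` fails early because no mode was given.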

Interactive Shell

Once inside the sandbox, you have a full shell. Press Ctrl-] to detach and destroy the sandbox.

Common debugging tasks:

printenv | grep KEY — check which env vars are set
codex login --with-api-key — test Codex auth
cat /workspace/PROBLEM.md — verify problem statement
ls /workspace/sources/ — check uploaded sources
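Because `printenv | grep KEY` prints secret values in full, a masked variant may be safer when the terminal is shared or recorded. The following is a sketch only: `list_masked_env` is a hypothetical helper, and it assumes variable values do not contain newlines.

```shell
# Sketch: list env vars whose names contain a pattern, with values masked.
# Prints NAME=<set, N chars> so the secret itself never reaches the terminal.
# Assumption: values do not contain newlines (env prints one var per line).
list_masked_env() {
  local pattern="${1:-KEY}"
  env | awk -F= -v pat="$pattern" '
    index($1, pat) > 0 {
      # Everything after the first "=" is the value.
      value = substr($0, length($1) + 2)
      printf "%s=<set, %d chars>\n", $1, length(value)
    }'
}
```

Running `list_masked_env KEY` shows which matching variables are set and how long each value is, which is usually enough to spot an empty or truncated secret.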

Artifacts

After detaching, the following artifacts are saved to the output directory:

| File | Description |
| --- | --- |
| `agent-egress.log.json` | Network traffic captured during the session |
| `setup.log` | Scaffolding and agent install output |
| `workspace-snapshot.tar.gz` | Tarball of `/workspace` after the session ends |
| `agent-session.jsonl` | Agent CLI session log (if available) |
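After a session, the output directory can be checked quickly from the host. This is a sketch: `check_artifacts` is a hypothetical helper, and it assumes the first three files are always written while `agent-session.jsonl` is optional, following the "if available" note in the table.

```shell
# Sketch: summarize the artifacts saved to a sandbox debug output directory.
# Usage: check_artifacts <output-dir>
check_artifacts() {
  local dir="$1" missing=0 name

  # Assumed to be present after every session.
  for name in agent-egress.log.json setup.log workspace-snapshot.tar.gz; do
    if [ -f "$dir/$name" ]; then
      echo "found: $name"
    else
      echo "missing: $name"
      missing=$((missing + 1))
    fi
  done

  # The agent session log is only saved "if available", so treat it as optional.
  if [ -f "$dir/agent-session.jsonl" ]; then
    echo "found: agent-session.jsonl ($(wc -l < "$dir/agent-session.jsonl" | tr -d ' ') lines)"
  else
    echo "absent (optional): agent-session.jsonl"
  fi

  # List the first few entries of the snapshot without extracting it.
  if [ -f "$dir/workspace-snapshot.tar.gz" ]; then
    tar -tzf "$dir/workspace-snapshot.tar.gz" | head -n 5
  fi

  # Succeed only when none of the expected files is missing.
  [ "$missing" -eq 0 ]
}
```

Pass the directory given to `--output`, or the generated `results/sandbox-debug-<timestamp>/` directory, as the only argument.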

Run agentic-usability sandbox -p $ARGUMENTS and report the results.

name	sandbox
description	Launch an interactive shell inside a microsandbox for debugging. Supports bare mode, executor setup, or judge setup with optional test case scaffolding.
argument-hint	[project-directory] [--mode executor|judge] [--test TC-001] [--target node-20] [--run runId]
disable-model-invocation	true
allowed-tools	Bash(agentic-usability *) Read Glob
