sandbox

Stars: 19

Forks: 0

Last updated: May 14, 2026 at 09:19

Launch an interactive shell inside a microsandbox for debugging. Supports bare mode, executor setup, or judge setup with optional test case scaffolding.

Source: PSPDFKit-labs/agentic-usability

SKILL.md

name	sandbox
description	Launch an interactive shell inside a microsandbox for debugging. Supports bare mode, executor setup, or judge setup with optional test case scaffolding.
argument-hint	[project-directory] [--mode executor|judge] [--test TC-001] [--target node-20] [--run runId]
disable-model-invocation	true
allowed-tools	Bash(agentic-usability *) Read Glob

Debug Sandbox

Launch an interactive shell inside a microsandbox identical to what the pipeline uses. Useful for debugging agent auth, inspecting environment variables, testing commands, and reproducing sandbox issues.

echo "Arguments: $ARGUMENTS"

Modes

By default the sandbox boots with just the target image, secrets, and env vars — no agent install or workspace setup.

Bare (no flags)

agentic-usability sandbox -p <project>

Boots a sandbox with the configured secrets and env vars. Nothing else is installed or scaffolded.

Executor mode

agentic-usability sandbox -p <project> --mode executor
agentic-usability sandbox -p <project> --mode executor --test TC-001

Installs the executor agent CLI. With --test, also scaffolds the workspace, uploads PROBLEM.md, and uploads public sources — mirroring the execute stage setup.

Judge mode

agentic-usability sandbox -p <project> --mode judge --test TC-001
agentic-usability sandbox -p <project> --mode judge --test TC-001 --run <runId>

Installs the judge agent CLI. With --test, restores the workspace snapshot from a previous run (or uploads solution files) and uploads all sources (private and public), mirroring the judge stage setup.
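
The three modes above differ only in which flags are passed. As a minimal sketch, the helper below assembles the command line and prints it for review instead of running it. The function name `sandbox_cmd` and its positional-argument order are hypothetical. Only the flags themselves come from this document.

```shell
# Hypothetical helper (not part of agentic-usability): prints the command
# for a given mode so it can be reviewed before running.
# Usage: sandbox_cmd <project> [mode] [test-id] [run-id]
# Note: values are joined with plain spaces, so paths containing spaces
# would need extra quoting before the printed line is executed.
sandbox_cmd() {
  project=$1
  mode=${2:-}
  test_id=${3:-}
  run_id=${4:-}

  # Bare mode: only the project flag.
  cmd="agentic-usability sandbox -p $project"

  if [ -n "$mode" ]; then
    cmd="$cmd --mode $mode"
  fi
  if [ -n "$test_id" ]; then
    cmd="$cmd --test $test_id"
  fi
  if [ -n "$run_id" ]; then
    cmd="$cmd --run $run_id"
  fi

  printf '%s\n' "$cmd"
}
```

For example, `sandbox_cmd ./my-project judge TC-001` prints the first judge-mode invocation shown above, with `./my-project` standing in for `<project>`.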

Options

| Flag | Default | Description |
| --- | --- | --- |
| `--target <name>` | first in config | Which target image to use |
| `--mode <mode>` | (none) | `executor` or `judge`; installs agent CLI and optionally sets up workspace |
| `--test <id>` | (none) | Test case to scaffold (requires `--mode`) |
| `--run <runId>` | latest | Run to load workspace snapshot from (judge mode) |
| `--output <dir>` | `results/sandbox-debug-<timestamp>/` | Directory to save debug artifacts |
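
The options table states two constraints: `--mode` accepts `executor` or `judge`, and `--test` requires `--mode`. The pre-flight check below is a sketch of those two rules only. The function name `check_sandbox_flags` is hypothetical, and the CLI may report such errors differently or enforce further rules not covered here.

```shell
# Hypothetical pre-flight check for the flag rules in the options table.
# Usage: check_sandbox_flags [mode] [test-id]
# Returns 0 when the combination is consistent with the table, 1 otherwise.
check_sandbox_flags() {
  mode=${1:-}
  test_id=${2:-}

  # --mode, when given, must be one of the two documented values.
  case "$mode" in
    ""|executor|judge) ;;
    *)
      echo "unknown mode: $mode" >&2
      return 1
      ;;
  esac

  # --test is only meaningful together with --mode.
  if [ -n "$test_id" ] && [ -z "$mode" ]; then
    echo "--test requires --mode" >&2
    return 1
  fi

  return 0
}
```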

Interactive Shell

Once inside the sandbox, you have a full shell. Press Ctrl-] to detach and destroy the sandbox.

Common debugging tasks:

printenv | grep KEY — check which env vars are set
codex login --with-api-key — test Codex auth
cat /workspace/PROBLEM.md — verify problem statement
ls /workspace/sources/ — check uploaded sources
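
When checking env vars, it can be safer to confirm that a secret is present without echoing its value into the terminal. The sketch below does this with `printenv`, which exits non-zero when a variable is not in the environment. The function name `check_env_vars` is hypothetical, and the variable names you pass depend on your own configuration.

```shell
# Hypothetical helper: report whether each named variable is present in
# the environment, without printing its value.
# Usage: check_env_vars NAME [NAME...]
# Returns 1 if any of the named variables is missing.
check_env_vars() {
  missing=0
  for name in "$@"; do
    # printenv NAME exits 0 only if NAME is set in the environment.
    if printenv "$name" >/dev/null 2>&1; then
      echo "$name: set"
    else
      echo "$name: MISSING"
      missing=1
    fi
  done
  return "$missing"
}
```

Inside the sandbox this might be called as `check_env_vars OPENAI_API_KEY`, where `OPENAI_API_KEY` is only an example name and not something this document specifies.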

Artifacts

After detaching, the following artifacts are saved to the output directory:

| File | Description |
| --- | --- |
| `agent-egress.log.json` | Network traffic captured during the session |
| `setup.log` | Scaffolding and agent install output |
| `workspace-snapshot.tar.gz` | Tarball of `/workspace` after session ends |
| `agent-session.jsonl` | Agent CLI session log (if available) |
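
After a session, a quick way to see which of the artifacts listed above were actually written is to check for each file name in the output directory. This is a minimal sketch: the function name `summarize_artifacts` is hypothetical, and it only tests for file existence, not content.

```shell
# Hypothetical helper: report which documented artifacts exist in a
# sandbox-debug output directory.
# Usage: summarize_artifacts <output-dir>
summarize_artifacts() {
  dir=$1
  for f in agent-egress.log.json setup.log workspace-snapshot.tar.gz agent-session.jsonl; do
    if [ -f "$dir/$f" ]; then
      echo "$f: present"
    else
      echo "$f: absent"
    fi
  done
}
```

To look inside the snapshot without unpacking it, `tar -tzf <output-dir>/workspace-snapshot.tar.gz` lists the archived paths.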

Run agentic-usability sandbox -p $ARGUMENTS and report the results.
