Run any Skill in Manus with one click

GitHub repository

agentic-usability

agentic-usability contains 10 skills collected from PSPDFKit-labs. This page shows the repository's occupational coverage and links to a detail page for each skill within the site.

Skills collected: 10
Updated: 2026-05-14

Occupational coverage

Software Quality Assurance Analysts and Testers · Software Developers · Network and Computer Systems Administrators

3 occupational categories · 100% classified

Skills in this repository

| Skill | Occupation | Description | Updated |
| --- | --- | --- | --- |
| init | Software Developers | Initialize a new agentic-usability benchmark pipeline project. Use when setting up a new SDK benchmark, creating a config.json, or starting a new evaluation project. | 2026-05-14 |
| sandbox | Network and Computer Systems Administrators | Launch an interactive shell inside a microsandbox for debugging. Supports bare mode, executor setup, or judge setup with optional test case scaffolding. | 2026-05-14 |
| eval | Software Quality Assurance Analysts and Testers | Run the full evaluation pipeline (execute, judge, report) for an SDK usability benchmark. Use when running a complete benchmark end-to-end, resuming an interrupted pipeline, or checking pipeline status. | 2026-04-27 |
| execute | Software Quality Assurance Analysts and Testers | Execute benchmark test cases in sandboxed environments with AI agents. Spins up microsandbox containers for each test case and extracts solutions. | 2026-04-27 |
| export | Software Developers | Export a benchmark pipeline as a zip file for sharing or archiving. Excludes cache and large snapshots. | 2026-04-27 |
| generate | Software Quality Assurance Analysts and Testers | Generate SDK usability test cases by exploring source code. Use when creating benchmark test suites, generating test cases for an SDK, or when the user wants to create evaluation scenarios. | 2026-04-27 |
| insights | Software Developers | Analyze benchmark results and identify SDK improvement areas. Use when reviewing evaluation results, finding failure patterns, identifying documentation gaps, or understanding API design issues. | 2026-04-27 |
| inspect | Software Developers | Open the web UI to visually inspect, edit, and run the benchmark pipeline. Use when the user wants a visual interface for their pipeline. | 2026-04-27 |
| judge | Software Quality Assurance Analysts and Testers | Have an LLM judge compare reference and generated solutions, scoring on API discovery, correctness, completeness, and functional correctness. | 2026-04-27 |
| report | Software Quality Assurance Analysts and Testers | Display a terminal scorecard of benchmark results showing pass rates, scores by difficulty, and per-test breakdowns. Use when the user asks about benchmark results, scores, or wants to see how their SDK performed. | 2026-04-27 |
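
The report skill is described as showing pass rates and scores by difficulty. As a rough illustration of that kind of aggregation, here is a minimal Python sketch. The record fields (`test`, `difficulty`, `score`, `passed`), the 0-100 score scale, and the `scorecard` helper are all hypothetical and chosen for the example; they are not taken from the repository and do not reflect its actual output format.

```python
from collections import defaultdict

# Hypothetical judged results. Field names and the 0-100 scale are
# assumptions for illustration, not the repository's real data format.
results = [
    {"test": "tc-001", "difficulty": "easy", "score": 92, "passed": True},
    {"test": "tc-002", "difficulty": "easy", "score": 58, "passed": False},
    {"test": "tc-003", "difficulty": "hard", "score": 75, "passed": True},
]


def scorecard(rows):
    """Group results by difficulty and compute pass rate and mean score."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["difficulty"]].append(row)

    card = {}
    for difficulty, group in groups.items():
        card[difficulty] = {
            "tests": len(group),
            # True counts as 1, so the sum is the number of passing tests.
            "pass_rate": sum(r["passed"] for r in group) / len(group),
            "mean_score": sum(r["score"] for r in group) / len(group),
        }
    return card


# Print one summary line per difficulty level.
for difficulty, stats in scorecard(results).items():
    print(
        f"{difficulty:<6} {stats['tests']} tests, "
        f"{stats['pass_rate']:.0%} passed, "
        f"mean score {stats['mean_score']:.1f}"
    )
```

A per-test breakdown would simply list the individual records alongside these grouped totals.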