Run any Skill in Manus
with one click

GitHub repository

agentic-usability

agentic-usability contains 10 skills collected from PSPDFKit-labs, with occupation coverage for the repository and a detail page for each skill on the site.

PSPDFKit-labs profile · View on GitHub

Skills collected: 10

Updated: 2026-05-14

Occupation coverage

Software Quality Assurance Analysts and Testers · Software Developers · Network and Computer Systems Administrators

3 occupation categories · 100% classified

Skills in this repository

| Skill | Occupation | Description | Updated |
| --- | --- | --- | --- |
| init | Software Developers | Initialize a new agentic-usability benchmark pipeline project. Use when setting up a new SDK benchmark, creating a config.json, or starting a new evaluation project. | 2026-05-14 |
| sandbox | Network and Computer Systems Administrators | Launch an interactive shell inside a microsandbox for debugging. Supports bare mode, executor setup, or judge setup with optional test case scaffolding. | 2026-05-14 |
| eval | Software Quality Assurance Analysts and Testers | Run the full evaluation pipeline (execute, judge, report) for an SDK usability benchmark. Use when running a complete benchmark end-to-end, resuming an interrupted pipeline, or checking pipeline status. | 2026-04-27 |
| execute | Software Quality Assurance Analysts and Testers | Execute benchmark test cases in sandboxed environments with AI agents. Spins up microsandbox containers for each test case and extracts solutions. | 2026-04-27 |
| export | Software Developers | Export a benchmark pipeline as a zip file for sharing or archiving. Excludes cache and large snapshots. | 2026-04-27 |
| generate | Software Quality Assurance Analysts and Testers | Generate SDK usability test cases by exploring source code. Use when creating benchmark test suites, generating test cases for an SDK, or when the user wants to create evaluation scenarios. | 2026-04-27 |
| insights | Software Developers | Analyze benchmark results and identify SDK improvement areas. Use when reviewing evaluation results, finding failure patterns, identifying documentation gaps, or understanding API design issues. | 2026-04-27 |
| inspect | Software Developers | Open the web UI to visually inspect, edit, and run the benchmark pipeline. Use when the user wants a visual interface for their pipeline. | 2026-04-27 |
| judge | Software Quality Assurance Analysts and Testers | Have an LLM judge compare reference and generated solutions, scoring on API discovery, correctness, completeness, and functional correctness. | 2026-04-27 |
| report | Software Quality Assurance Analysts and Testers | Display a terminal scorecard of benchmark results showing pass rates, scores by difficulty, and per-test breakdowns. Use when the user asks about benchmark results, scores, or wants to see how their SDK performed. | 2026-04-27 |