GitHub repository

inspect-action

inspect-action has 7 skills collected from METR. This page shows the repository's occupation coverage and the on-site detail pages for each skill.

Collected skills: 7
Stars: 23
Updated: 2026-03-03
Forks: 11
Occupation coverage: 5 occupation categories · 100% classified
Repository explorer

Skills in this repository

debug-stuck-eval
Computer Network Support Specialists

Debug stuck Hawk/Inspect AI evaluations. Use when user mentions "stuck eval", "eval not progressing", "eval hanging", "samples not completing", "eval set frozen", "runner stuck", "500 errors in eval", "retry loop", "eval timeout", or asks why an evaluation isn't finishing.

2026-03-03
database-migrations
Database Administrators

Use when creating alembic migrations, applying migrations to remote environments, or recovering from schema drift. Triggers on changes to models.py, "run migration", "schema drift", "alembic", "database error in batch jobs".

2026-02-15
deploy-dev
Network and Computer Systems Administrators

Use when deploying code changes to dev environments (dev1-4), running terraform apply against dev, or verifying changes end-to-end. Triggers on "deploy to dev", "apply to dev2", "test in dev", "update dev environment".

2026-02-15
fullstack-dev
Software Developers

Use when developing the frontend and backend together, making UI changes, or setting up local dev with linked inspect_ai/scout libraries. Triggers on frontend changes, "yarn dev", "vite", "www/", or React component work.

2026-02-15
smoke-tests
Software Quality Assurance Analysts and Testers

Use when running smoke tests, debugging smoke test failures, or verifying a deployed environment works correctly. Triggers on "run smoke tests", "smoke tests failing", "test against dev", "verify deployment".

2026-02-15
monitoring
Network and Computer Systems Administrators

Monitor Hawk job status, view logs, and diagnose issues. Use when the user wants to check job progress, view error logs, debug a failing job, or generate a monitoring report for a Hawk evaluation run.

2026-01-18
view-results
Software Developers

View and analyze Hawk evaluation results. Use when the user wants to see eval-set results, check evaluation status, list samples, view transcripts, or analyze agent behavior from a completed evaluation run.

2026-01-18