inspect-action

inspect-action has 7 skills collected from METR. This page shows repository-level occupation coverage and the on-site detail page for each skill.

Collected skills: 7

Updated: 2026-03-03

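Occupation coverage

Software developers; Network and computer systems administrators; Computer network support specialists; Software quality assurance analysts and testers; Database administrators

5 occupation categories · 100% classified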

Repository explorer

Skills in this repository

debug-stuck-eval
Occupation category: Computer network support specialists
Description: Debug stuck Hawk/Inspect AI evaluations. Use when user mentions "stuck eval", "eval not progressing", "eval hanging", "samples not completing", "eval set frozen", "runner stuck", "500 errors in eval", "retry loop", "eval timeout", or asks why an evaluation isn't finishing.
Updated: 2026-03-03

database-migrations
Occupation category: Database administrators
Description: Use when creating alembic migrations, applying migrations to remote environments, or recovering from schema drift. Triggers on changes to models.py, "run migration", "schema drift", "alembic", "database error in batch jobs".
Updated: 2026-02-15

deploy-dev
Occupation category: Network and computer systems administrators
Description: Use when deploying code changes to dev environments (dev1-4), running terraform apply against dev, or verifying changes end-to-end. Triggers on "deploy to dev", "apply to dev2", "test in dev", "update dev environment".
Updated: 2026-02-15

fullstack-dev
Occupation category: Software developers
Description: Use when developing the frontend and backend together, making UI changes, or setting up local dev with linked inspect_ai/scout libraries. Triggers on frontend changes, "yarn dev", "vite", "www/", or React component work.
Updated: 2026-02-15

smoke-tests
Occupation category: Software quality assurance analysts and testers
Description: Use when running smoke tests, debugging smoke test failures, or verifying a deployed environment works correctly. Triggers on "run smoke tests", "smoke tests failing", "test against dev", "verify deployment".
Updated: 2026-02-15

monitoring
Occupation category: Network and computer systems administrators
Description: Monitor Hawk job status, view logs, and diagnose issues. Use when the user wants to check job progress, view error logs, debug a failing job, or generate a monitoring report for a Hawk evaluation run.
Updated: 2026-01-18

view-results
Occupation category: Software developers
Description: View and analyze Hawk evaluation results. Use when the user wants to see eval-set results, check evaluation status, list samples, view transcripts, or analyze agent behavior from a completed evaluation run.
Updated: 2026-01-18