Create consistent nn_cfd W&B experiment comparison reports from selected run IDs using wandb and wandb_workspaces. Use this whenever the user asks to report experiment results, create a W&B Report, compare CFD training/eval runs, document baseline-vs-variant outcomes, reproduce the project experiment report template, or publish a leaderboard-style report for milieu/nn_cfd.

2026-06-14

alphaxiv-paper-lookup

Physicist

Look up any arxiv paper on alphaxiv.org to get a structured AI-generated overview. This is faster and more reliable than trying to read a raw PDF.

2026-06-12

web-search-advanced-research-paper

Other physical scientist

Search for research papers and academic content using Exa advanced search. Full filter support including date ranges and text filtering. Use when searching for academic papers, arXiv preprints, or scientific research.

2026-06-12

alphaxiv-paper-lookup

Physicist

Look up any arxiv paper on alphaxiv.org to get a structured AI-generated overview. This is faster and more reliable than trying to read a raw PDF.

2026-06-12

web-search-advanced-research-paper

Create or improve Senpai target-repository onboarding files: program.md plus instructions/prompt-advisor.md and instructions/prompt-student.md. Use this skill whenever the user wants to point Senpai at a fresh ML or research target repository, define the research objective, primary metric, benchmark contract, allowed edit boundaries, W&B reporting contract, advisor/student prompts, or prepare a repo for autonomous advisor/student experiment loops.

2026-06-12

check-human-issues

Software developer

Check and respond to GitHub Issues from the human researcher team. Runs in a forked context (no access to main conversation). Use this skill whenever you need to: check for human messages, respond to human issues, poll for team communications, check GitHub issues. Also triggers for: "any human messages?", "check issues", "respond to humans".

2026-06-12

senpai-gh

Software developer

GitHub CLI primitives for the senpai research workflow — label swaps, send-back, close, mark-review, issue checks, PR queries. Use this skill whenever you need to manipulate PR labels, send a PR back to a student, close a dead-end experiment, mark a PR for review, or query the current state of PRs and issues. Also triggers for: "swap labels", "send back to student", "close this PR", "mark for review", "check human issues", "list review-ready PRs", "idle students".

2026-06-12

Showing the top 8 of the 22 skills collected from this repository.

#002

weave

3 skills · 1.1k · 156 · updated 2026-07-01

7.7% of creator's skills

Skill

Job classification

Description

Updated

bump-version

Software developer

Bump the Weave Python SDK version for release. Use when preparing a new release.

2026-07-01

weave-instrument

Software developer

Add Weave (Weights & Biases) observability to an LLM or agent codebase. This covers calling `weave.init()`, setting up authentication, and choosing between OTEL auto-instrumentation and the explicit Session SDK agent-logging APIs (Turn, LLM, Tool, SubAgent), based on the libraries the code already uses. Works for Python and TypeScript/Node. Use this whenever the user wants to instrument, trace, or add observability, logging, or monitoring to an agent, chatbot, RAG pipeline, or LLM app. This includes phrasings like "log my agent to weave", "add agent tracing", "get my agent into the Weave Agents tab", "instrument this with the weave session sdk", "trace my tool calls", or "set up weave logging", even when the user does not name a specific API.

2026-06-30

publish-pypi

Software developer

Build and publish the Weave Python SDK to PyPI. Use when releasing a new version.

2026-01-21

#003

weave-claude-code

3 skills · 90 · updated 2026-06-22

7.7% of creator's skills

Skill

Job classification

Description

Updated

weave-config

Software developer

This skill should be used when the user wants to "configure weave", "set weave project", "change weave project", "set wandb api key", "update weave settings", "show weave config", "change weave configuration", "restart the weave daemon", "apply weave config changes", "restart weave to pick up changes", or needs to read or update any Weave Claude Code plugin settings.

2026-06-22

weave-status

Software developer

This skill should be used when the user wants to "check weave status", "verify the weave plugin is running", "see if weave is set up correctly", "check weave configuration", "is weave working", "weave is running an older config", "the daemon is on an old config", or needs to diagnose why Claude Code sessions are not appearing in Weave.

2026-06-22

weave-install

Software developer

This skill should be used when the user wants to "install the weave plugin", "set up weave", "install weave-claude-code", "configure weave for the first time", "get started with weave tracing", or needs to complete the initial setup of the Weave Claude Code plugin including dependency installation and project configuration.

2026-06-12

#004

wsm

3 skills · 31 · updated 2026-07-17

7.7% of creator's skills

Skill

Job classification

Description

Updated

ship

Software developer

Branch, commit, and/or open a PR for the current changes — driven by AskUserQuestion. Use when the user wants to commit their work, create a branch, open a pull request (draft or ready), refine an existing PR's description, set the release label, or "ship" / "wrap up" what they've been working on. Produces conventional commits and a PR that fits the repo template and passes the PR Checks gate.

2026-07-17

audit-operator-diff

Software quality assurance analyst/tester

Audit changes between two github.com/wandb/operator refs and produce a categorized impact report on wsm. Use independently for code review / release planning, or as Step 1 of /sync-operator. Triggers on phrases like "audit operator diff", "what changed in operator", "show me operator impact on wsm", "operator compatibility audit". Does NOT make code changes — produces a fix list at .claude/audit-report.md that the user (or /sync-operator) consumes.

2026-06-02

sync-operator

Software developer

Sync wsm to be e2e compatible with a target github.com/wandb/operator ref — always invokes /audit-operator-diff first, then applies the user-approved fix list, updates docs, and smoke-tests on Kind. Triggers on phrases like "sync wsm with operator", "bump operator dep", "update for new operator release", "match operator changes". v2 surface only by default; v1 only on explicit user request. Requires a local clone of the operator repo and OrbStack running.

2026-06-02

#005

discovery-forge

3 skills · 20 · updated 2026-06-14

7.7% of creator's skills

Skill

Job classification

Description

Updated

annotation-improvement

Software quality assurance analyst/tester

Guides coding agents through Discovery Forge prompt improvement after selecting the `wandb-primary` skill. Use `wandb-primary` to fetch W&B Weave research_run traces, human annotations, runnable feedback, and evaluations, then use this workflow when improving researcher.md from annotation queues, reviewed research traces, or human feedback.

2026-06-14

offline-eval-improvement

Software developer

Guides coding agents through Discovery Forge prompt improvement from a specified Weave offline evaluation baseline. Use `wandb-primary` to fetch evaluation results, failed eval rows, dataset refs, prompt refs, and scorer evidence, then use this workflow when improving researcher.md from failed evaluation rows, comparing eval runs, or iterating on a fixed dataset.

2026-06-14

build-verdict-dataset

Software developer

Guides coding agents through building the verdict_quality_dataset from W&B Weave research_annotation evidence — querying annotated research_run calls, mapping human QualitySelector verdicts to gold labels, refining row inputs per the rubric, and publishing a new versioned Weave Dataset. Use when the user asks to (re)generate the verdict dataset from annotations, seed a new eval dataset version, or rebuild verdict_quality_dataset.

2026-06-13

#006

llm-leaderboard-korean

2 skills · 30 · updated 2026-04-15

5.1% of creator's skills

Skill

Job classification

Description

Updated

horangi-analyze

Data scientist

Analyze one or more Horangi model configs against (1) HuggingFace model card claims, (2) the W&B leaderboard rankings, and (3) category-level peer comparison. Invoke when the user asks to analyze/compare model performance for a config (e.g. "analyze <config>", "compare X and Y", "이 모델 성능 분석해줘").

2026-04-15

horangi-fails

Data scientist

Deep-dive error pattern analysis for a single (model, benchmark) pair using Weave traces. Surfaces how/why the model is getting answers wrong — answer bias, format violations, language mixing, and 3-5 representative failure samples. Invoke when the user asks to analyze wrong answers / failure patterns for a specific benchmark (e.g. "analyze errors in <bench>", "<model>의 <benchmark> 오답 패턴 분석", "틀린 문제 경향"). Commonly invoked as a follow-up to `horangi-analyze` when that skill flags a weak category.

2026-04-15

#007

skills

1 skill · 612 · updated 2026-06-29

2.6% of creator's skills

Skill

Job classification

Description

Updated

wandb-primary

Data scientist

Primary W&B skill for broad or mixed Weights & Biases work: project overviews, W&B runs and artifacts, Weave traces and evaluations, Reports, and Launch workflows. Use when the task spans multiple W&B surfaces or the user asks generally what is happening in a W&B project.

2026-06-29

#008

operator

1 skill · 81 · updated 2026-07-15

2.6% of creator's skills

Skill

Job classification

Description

Updated

convert-datadog-dashboard

Software developer

Convert a DataDog dashboard JSON into a Grafana dashboard JSON that fits the operator's telemetry stack (VictoriaMetrics datasources, wandb-<component> UID, no templating). Trigger when the user provides DataDog dashboard JSON (inline or as a file path) and asks to port, convert, or recreate it as a Grafana dashboard, or when the user asks to add a new dashboard under deploy/telemetry/dashboards/ from a DataDog source.

2026-07-15

#009

weave-integration-skills

1 skill · 20 · updated 2026-04-07

2.6% of creator's skills

Skill

Job classification

Description

Updated

weave-integration

Software developer

Comprehensive skill for adding W&B Weave to existing applications. Covers trace-first instrumentation, evaluation only after trace verification, documentation-first implementation, CLI-based Weave data access, and validation workflows. Activate this skill only when the user explicitly mentions the skill by name, such as `weave-integration`, `@skills weave-integration`, or another direct skill reference. Do not auto-trigger from generic Weave or W&B questions.

2026-04-07

Showing 9 of 9 repositories

All repositories are shown.