wild-v2-gpu-discovery-parallel-scheduling

Name: Wild V2 Gpu Discovery Parallel Scheduling
Author: hao-ai-lab

// Protocol for GPU discovery and parallel run scheduling across local GPU and Slurm clusters


stars:17

forks:5

updated: February 20, 2026 at 04:04

SKILL.md


name	wild_v2_gpu_discovery_parallel_scheduling
description	Protocol for GPU discovery and parallel run scheduling across local GPU and Slurm clusters
category	protocol
variables	[]

Wild V2 GPU Discovery and Parallel Scheduling Protocol

Use this protocol before launching experiment grids.

Goals

Discover the available compute topology (local GPU vs Slurm vs CPU-only).
Decide safe parallelism.
Launch multiple runs in parallel when capacity exists.

Discovery flow

Auto-detect cluster:

curl -X POST "$SERVER_URL/cluster/detect" -H "X-Auth-Token: $AUTH_TOKEN"

Read cluster and run summary:

curl -X GET "$SERVER_URL/cluster" -H "X-Auth-Token: $AUTH_TOKEN"
curl -X GET "$SERVER_URL/wild/v2/system-health" -H "X-Auth-Token: $AUTH_TOKEN"

Use these fields from the responses:

cluster.type (local_gpu, slurm, cpu_only, ...)
cluster.gpu_count
run_summary and system-health counts of running and queued runs
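The fields above decide which section of the scheduling policy applies. A minimal sketch of that dispatch, assuming the cluster.type value has already been extracted from the response (the function name and the "unknown" fallback are illustrative and not part of the API):

```shell
# Map a cluster.type value to the policy section that applies.
# The source lists local_gpu, slurm, cpu_only and leaves other types
# open ("..."), so anything else is reported as "unknown" here.
policy_for_cluster_type() {
  case "$1" in
    local_gpu) echo "Local GPU" ;;
    slurm)     echo "Slurm" ;;
    cpu_only)  echo "CPU-only" ;;
    *)         echo "unknown" ;;
  esac
}
```

For an unrecognized type, falling back to the conservative CPU-only rules is one reasonable choice, but the protocol itself does not specify this.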

Parallel scheduling policy

Local GPU

If gpu_count = N, target at most N concurrently running GPU jobs, unless the user explicitly requests oversubscription.
Prefer one run per GPU.
Encode device selection in each run command, for example:
- CUDA_VISIBLE_DEVICES=0 ...
- CUDA_VISIBLE_DEVICES=1 ...
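The one-run-per-GPU rule can be sketched as a helper that prefixes each run command with a device index and stops once every GPU has a run. This assumes all GPUs are currently idle; the function name and the example commands are hypothetical.

```shell
# Print one device-pinned command line per run, one run per GPU.
# $1 = gpu_count, remaining arguments = run commands (hypothetical).
# Commands beyond gpu_count are left out so nothing is oversubscribed.
pin_runs_to_gpus() (
  gpu_count=$1
  shift
  i=0
  for cmd in "$@"; do
    if [ "$i" -ge "$gpu_count" ]; then
      break
    fi
    printf 'CUDA_VISIBLE_DEVICES=%d %s\n' "$i" "$cmd"
    i=$((i + 1))
  done
)
```

With gpu_count = 2 and three commands, only the first two are emitted, pinned to devices 0 and 1. The third waits for a free GPU.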

Slurm

Pass scheduler flags per run (for example --gres=gpu:1, plus partition, account, and QOS where available).
Submit multiple runs to the queue and let the scheduler place them.
Keep run commands explicit and reproducible.
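As an illustration of an explicit, reproducible Slurm command, the sketch below builds an sbatch line with one GPU per run and adds partition, account, and QOS only when they are given. The function name and argument order are assumptions. The command is printed rather than executed, so it can be used as the run command submitted through the API.

```shell
# Print an sbatch command for one run.
# $1 = run command, $2 = partition, $3 = account, $4 = qos
# Empty optional arguments are omitted from the command.
build_sbatch_cmd() (
  flags="--gres=gpu:1"
  if [ -n "$2" ]; then flags="$flags --partition=$2"; fi
  if [ -n "$3" ]; then flags="$flags --account=$3"; fi
  if [ -n "$4" ]; then flags="$flags --qos=$4"; fi
  printf 'sbatch %s --wrap="%s"\n' "$flags" "$1"
)
```

The run command is wrapped in double quotes without escaping, so this sketch only suits commands that contain no double quotes themselves.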

CPU-only

Do not force GPU flags.
Parallelize conservatively based on available cores and memory.
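One way to read "conservatively" is to take the smaller of the core-bound and memory-bound limits. The sketch below does that with integer arithmetic. The per-run core and memory figures are estimates supplied by the caller, not values reported by the API.

```shell
# Conservative number of parallel CPU runs: the smaller of the
# core-bound and memory-bound limits (integer division, rounds down).
# $1 = usable cores, $2 = usable memory in GB,
# $3 = cores per run, $4 = memory per run in GB
cpu_parallelism() (
  by_cores=$(( $1 / $3 ))
  by_mem=$(( $2 / $4 ))
  if [ "$by_mem" -lt "$by_cores" ]; then
    echo "$by_mem"
  else
    echo "$by_cores"
  fi
)
```

For example, 16 cores and 64 GB with runs that need 4 cores and 24 GB each gives min(4, 2) = 2 parallel runs. A result of 0 means a single run does not fit in the stated budget.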

Grid execution rule

For grid search, create one run per configuration using the API. Launch multiple configurations in the same iteration when capacity allows.

Do not serialize the entire grid one task at a time when idle capacity is available.
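The rule can be sketched in two steps: expand the grid into one command per configuration, then start as many as the free capacity allows in the current iteration. The training script, its flags, and both function names are hypothetical. Capacity would come from gpu_count or the CPU estimate, and the running count from system-health.

```shell
# Print one command per (learning rate, batch size) pair.
# $1 = space-separated learning rates, $2 = space-separated batch sizes
# train.py and its flags are placeholders for the real experiment.
expand_grid() (
  for lr in $1; do
    for bs in $2; do
      printf 'python train.py --lr %s --batch-size %s\n' "$lr" "$bs"
    done
  done
)

# Number of runs to start now: min(free slots, configs remaining),
# never below 0.
# $1 = capacity, $2 = currently running, $3 = configs remaining
launchable_now() (
  free=$(( $1 - $2 ))
  if [ "$free" -lt 0 ]; then free=0; fi
  if [ "$free" -gt "$3" ]; then free=$3; fi
  echo "$free"
)
```

A 3 x 2 grid yields six commands. With capacity 4 and one run already active, launchable_now 4 1 6 gives 3, so three configurations start together instead of one at a time.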

Safety and auditability

Every run must be created via POST /runs (with sweep_id).
Prefer auto_start=true when safe parallel capacity exists.
If capacity is constrained, create runs with auto_start=false and start selected runs via POST /runs/{id}/start.
Never bypass API tracking with direct local experiment execution.
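A sketch of the two API calls named above, in the same curl style as the discovery flow. The endpoints, sweep_id, and auto_start come from this protocol. The "command" field name, the JSON body layout, and the function names are assumptions, so check them against the server's actual schema.

```shell
# Build the JSON body for POST /runs.
# $1 = sweep_id, $2 = run command, $3 = auto_start (true or false)
# "command" is an assumed field name; values are not JSON-escaped.
run_payload() {
  printf '{"sweep_id": "%s", "command": "%s", "auto_start": %s}\n' "$1" "$2" "$3"
}

# Create one run through the API instead of executing it directly.
create_run() {
  curl -X POST "$SERVER_URL/runs" \
    -H "X-Auth-Token: $AUTH_TOKEN" \
    -H "Content-Type: application/json" \
    -d "$(run_payload "$1" "$2" "$3")"
}

# Start a run that was created with auto_start=false.
# $1 = run id returned by the create call
start_run() {
  curl -X POST "$SERVER_URL/runs/$1/start" -H "X-Auth-Token: $AUTH_TOKEN"
}
```

When capacity is constrained, every configuration can be created with auto_start set to false, and start_run is then called only for the runs that fit the free slots.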

related-skills.json

same repository

alert-handler.md

from "hao-ai-lab/research-agent"

System prompt for handling experiment alerts. Provides diagnosis guidance, GPU wrapper context, action suggestions, and structured response from allowed choices.

2026-02-20, stars: 17

agent-mode-research-assistant.md

from "hao-ai-lab/research-agent"

Default system prompt for agent chat mode. Provides identity, environment context, compute awareness, API-driven job submission, and workflow reflection.

2026-02-20, stars: 17

plan-mode-planning-assistant.md

from "hao-ai-lab/research-agent"

Generates a structured experiment plan with compute-aware recommendations and saves it via the plan API endpoint.

2026-02-20, stars: 17

wild-v2-steer.md

from "hao-ai-lab/research-agent"

Wraps user steering input with context signals for the model during a wild loop session

2026-02-20, stars: 17

fastvideo-model-porting-alignment.md

from "hao-ai-lab/research-agent"

Ports new models into FastVideo with strict numerical alignment to official implementations. Use when adding a FastVideo model/pipeline, porting an official or Diffusers checkpoint, or debugging parity/alignment.

2026-02-20, stars: 17

wild-v2-execution-ops-protocol.md

from "hao-ai-lab/research-agent"

Single source of truth protocol for Wild V2 preflight, sweep/run auditability, GPU discovery, and parallel scheduling

2026-02-20, stars: 17

package.json

"author": "hao-ai-lab"

"repository": "hao-ai-lab/research-agent"


$ useful --forSOC

Network and Computer Systems Administrators, Computer and Mathematical, 15-1244, L4
