
Name: Deploy Embeddings Brev
Author: NVIDIA-Omniverse

Deploy or smoke-test a Brev-hosted NVIDIA embedding NIM endpoint for Content Agents. Use when the user asks to deploy embeddings on Brev, host the llama-nemotron-embed-vl model, enable Material Agent prim clustering embeddings, test embedding sidecars, or create a Brev embedding endpoint for the collection deployment.


stars:124

forks:13

updated: May 27, 2026, 18:59

Related skills from the same repository

brev-cli.md

from "NVIDIA-Omniverse/content-agents"

Manage Brev instances safely from the Brev CLI. Use when a workflow needs to list, create, access, copy to, execute on, port-forward, stop, or delete Brev instances for Content Agents dependency endpoints.

updated: 2026-05-27, stars: 124

deploy-collection.md

from "NVIDIA-Omniverse/content-agents"

Deploy the full Content Agents package with a provider-neutral, endpoint-driven, multi-instance strategy. Use when the user asks to deploy all content agents, deploy material/physics/texture together, configure shared OVRTX rendering, choose VLM/LLM/image-gen/embedding endpoints, test the full package, or use Brev as an optional self-hosted dependency provider.

updated: 2026-05-27, stars: 124

deploy-image-gen-brev.md

from "NVIDIA-Omniverse/content-agents"

Deploy or smoke-test a Brev-hosted FLUX image-generation NIM endpoint for Content Agents. Use when the user asks to deploy image generation on Brev, host FLUX.2 Klein for Texture Agent, enable image-gen sidecars, test image-gen endpoints, or create a Brev image-gen endpoint for the collection deployment.

updated: 2026-05-27, stars: 124

deploy-material-agent-brev.md

from "NVIDIA-Omniverse/content-agents"

Deploy or smoke-test the Material Agent with Brev-hosted dependency endpoints. Use when the user asks to test material agent on Brev, deploy material agent with Brev, use RTX/L40S for rendering and A100/H100-grade GPU for VLM, run a Brev hybrid material-agent test, or recreate the Brev material-agent deployment. This workflow keeps the main pipeline local and uses Brev port-forwards to reach OVRTX and Qwen-family VLM endpoints.

updated: 2026-05-27, stars: 124

deploy-material-agent-docker.md

from "NVIDIA-Omniverse/content-agents"

Deploy the material-agent-service locally using Docker Compose with OVRTX rendering. Use when user wants to run material agent with docker, docker compose, set up local deployment, run the service locally with GPU rendering, start material agent containers, or configure VLM provider for docker deployment. Trigger phrases include "docker compose", "docker deploy", "run locally with docker", "start material agent docker", "docker up", "local deployment", "compose up".

updated: 2026-05-27, stars: 124

deploy-ovrtx-docker.md

from "NVIDIA-Omniverse/content-agents"

Deploy just the OVRTX rendering API standalone via Docker Compose so one or more agent CLIs / services can point at it as a shared rendering endpoint. Use when user wants to run OVRTX standalone, deploy a shared rendering service, separate the rendering endpoint from an agent service, expose OVRTX to multiple callers, or set RENDER_ENDPOINT to a dedicated render host. Trigger phrases include "deploy ovrtx", "run ovrtx standalone", "shared rendering service", "ovrtx rendering docker", "render endpoint deploy", "ovrtx docker compose", "standalone rendering".

updated: 2026-05-27, stars: 124

package.json

"author": "NVIDIA-Omniverse"

"repository": "NVIDIA-Omniverse/content-agents"


name	deploy-embeddings-brev
description	Deploy or smoke-test a Brev-hosted NVIDIA embedding NIM endpoint for Content Agents. Use when the user asks to deploy embeddings on Brev, host the llama-nemotron-embed-vl model, enable Material Agent prim clustering embeddings, test embedding sidecars, or create a Brev embedding endpoint for the collection deployment.
version	0.1.0
author	NVIDIA Content Agents
tags	["content-agents","brev","embeddings","deployment"]
tools	["Shell","Docker","Python","curl","Filesystem","brev","ssh"]
compatibility	Requires Brev CLI access, Docker on the remote GPU instance, NGC_API_KEY for NIM pulls, optional NVIDIA_API_KEY for hosted endpoint tests, and local port-forward access.

Deploy Embeddings With Brev

When to Use

Use this skill for a standalone embedding endpoint that local Content Agent services can reach through an explicit URL.

The golden path mirrors the validated OVRTX Brev path:

Brev type: g6e.xlarge
Instance name: content-embeddings
GPU: one L40S
Image: nvcr.io/nim/nvidia/llama-nemotron-embed-vl-1b-v2:1.12.0
Remote host port: 8004
Local Docker-reachable forward: 0.0.0.0:8014 -> content-embeddings:8004
Collection endpoint: http://host.docker.internal:8014/v1

Use a separate Brev instance from OVRTX unless the user explicitly asks for a single-node fallback. Keep agents CPU-only; this skill only hosts the embedding model sidecar.
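The golden-path values above fit together as follows. This is a minimal Python sketch, assuming only the ports and host names listed in this section; the `EmbeddingPath` class and its method names are hypothetical and are not part of the Content Agents repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingPath:
    """Golden-path values from this skill; the class itself is hypothetical."""

    instance: str = "content-embeddings"
    remote_port: int = 8004
    local_port: int = 8014
    docker_host: str = "host.docker.internal"

    def ssh_forward(self) -> str:
        # Bind on all interfaces so local Docker containers can reach the forward.
        return f"0.0.0.0:{self.local_port}:localhost:{self.remote_port}"

    def collection_endpoint(self) -> str:
        # URL that agent containers use to reach the forwarded NIM.
        return f"http://{self.docker_host}:{self.local_port}/v1"


path = EmbeddingPath()
print(path.ssh_forward())
print(path.collection_endpoint())
```

Keeping the two ports in one place makes it harder for the SSH forward and the collection endpoint to drift apart.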

Limitations

Keep NGC and NVIDIA credentials in .env or remote env files with restrictive permissions; never print or commit them.
Use a separate Brev instance from OVRTX unless the user explicitly asks for a single-node fallback.
Delete the GPU node after validation unless the user asks to keep it.
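One way to honor the restrictive-permissions rule when an env file is written from Python is sketched below. The helper name `write_private_env` and the placeholder value are assumptions for illustration; the sketch assumes a POSIX filesystem.

```python
import os
import stat
import tempfile


def write_private_env(path: str, values: dict[str, str]) -> None:
    """Write KEY=VALUE lines to path with owner-only read/write permissions."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        for key, value in values.items():
            handle.write(f"{key}={value}\n")
    # Tighten the mode in case the file already existed with looser bits.
    os.chmod(path, 0o600)


# Demo in a throwaway directory with a placeholder, never a real key.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, ".env")
write_private_env(demo_path, {"NGC_API_KEY": "placeholder-not-a-real-key"})
mode = stat.S_IMODE(os.stat(demo_path).st_mode)
print(oct(mode))
```

The sketch prints only the file mode, not the contents, in keeping with the rule against printing credentials.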

Prerequisites

Brev CLI access and an SSH-ready g6e.xlarge or equivalent L40S GPU node.
Docker with enough writable storage for the embedding NIM image and cache.
NGC_API_KEY for the container pull and optional NVIDIA_API_KEY for hosted endpoint compatibility checks.

Instructions

Use brev-cli for generic Brev inventory, dry-run, create, port-forward, stop, delete, and cleanup guardrails.
Create or reuse the Brev node only after the dry-run looks acceptable.
Qualify storage, GPU, and Docker cache location before pulling NIM images.
Copy non-secret repo contents, authenticate to NGC, and start the NIM.
Forward the endpoint, run readiness and embedding smoke tests, wire the collection config, and clean up.

Credit Safety

Start with inventory and a dry-run:

brev ls --json
brev create content-embeddings --dry-run --type g6e.xlarge --min-disk 500

Only after the instance type and spend are acceptable:

brev create content-embeddings --type g6e.xlarge --min-disk 500 --timeout 1200

Wait for shell readiness before copying files or starting containers.

Qualify Storage And GPU

Check that Docker has enough writable disk for the NIM image and model cache:

brev exec content-embeddings \
  "df -h / /opt/dlami/nvme /mnt/* 2>/dev/null || true; \
   lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINTS; \
   docker info --format 'DockerRoot={{.DockerRootDir}}'; \
   nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv,noheader"

On the validated AWS DLAMI-style image, use the ephemeral NVMe for Docker:

brev exec content-embeddings \
  "set -e; \
   sudo mkdir -p /opt/dlami/nvme/docker && \
   sudo env DOCKER_DATA_ROOT=/opt/dlami/nvme/docker python3 - <<'PY'
import json
import os
from pathlib import Path

path = Path('/etc/docker/daemon.json')
text = path.read_text() if path.exists() else ''
data = json.loads(text) if text.strip() else {}
if not isinstance(data, dict):
    raise SystemExit(f'{path} must contain a JSON object')
data['data-root'] = os.environ['DOCKER_DATA_ROOT']
path.write_text(json.dumps(data, indent=2) + '\n')
PY
   sudo nvidia-ctk runtime configure --runtime=docker --set-as-default && \
   sudo systemctl restart docker && \
   docker info --format 'DockerRoot={{.DockerRootDir}}'"

If the requested disk is not reflected in any writable filesystem, stop and choose another provider/type before pulling the NIM image.
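The stop-or-continue decision above can be sketched in Python, assuming the check runs on the GPU node. The function name `pick_docker_root`, the candidate list, and the 100 GiB threshold are hypothetical; pick a threshold that fits the image and model cache you actually pull.

```python
import shutil
from typing import Callable, Iterable, Optional

GIB = 1024 ** 3


def pick_docker_root(
    candidates: Iterable[str],
    min_free_gib: float,
    free_bytes: Callable[[str], int] = lambda p: shutil.disk_usage(p).free,
) -> Optional[str]:
    """Return the first candidate path with enough free space, else None."""
    for path in candidates:
        try:
            free = free_bytes(path)
        except OSError:
            continue  # mount point does not exist on this image
        if free >= min_free_gib * GIB:
            return path
    return None


# Demo with synthetic numbers instead of a real `df` reading.
fake = {"/": 40 * GIB, "/opt/dlami/nvme": 800 * GIB}


def fake_free(path: str) -> int:
    if path not in fake:
        raise FileNotFoundError(path)
    return fake[path]


chosen = pick_docker_root(["/mnt/data", "/", "/opt/dlami/nvme"], 100, fake_free)
none_found = pick_docker_root(["/", "/opt/dlami/nvme"], 1000, fake_free)
print(chosen, none_found)
```

A `None` result corresponds to the "stop and choose another provider/type" branch.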

Copy The Repo

Copy only non-secret repo contents. Prefer a committed git archive when the embedding Compose file already exists on the current branch:

git archive --format=tar.gz -o /tmp/content-agents-collection.tar.gz HEAD
ssh content-embeddings 'rm -rf ~/collection-deployment && mkdir -p ~/collection-deployment'
scp /tmp/content-agents-collection.tar.gz \
  content-embeddings:~/collection-deployment/
ssh content-embeddings \
  'cd ~/collection-deployment && \
   tar -xzf content-agents-collection.tar.gz && \
   rm content-agents-collection.tar.gz'

For uncommitted testing, use rsync with the same secret/cache exclusions used by the Brev agent skills.

NGC Auth

Keep secrets in the local repo-root .env; never print or commit them.

set -a
source .env
set +a
printf 'NGC_API_KEY=%s\nNVIDIA_API_KEY=%s\n' "$NGC_API_KEY" "$NVIDIA_API_KEY" | \
  ssh content-embeddings 'umask 077; cat > ~/collection-deployment/.env'
printf '%s\n' "$NGC_API_KEY" | \
  ssh content-embeddings \
    'docker login nvcr.io -u \$oauthtoken --password-stdin'

NVIDIA_API_KEY is optional for local no-auth NIM use, but copying it is useful when the same remote .env is reused for hosted endpoint tests.
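To confirm which keys are present without echoing their values, a small status check can run before the copy. This is a sketch with a deliberately simplified `.env` parser (no `export` prefixes, no multi-line values); `parse_env` and `key_status` are hypothetical helper names.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def key_status(
    values: dict[str, str],
    required=("NGC_API_KEY",),
    optional=("NVIDIA_API_KEY",),
) -> dict[str, str]:
    """Report presence only; key values are never returned."""
    status = {}
    for name in required:
        status[name] = "set" if values.get(name) else "MISSING (required)"
    for name in optional:
        status[name] = "set" if values.get(name) else "unset (optional)"
    return status


sample = "# placeholder values only\nNGC_API_KEY='placeholder'\nNVIDIA_API_KEY=\n"
status = key_status(parse_env(sample))
print(status)
```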

Start The Embedding NIM

ssh content-embeddings \
  'cd ~/collection-deployment && \
   COLLECTION_EMBEDDINGS_PORT=8004 \
   docker compose -f deploy/collection/docker-compose.embeddings.yml up -d'

Poll readiness:

ssh content-embeddings \
  'for i in $(seq 1 80); do \
     echo attempt=$i status=$(docker inspect -f "{{.State.Health.Status}}" content-embeddings-nim 2>/dev/null || echo missing); \
     curl -fsS http://localhost:8004/v1/health/ready && exit 0; \
     docker logs --tail 30 content-embeddings-nim || true; \
     sleep 30; \
   done; exit 1'
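The same bounded polling can be expressed in Python when the check runs from a script instead of an SSH one-liner. This is a sketch under the loop parameters shown above (80 attempts, 30 seconds apart); `http_ready` and `wait_until_ready` are hypothetical names, and the demo uses a fake probe so no endpoint is needed.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ready(url: str, timeout: float = 5.0) -> bool:
    """Return True when the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    attempts: int = 80,
    delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Return the 1-based attempt that succeeded, or 0 if none did."""
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay)
    return 0


# Demo: a fake probe that succeeds on the third call, with sleeps recorded.
outcomes = iter([False, False, True])
delays = []
attempt = wait_until_ready(lambda: next(outcomes), attempts=5, sleep=delays.append)
print(attempt, delays)
```

For a real check, pass `lambda: http_ready("http://localhost:8004/v1/health/ready")` as the probe on the node, or the forwarded 8014 URL locally.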

Check model listing:

ssh content-embeddings 'curl -fsS http://localhost:8004/v1/models'
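To turn the model listing into a pass/fail check, the response can be searched for the expected model ID. The sketch assumes the listing follows the OpenAI-style shape (`{"data": [{"id": ...}]}`); the sample payload below is synthetic, not captured output, and `served_model_ids` is a hypothetical helper.

```python
EXPECTED_MODEL = "nvidia/llama-nemotron-embed-vl-1b-v2"


def served_model_ids(models_response: dict) -> list[str]:
    """Collect model IDs from an OpenAI-style model listing."""
    return [
        entry["id"]
        for entry in models_response.get("data", [])
        if isinstance(entry, dict) and "id" in entry
    ]


# Synthetic example of the assumed response shape.
sample_listing = {"object": "list", "data": [{"id": EXPECTED_MODEL, "object": "model"}]}
ids = served_model_ids(sample_listing)
model_present = EXPECTED_MODEL in ids
print(model_present)
```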

Local Forward And Smoke

For local CLI-only tests, brev port-forward is sufficient:

brev port-forward content-embeddings -p 8014:8004
curl -fsS http://localhost:8014/v1/health/ready

For local Docker agent containers, bind the SSH forward to all host interfaces so host.docker.internal can reach it:

ssh -N -o ExitOnForwardFailure=yes \
  -L 0.0.0.0:8014:localhost:8004 content-embeddings
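A plain TCP connect is a quick way to confirm that the forward is listening before blaming the model service. The sketch below is a generic socket check, with `port_open` as a hypothetical helper; the demo opens a temporary listener on the loopback interface so it can run without Brev.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Demo: a temporary loopback listener stands in for the SSH forward.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
demo_port = listener.getsockname()[1]
while_listening = port_open("127.0.0.1", demo_port)
listener.close()
after_close = port_open("127.0.0.1", demo_port)
print(while_listening, after_close)
```

From inside an agent container, the equivalent call would be `port_open("host.docker.internal", 8014)`.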

Smoke-test image embedding inference:

python3 - <<'PY'
import json
import urllib.request

payload = {
    "model": "nvidia/llama-nemotron-embed-vl-1b-v2",
    "input": [
        "data:image/png;base64,"
        "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0l"
        "EQVR42mP8/x8AAwMCAO+/p9sAAAAASUVORK5CYII="
    ],
    "encoding_format": "float",
    "modality": ["image"],
    "input_type": "passage",
    "truncate": "NONE",
}
request = urllib.request.Request(
    "http://localhost:8014/v1/embeddings",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(request, timeout=60) as response:
    data = json.loads(response.read().decode("utf-8"))
print({"embedding_dim": len(data["data"][0]["embedding"])})
PY

Expected embedding dimension: 2048.
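The dimension expectation can be enforced rather than read by eye. This sketch assumes the response shape used by the smoke test above (`data[i]["embedding"]`); `check_dims` is a hypothetical helper and the demo response is synthetic zeros, not model output.

```python
EXPECTED_DIM = 2048


def check_dims(response: dict, expected: int = EXPECTED_DIM) -> list[int]:
    """Return per-item embedding lengths, raising if any differ from expected."""
    dims = [len(item["embedding"]) for item in response.get("data", [])]
    if not dims:
        raise ValueError("response contained no embeddings")
    wrong = [dim for dim in dims if dim != expected]
    if wrong:
        raise ValueError(f"unexpected embedding dimensions: {wrong}")
    return dims


# Synthetic response with the expected shape.
fake_response = {"data": [{"index": 0, "embedding": [0.0] * EXPECTED_DIM}]}
dims = check_dims(fake_response)
print(dims)
```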

Collection Wiring

In deploy/collection/collection.yaml or an ignored test config:

dependencies:
  embeddings:
    enabled: true
    provider: brev
    endpoint: http://host.docker.internal:8014/v1
    backend: nim
    model: nvidia/llama-nemotron-embed-vl-1b-v2
    api_key: not-used

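Then run the deployment against that config (an ignored test config, deploy/collection/.brev-test.yaml, is shown here):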

./deploy/collection/deploy.py -c deploy/collection/.brev-test.yaml plan
./deploy/collection/deploy.py -c deploy/collection/.brev-test.yaml up
./deploy/collection/deploy.py -c deploy/collection/.brev-test.yaml smoke
docker exec content-material-agent-service sh -lc \
  'curl -fsS "$MA_CLUSTER_EMBEDDING_BASE_URL/health/ready"'
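The readiness URL probed in the last command is the embedding base URL with `/health/ready` appended. A minimal sketch of that derivation follows, assuming `MA_CLUSTER_EMBEDDING_BASE_URL` carries the same `/v1` base as the collection endpoint; `health_url` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def health_url(endpoint: str) -> str:
    """Validate an endpoint URL and append the readiness path."""
    parts = urlsplit(endpoint)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not an http(s) endpoint: {endpoint!r}")
    # Strip a trailing slash so the joined path has exactly one separator.
    return endpoint.rstrip("/") + "/health/ready"


ready = health_url("http://host.docker.internal:8014/v1")
print(ready)
```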

Output Format

Return the Brev instance name, remote and local endpoint URLs, model ID, smoke test result, collection wiring values, and cleanup command.
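One possible shape for that summary is sketched below as a dataclass serialized to JSON. The field names are hypothetical and the values simply repeat the golden-path settings from this skill; the skill does not prescribe a schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class EmbeddingDeploymentReport:
    """Hypothetical container for the items listed under Output Format."""

    instance: str
    remote_endpoint: str
    local_endpoint: str
    model_id: str
    smoke_test: str
    collection_wiring: dict
    cleanup_command: str


report = EmbeddingDeploymentReport(
    instance="content-embeddings",
    remote_endpoint="http://localhost:8004/v1",
    local_endpoint="http://host.docker.internal:8014/v1",
    model_id="nvidia/llama-nemotron-embed-vl-1b-v2",
    smoke_test="embedding_dim=2048",
    collection_wiring={"provider": "brev", "backend": "nim"},
    cleanup_command="brev delete content-embeddings",
)
report_json = json.dumps(asdict(report), indent=2)
print(report_json)
```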

Troubleshooting

If readiness fails, report the container health status and recent NIM logs.
If embedding inference fails, include the HTTP status, response body summary, and whether the port-forward or model service is the likely blocker.
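A rough heuristic for naming the likely blocker: if the server returned any HTTP status, the forward is working and the model service is the suspect; if the connection itself failed, look at the port-forward first. The sketch below encodes that heuristic for `urllib` errors; `likely_blocker` is a hypothetical helper and the exceptions in the demo are constructed by hand.

```python
import urllib.error


def likely_blocker(exc: Exception) -> str:
    """Map a failed request to the component most likely at fault."""
    # HTTPError subclasses URLError, so it must be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        return f"model service (HTTP {exc.code})"
    if isinstance(exc, (urllib.error.URLError, OSError)):
        return "port-forward or network path"
    return "unknown"


# Hand-built exceptions standing in for real request failures.
refused = urllib.error.URLError(ConnectionRefusedError(111, "Connection refused"))
unavailable = urllib.error.HTTPError(
    "http://localhost:8014/v1/embeddings", 503, "Service Unavailable", None, None
)
refused_verdict = likely_blocker(refused)
unavailable_verdict = likely_blocker(unavailable)
print(refused_verdict)
print(unavailable_verdict)
```

Treat the verdict as a starting point; container health status and recent NIM logs remain the primary evidence.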

Cleanup

Stop local port-forward processes and delete the embedding node after the test unless the user explicitly wants to keep it:

brev delete content-embeddings

If Brev deletion is blocked but SSH still works, power off the instance to stop compute spend while control-plane cleanup is retried:

ssh content-embeddings 'sudo poweroff'
