graphsignal-sdk

Name: Graphsignal Sdk
Author: graphsignal

// Set up and integrate Graphsignal inference observability into Python applications, vLLM, SGLang, and dstack services. Use when the user wants to add profiling, tracing, or monitoring to inference workloads, needs help configuring the Graphsignal SDK, or asks about graphsignal-run, CUPTI, or GPU profiling setup.


stars:205

forks:11

updated: April 27, 2026, 12:46


package.json

"author": "graphsignal"

"repository": "graphsignal/graphsignal-python"

name: graphsignal-sdk
description: Set up and integrate Graphsignal inference observability into Python applications, vLLM, SGLang, and dstack services. Use when the user wants to add profiling, tracing, or monitoring to inference workloads, needs help configuring the Graphsignal SDK, or asks about graphsignal-run, CUPTI, or GPU profiling setup.

Graphsignal SDK

Graphsignal captures performance profiles, traces, metrics, and errors for inference workloads. Auto-instrumentation covers vLLM, SGLang, and PyTorch out of the box.

Install

pip install -U graphsignal

For CUPTI-based GPU profiling on Linux, install the extra matching the CUDA version:

pip install 'graphsignal[cu12]'   # CUDA 12.x
pip install 'graphsignal[cu13]'   # CUDA 13.x
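
When the install step is scripted (for example in an image build), the extra can be picked from the detected CUDA version. The following is a minimal sketch; `graphsignal_requirement` is a hypothetical helper, and it maps only the two extras listed above.

```python
def graphsignal_requirement(cuda_version=None):
    """Return the pip requirement string for a CUDA version such as "12.4".

    Passing None returns the plain package without a CUPTI extra.
    """
    if cuda_version is None:
        return "graphsignal"
    major = cuda_version.strip().split(".")[0]
    # Only the extras named in this document are mapped; anything else is rejected.
    extras = {"12": "cu12", "13": "cu13"}
    if major not in extras:
        raise ValueError(f"no Graphsignal CUPTI extra listed for CUDA {cuda_version}")
    return f"graphsignal[{extras[major]}]"
```

The returned string can be passed to `pip install` as a single quoted argument.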

Configure

Option A: In Python code

import graphsignal

graphsignal.configure(api_key='my-api-key')

All configure() args can be set via env vars: GRAPHSIGNAL_API_KEY, GRAPHSIGNAL_DEBUG_MODE, etc.

configure() parameters:

| Arg | Env var | Purpose |
| --- | --- | --- |
| `api_key` | `GRAPHSIGNAL_API_KEY` | API key (required) |
| `api_base` | `GRAPHSIGNAL_API_BASE` | On-premise server URL |
| `tags` | `GRAPHSIGNAL_TAG_{KEY}` | Process-level tags |
| `auto_instrument` | `GRAPHSIGNAL_AUTO_INSTRUMENT` | Auto-instrument libraries (default `True`) |
| `debug_mode` | `GRAPHSIGNAL_DEBUG_MODE` | Enable debug logging |
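
To make the naming convention in the table concrete, here is a sketch of how those environment variables line up with `configure()` arguments. The SDK reads the variables itself, so this helper is not needed in practice; `configure_kwargs_from_env` is hypothetical, and the lowercasing of tag keys and the accepted boolean spellings are assumptions, not documented behavior.

```python
import os

TAG_PREFIX = "GRAPHSIGNAL_TAG_"


def _as_bool(value):
    # Assumption: "1", "true" and "yes" count as true (case-insensitive).
    return value.strip().lower() in ("1", "true", "yes")


def configure_kwargs_from_env(environ=None):
    """Collect configure()-style keyword arguments from GRAPHSIGNAL_* variables."""
    environ = os.environ if environ is None else environ
    kwargs = {}
    for arg in ("api_key", "api_base"):
        name = "GRAPHSIGNAL_" + arg.upper()
        if name in environ:
            kwargs[arg] = environ[name]
    for arg in ("auto_instrument", "debug_mode"):
        name = "GRAPHSIGNAL_" + arg.upper()
        if name in environ:
            kwargs[arg] = _as_bool(environ[name])
    # GRAPHSIGNAL_TAG_{KEY}=value becomes tags[key] = value (key lowercased by assumption).
    tags = {
        key[len(TAG_PREFIX):].lower(): value
        for key, value in environ.items()
        if key.startswith(TAG_PREFIX) and len(key) > len(TAG_PREFIX)
    }
    if tags:
        kwargs["tags"] = tags
    return kwargs
```

For example, `GRAPHSIGNAL_TAG_DEPLOYMENT=prod` would show up here as `tags={'deployment': 'prod'}`.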

Option B: graphsignal-run CLI

Wrap any command; no code changes are needed:

export GRAPHSIGNAL_API_KEY="..."
graphsignal-run <my-app>
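
If the server is started from a Python launcher script rather than a shell, the same wrapping can be done with `subprocess`. This is a sketch under that assumption; `wrapped_command` and `launch` are hypothetical helpers, and only the `graphsignal-run` prefix and the `GRAPHSIGNAL_API_KEY` variable come from the text above.

```python
import os
import shlex
import subprocess


def wrapped_command(command):
    """Prefix a command (a list of argv strings) with graphsignal-run."""
    if not command:
        raise ValueError("command must not be empty")
    return ["graphsignal-run", *command]


def launch(command, api_key, dry_run=False):
    """Start the wrapped command with the API key set in its environment.

    With dry_run=True, return the shell-quoted command line instead of running it.
    """
    argv = wrapped_command(command)
    if dry_run:
        return shlex.join(argv)
    env = dict(os.environ, GRAPHSIGNAL_API_KEY=api_key)
    return subprocess.Popen(argv, env=env)
```

Calling `launch([...], api_key, dry_run=True)` is a cheap way to inspect the final command line before starting a long-running server.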

Integrate with vLLM

Graphsignal automatically instruments vLLM (engine, scheduler, KV cache, attention, output processing, Prometheus metrics).

In a Python app

import graphsignal
graphsignal.configure(api_key='my-api-key')
# then use vLLM normally

vLLM server via graphsignal-run

export GRAPHSIGNAL_API_KEY="..."
graphsignal-run vllm serve Qwen/Qwen1.5-7B-Chat --port 8000

vLLM Docker

docker run --gpus all \
  -p 8000:8000 --ipc=host \
  -e GRAPHSIGNAL_API_KEY=YOUR_API_KEY \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --entrypoint bash \
  vllm/vllm-openai:latest \
  -lc 'pip install --no-cache-dir graphsignal[cu12] \
       && exec graphsignal-run vllm serve \
           Qwen/Qwen2-VL-7B-Instruct \
           --trust-remote-code'

Integrate with SGLang

Graphsignal automatically instruments SGLang (operations, OTEL spans, Prometheus metrics).

In a Python app

import graphsignal
graphsignal.configure(api_key='my-api-key')
# then use SGLang normally

SGLang server via graphsignal-run

export GRAPHSIGNAL_API_KEY="..."
graphsignal-run sglang serve \
  --model-path Qwen/Qwen1.5-7B-Chat \
  --port 8000

SGLang Docker

docker run --gpus all \
  -p 8000:8000 --ipc=host \
  -e GRAPHSIGNAL_API_KEY=YOUR_API_KEY \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --entrypoint bash \
  your-sglang-image:latest \
  -lc 'pip install --no-cache-dir graphsignal[cu12] \
       && exec graphsignal-run sglang serve \
           --model-path Qwen/Qwen2.5-1.5B-Instruct \
           --port 8000'

Integrate via dstack

dstack runs inference workloads as services. Wrap the server command with graphsignal-run, the same way as on bare metal.

dstack service config (SGLang example)

type: service
name: deepseek-r1

image: lmsysorg/sglang:latest
env:
  - MODEL_ID=deepseek-ai/DeepSeek-R1-Distill-Llama-8B
  - GRAPHSIGNAL_API_KEY

commands:
  - |
    pip install --no-cache-dir 'graphsignal[cu12]' && \
    graphsignal-run python3 -m sglang.launch_server \
      --model-path $MODEL_ID \
      --port 8000 \
      --trust-remote-code

port: 8000
model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B

resources:
  gpu: 24GB

Deploy:

dstack apply -f service.dstack.yml

If your Docker image already includes Graphsignal, skip the pip install step. If you use the sglang CLI instead of launch_server, wrap it the same way: graphsignal-run sglang serve ...

Manual Tracing

For code not auto-instrumented, use trace() or @trace_function:

with graphsignal.trace('inference') as span:
    span.set_counter('prompt_tokens', prompt_tokens)
    span.set_counter('completion_tokens', completion_tokens)

    with span.trace('model-cold-boot') as sub_span:
        ...

@graphsignal.trace_function
def my_function():
    ...
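
The span pattern above can be wrapped around a model call so that token counts are recorded on every request. The sketch below takes the tracer as a parameter so it can be exercised without the SDK; `traced_generate`, `RecordingSpan`, `make_recording_trace`, and the `generate` callable are all hypothetical, and only `trace()` and `set_counter()` are used as shown in this document.

```python
import contextlib


def traced_generate(trace, generate, prompt):
    """Run generate(prompt) inside an 'inference' span and record token counts.

    `trace` should behave like graphsignal.trace as used above. `generate` is a
    hypothetical callable returning (text, prompt_tokens, completion_tokens).
    """
    with trace('inference') as span:
        text, prompt_tokens, completion_tokens = generate(prompt)
        span.set_counter('prompt_tokens', prompt_tokens)
        span.set_counter('completion_tokens', completion_tokens)
        return text


class RecordingSpan:
    """Stand-in span for local testing; stores counters in a list."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def set_counter(self, name, value):
        self.log.append((self.name, name, value))


def make_recording_trace(log):
    """Build a trace()-like context manager that yields RecordingSpan objects."""
    @contextlib.contextmanager
    def trace(name):
        yield RecordingSpan(name, log)
    return trace
```

In an application, `graphsignal.trace` would be passed as `trace` after `graphsignal.configure(...)` has run; the recording stand-in is only for unit tests.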

Record counter metrics within a trace:

with graphsignal.trace('my-function') as span:
    span.inc_counter_metric('call_cost', price_per_call)

Manual Profiling

Profile specific Python functions (requires Python 3.12+):

graphsignal.profile_function(func=slow_transform, category='transform', op_name='data-transform')

Profile by import path (avoids direct import):

graphsignal.profile_function_path(path='myapp.tasks.prepare_data', category='preprocessing')

Profile CUDA kernels by pattern (Linux, CUPTI required):

graphsignal.profile_cuda_kernel(kernel_pattern="cublas", op_name="matmul_gemm")
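
Because these calls have platform requirements (Python 3.12+ for function profiling, Linux for CUDA kernel profiling), a startup hook may want to guard them. The following sketch assumes that approach; `register_profiles` is a hypothetical helper, `gs` stands for the imported `graphsignal` module, and the check does not verify that CUPTI is actually installed.

```python
import sys


def register_profiles(gs, func, version_info=None, platform=None):
    """Register profiling hooks only where the stated requirements are met.

    Returns the names of the calls that were made.
    """
    version_info = sys.version_info if version_info is None else version_info
    platform = sys.platform if platform is None else platform
    registered = []
    # Function profiling is documented above as requiring Python 3.12+.
    if tuple(version_info[:2]) >= (3, 12):
        gs.profile_function(func=func, category='transform', op_name='data-transform')
        registered.append('profile_function')
    # CUDA kernel profiling is documented above as Linux-only (CUPTI required).
    if platform.startswith('linux'):
        gs.profile_cuda_kernel(kernel_pattern='cublas', op_name='matmul_gemm')
        registered.append('profile_cuda_kernel')
    return registered
```

The `version_info` and `platform` parameters exist only so the guard can be tested on any machine; in normal use they are left at their defaults.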

Troubleshooting

Enable debug logging: graphsignal.configure(debug_mode=True) or GRAPHSIGNAL_DEBUG_MODE=true.

Ensure outgoing connections to https://api.graphsignal.com are allowed.
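
A quick way to check outbound connectivity from the host or container is a plain TCP connection attempt. This is a minimal sketch using only the standard library; `can_reach` is a hypothetical helper, and it tests TCP reachability only (no TLS handshake, no proxy handling), so a passing check does not rule out every network issue.

```python
import socket
from urllib.parse import urlsplit


def can_reach(url="https://api.graphsignal.com", timeout=5.0,
              connect=socket.create_connection):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(url)
    # Fall back to the scheme's default port when the URL has none.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        conn = connect((parts.hostname, port), timeout=timeout)
    except OSError:
        return False
    conn.close()
    return True
```

The `connect` parameter is there so the check can be tested without network access; in normal use it is left at its default.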

Reference

Full Python API: https://graphsignal.com/docs/reference/python-api/
vLLM integration: https://graphsignal.com/docs/integrations/vllm/
SGLang integration: https://graphsignal.com/docs/integrations/sglang/
dstack integration: https://graphsignal.com/docs/integrations/dstack/

