vector-database-ops

Deploy, manage, and optimize vector databases for AI applications. Covers Qdrant, Weaviate, pgvector, and Pinecone — collection management, indexing strategies, backup, and performance tuning for production RAG and semantic search workloads.

Stars: 30

Forks: 4

Last updated: 22 May 2026 at 13:02

Source: BagelHole/DevOps-Security-Agent-Skills

name	vector-database-ops
description	Deploy, manage, and optimize vector databases for AI applications. Covers Qdrant, Weaviate, pgvector, and Pinecone — collection management, indexing strategies, backup, and performance tuning for production RAG and semantic search workloads.
license	MIT
metadata	{"author":"devops-skills","version":"1.0"}

Vector Database Operations

Run production vector databases for AI-powered search, RAG, and recommendation systems.

When to Use This Skill

Use this skill when:

Setting up a vector database for a RAG or semantic search application
Choosing between Qdrant, Weaviate, pgvector, or Pinecone
Managing collections, indexes, and data migrations
Optimizing query performance and indexing for production loads
Implementing multi-tenant vector search with namespace isolation

Vector Database Comparison

Database	Best For	Hosting	Filtering	Scale
Qdrant	High-performance, rich filtering, self-hosted	Self / Cloud	Excellent	Very High
Weaviate	Schema-first, hybrid search, multi-modal	Self / Cloud	Good	High
pgvector	Already on Postgres, simple use cases	Self	Good	Medium
Pinecone	Zero-ops managed, serverless	Managed only	Good	Very High
Chroma	Local dev, prototyping	Self only	Basic	Low-Medium

Qdrant — Production Deployment

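# Docker (single node); 6333 is the REST port, 6334 the gRPC port.
# Pin a specific image tag instead of "latest" for production.
docker run -d \
  --name qdrant \
  -p 6333:6333 \
  -p 6334:6334 \
  -v "$(pwd)/qdrant-data:/qdrant/storage" \
  qdrant/qdrant:latest

# Alternative to the command above: same container with a custom config mounted.
# Run one or the other; both use the container name "qdrant" and host port 6333.
docker run -d \
  --name qdrant \
  -p 6333:6333 \
  -v "$(pwd)/qdrant-data:/qdrant/storage" \
  -v "$(pwd)/qdrant-config.yaml:/qdrant/config/production.yaml" \
  qdrant/qdrant:latest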

# qdrant-config.yaml
# Index and quantization defaults belong under "storage"; settings passed at
# collection creation override them.
storage:
  storage_path: /qdrant/storage
  on_disk_payload: true          # store payload on disk (saves RAM)

  hnsw_index:
    m: 16                        # graph connections per node
    ef_construct: 100            # accuracy vs build time trade-off
    full_scan_threshold: 10000   # prefer a full scan below this size (KB of vectors)

  collection:
    quantization:
      scalar:
        type: int8
        quantile: 0.99
        always_ram: true         # keep quantized vectors in RAM

service:
  max_request_size_mb: 32

telemetry_disabled: true

Qdrant Collection Management

from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, VectorParams, HnswConfigDiff,
    ScalarQuantization, ScalarQuantizationConfig, ScalarType,
)

client = QdrantClient("http://localhost:6333")

# Create optimized collection
client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(
        size=1536,                         # text-embedding-ada-002 / text-embedding-3-small
        distance=Distance.COSINE,
        on_disk=True,                      # save RAM: original vectors stored on disk
    ),
    hnsw_config=HnswConfigDiff(
        m=32,                              # higher = better recall, more RAM
        ef_construct=200,
        on_disk=False,                     # keep HNSW graph in RAM for speed
    ),
    # QuantizationConfig is a union type and cannot be instantiated;
    # ScalarQuantization is the concrete wrapper for scalar settings.
    quantization_config=ScalarQuantization(
        scalar=ScalarQuantizationConfig(
            type=ScalarType.INT8,
            quantile=0.99,
            always_ram=True,               # quantized vectors stay in RAM
        )
    ),
)

# Create payload index for fast filtering
client.create_payload_index(
    collection_name="documents",
    field_name="tenant_id",
    field_schema="keyword",
)
client.create_payload_index(
    collection_name="documents",
    field_name="created_at",
    field_schema="datetime",
)

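# Collection info (points_count is used because vectors_count is deprecated
# and may be missing in recent client versions)
info = client.get_collection("documents")
print(f"Points: {info.points_count}, Status: {info.status}")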

Qdrant Filtered Search

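from qdrant_client.models import DatetimeRange, FieldCondition, Filter, MatchValue

# Tenant-isolated search (multi-tenant RAG)
# query_embedding: list of 1536 floats from the same model used at ingest time
results = client.query_points(
    collection_name="documents",
    query=query_embedding,
    query_filter=Filter(
        # must: every condition has to match
        must=[
            FieldCondition(key="tenant_id", match=MatchValue(value="acme-corp")),
            FieldCondition(key="doc_type", match=MatchValue(value="contract")),
        ],
        # should: at least one condition has to match, so a single entry acts
        # as a hard requirement. Range takes numeric bounds only; datetime
        # bounds need DatetimeRange.
        should=[
            FieldCondition(key="created_at", range=DatetimeRange(gte="2024-01-01T00:00:00Z")),
        ],
    ),
    limit=10,
    with_payload=True,
)

# query_points returns a response object; the hits are in .points
for point in results.points:
    print(point.id, point.score, point.payload)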
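To make the clause semantics concrete, here is a minimal pure-Python sketch of how `must` and `should` combine. It assumes exact-match and lower-bound conditions only. The helpers `field_equals`, `field_gte`, and `matches` are hypothetical illustrations, not part of `qdrant_client`; they mirror the documented behaviour that every `must` condition has to hold and, when `should` is present, at least one of its conditions has to hold too.

```python
from typing import Any, Callable, Dict, Sequence

Payload = Dict[str, Any]
Condition = Callable[[Payload], bool]


def field_equals(key: str, value: Any) -> Condition:
    """Hypothetical stand-in for an exact-match field condition."""
    return lambda payload: payload.get(key) == value


def field_gte(key: str, bound: str) -> Condition:
    """Hypothetical stand-in for a lower-bound range condition.

    ISO-8601 strings in the same format compare correctly as plain text.
    """
    return lambda payload: key in payload and payload[key] >= bound


def matches(
    payload: Payload,
    must: Sequence[Condition] = (),
    should: Sequence[Condition] = (),
) -> bool:
    """Return True if all `must` conditions hold and, when `should` is
    non-empty, at least one `should` condition holds."""
    all_must = all(cond(payload) for cond in must)
    any_should = any(cond(payload) for cond in should) if should else True
    return all_must and any_should
```

With the filter from the search above, a 2023 contract for `acme-corp` is rejected even though both `must` conditions match, because the single `should` condition fails.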

pgvector — PostgreSQL Extension

-- Enable extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create table with vector column
CREATE TABLE documents (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    content     TEXT NOT NULL,
    embedding   VECTOR(1536),
    metadata    JSONB DEFAULT '{}',
    tenant_id   TEXT NOT NULL,
    created_at  TIMESTAMPTZ DEFAULT NOW()
);

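-- Create HNSW index (better speed/recall at query time; slower build, more memory)
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- Alternative: IVFFlat index (faster build, less memory; lower query performance).
-- Create it after the table holds data, since the lists are derived from existing rows.
-- CREATE INDEX ON documents
-- USING ivfflat (embedding vector_cosine_ops)
-- WITH (lists = 100);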

-- Semantic search with metadata filtering
SELECT id, content, metadata,
       1 - (embedding <=> $1::vector) AS similarity
FROM documents
WHERE tenant_id = 'acme-corp'
  AND metadata->>'doc_type' = 'contract'
ORDER BY embedding <=> $1::vector
LIMIT 10;
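The query above relies on `<=>` returning cosine distance, which is why similarity is computed as `1 - distance`. The following sketch reproduces that arithmetic in plain Python and shows one way to format an embedding as the text literal bound to `$1`. Both function names are hypothetical helpers, and raising on zero vectors is a choice made for this sketch rather than pgvector's behaviour.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance: 1 minus the cosine similarity of a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def to_vector_literal(values: Sequence[float]) -> str:
    """Format floats as a bracketed, comma-separated vector literal."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


# Identical direction gives distance 0, orthogonal gives 1, opposite gives 2.
similarity = 1.0 - cosine_distance([1.0, 0.0], [1.0, 0.0])
```

Passing the literal as a bound parameter, rather than interpolating it into the SQL string, keeps the statement safe from injection.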

# Deploy pgvector via Docker
docker run -d \
  --name pgvector \
  -e POSTGRES_PASSWORD=secret \
  -e POSTGRES_DB=vectordb \
  -p 5432:5432 \
  -v pgvector-data:/var/lib/postgresql/data \
  pgvector/pgvector:pg16

Weaviate Deployment

# docker-compose for Weaviate
services:
  weaviate:
    image: semitechnologies/weaviate:latest
    ports:
      - "8080:8080"
      - "50051:50051"
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "false"
      AUTHENTICATION_APIKEY_ENABLED: "true"
      AUTHENTICATION_APIKEY_ALLOWED_KEYS: "${WEAVIATE_API_KEY}"
      AUTHENTICATION_APIKEY_USERS: "admin"
      PERSISTENCE_DATA_PATH: /var/lib/weaviate
      ENABLE_MODULES: text2vec-openai,generative-openai
      OPENAI_APIKEY: "${OPENAI_API_KEY}"
      CLUSTER_HOSTNAME: node1
    volumes:
      - weaviate-data:/var/lib/weaviate
    restart: unless-stopped

volumes:
  weaviate-data:

Backup and Restore

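# Qdrant: snapshot backup; the JSON response contains the generated snapshot name
curl -X POST "http://localhost:6333/collections/documents/snapshots"
# Download snapshot (replace documents-snapshot.snapshot with the name returned above)
curl -O "http://localhost:6333/collections/documents/snapshots/documents-snapshot.snapshot"
# Restore: the recover endpoint uses PUT, and "location" is a URL or a file:// URI
# that the Qdrant server itself can read
curl -X PUT "http://localhost:6333/collections/documents/snapshots/recover" \
  -H "Content-Type: application/json" \
  -d '{"location": "file:///qdrant/snapshots/documents-snapshot.snapshot"}'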

# pgvector — standard pg_dump
pg_dump -h localhost -U postgres -d vectordb \
  --table=documents --format=custom > documents-backup.dump

# Restore
pg_restore -h localhost -U postgres -d vectordb documents-backup.dump
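Because Qdrant generates the snapshot file name, a backup script has to read it from the create-snapshot response before downloading. Below is a small sketch of that step, assuming the response carries the name under `result.name`. The helper names and the sample response are hypothetical, and the HTTP calls themselves are left out.

```python
from typing import Any, Dict
from urllib.parse import quote


def snapshot_name_from_response(response: Dict[str, Any]) -> str:
    """Extract the snapshot file name from a create-snapshot JSON response."""
    return response["result"]["name"]


def snapshot_download_url(base_url: str, collection: str, snapshot_name: str) -> str:
    """Build the download URL for one snapshot of one collection."""
    return (
        f"{base_url.rstrip('/')}/collections/{quote(collection, safe='')}"
        f"/snapshots/{quote(snapshot_name, safe='')}"
    )


# Hypothetical response body, shaped like the JSON the POST call returns.
sample_response = {"result": {"name": "documents-example.snapshot"}, "status": "ok"}
url = snapshot_download_url(
    "http://localhost:6333",
    "documents",
    snapshot_name_from_response(sample_response),
)
```

Copying the downloaded file off the host (object storage, another volume) is what turns the snapshot into a backup; a snapshot left on the same disk does not survive losing that disk.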

Performance Tuning

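import time

from qdrant_client.models import OptimizersConfigDiff

# Qdrant: re-enable indexing after a bulk load.
# indexing_threshold=0 disables HNSW index building (useful while uploading);
# restoring a positive value such as 20000 lets the optimizer build the index.
client.update_collection(
    collection_name="documents",
    optimizers_config=OptimizersConfigDiff(indexing_threshold=20000),
)

# Wait for optimization to complete
while True:
    info = client.get_collection("documents")
    if info.status.value == "green":
        break
    print(f"Optimizing... segments: {info.segments_count}")
    time.sleep(5)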
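The polling loop above never gives up, which can hang a deployment job if the collection stays yellow or turns red. One possible variant is a bounded wait, sketched here with a hypothetical `wait_until` helper. The `sleep` and `clock` parameters are only there so the helper can be exercised without real delays.

```python
import time
from typing import Callable


def wait_until(
    predicate: Callable[[], bool],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `predicate` until it returns True or `timeout_s` elapses.

    Returns True on success and False on timeout. The predicate is always
    evaluated at least once.
    """
    deadline = clock() + timeout_s
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Intended use with the client above (not executed here):
#   ok = wait_until(lambda: client.get_collection("documents").status.value == "green")
```

Returning a boolean instead of raising leaves the caller free to decide whether a timeout should fail the job or only trigger an alert.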

Common Issues

Issue	Cause	Fix
Slow queries	No HNSW index built yet	Wait for indexing; check `status == green`
High RAM usage	Vectors in memory	Enable `on_disk=True` for vectors
Poor recall	Low `ef` search param	Raise it at query time: `hnsw_ef` in the search params (Qdrant) or `hnsw.ef_search` (pgvector)
pgvector slow	Using IVFFlat without vacuum	Run `VACUUM ANALYZE documents`
Weaviate OOM	Too many objects	Enable async indexing; increase heap
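Diagnosing poor recall needs a number to track. A common approach is recall@k: run the same query once through the approximate index and once as an exact (brute-force) search, then measure how many of the exact top-k IDs the index returned. The sketch below assumes both result lists are already available as ID sequences; `recall_at_k` is a hypothetical helper name.

```python
from typing import Hashable, Sequence


def recall_at_k(
    approx_ids: Sequence[Hashable],
    exact_ids: Sequence[Hashable],
    k: int,
) -> float:
    """Fraction of the exact top-k IDs that also appear in the approximate top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(exact_ids[:k])
    if not truth:
        raise ValueError("exact_ids must not be empty")
    hits = len(truth.intersection(approx_ids[:k]))
    return hits / len(truth)


# Two of the three true neighbours were found by the approximate search.
example = recall_at_k(["a", "b", "x"], ["a", "b", "c"], k=3)
```

Averaging this value over a sample of real queries before and after changing `m`, `ef_construct`, or the query-time `ef` shows whether a tuning change actually helped.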

Best Practices

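Cosine and dot product rank results identically on unit-normalized embeddings, so dot product is a safe and often cheaper choice there; use cosine distance when embeddings are not normalized and vector magnitude should not affect ranking.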
Always create payload indexes on filter fields (tenant_id, doc_type).
For datasets >10M vectors, use on_disk vectors + always_ram quantization.
Benchmark with your actual query patterns before choosing IVFFlat vs HNSW.
Snapshot before any bulk delete or migration operation.
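The advice on datasets above 10M vectors follows from simple arithmetic: float32 vectors cost 4 bytes per dimension, while int8-quantized copies cost 1. A back-of-envelope sketch follows, with a hypothetical function name. The 1.5 multiplier is a commonly cited rule of thumb for index and metadata overhead and is an assumption here; payload and the HNSW graph are not modelled separately.

```python
from typing import Dict


def vector_storage_bytes(num_vectors: int, dim: int, overhead: float = 1.5) -> Dict[str, int]:
    """Rough storage estimate for raw and int8-quantized vectors.

    `overhead` is an assumed multiplier, not a measured value.
    """
    float32_bytes = num_vectors * dim * 4   # original float32 vectors
    int8_bytes = num_vectors * dim          # scalar-quantized copies
    return {
        "float32": float32_bytes,
        "float32_with_overhead": int(float32_bytes * overhead),
        "int8": int8_bytes,
    }


estimate = vector_storage_bytes(10_000_000, 1536)
gib = {name: round(size / 2**30, 1) for name, size in estimate.items()}
```

For 10M vectors of 1536 dimensions this gives roughly 57 GiB of raw float32 data against about 14 GiB for the int8 copies, which is why keeping only the quantized vectors in RAM and the originals on disk is attractive at that scale.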

Related Skills

rag-infrastructure - Full RAG pipeline
databases - General database management
postgresql - pgvector host database ops