| name | vector-database-ops |
| description | Deploy, manage, and optimize vector databases for AI applications. Covers Qdrant, Weaviate, pgvector, and Pinecone — collection management, indexing strategies, backup, and performance tuning for production RAG and semantic search workloads. |
| license | MIT |
| metadata | {"author":"devops-skills","version":"1.0"} |
Vector Database Operations
Run production vector databases for AI-powered search, RAG, and recommendation systems.
When to Use This Skill
Use this skill when:
- Setting up a vector database for a RAG or semantic search application
- Choosing between Qdrant, Weaviate, pgvector, or Pinecone
- Managing collections, indexes, and data migrations
- Optimizing query performance and indexing for production loads
- Implementing multi-tenant vector search with namespace isolation
Vector Database Comparison
| Database | Best For | Hosting | Filtering | Scale |
|---|---|---|---|---|
| Qdrant | High-performance, rich filtering, self-hosted | Self / Cloud | Excellent | Very High |
| Weaviate | Schema-first, hybrid search, multi-modal | Self / Cloud | Good | High |
| pgvector | Already on Postgres, simple use cases | Self | Good | Medium |
| Pinecone | Zero-ops managed, serverless | Managed only | Good | Very High |
| Chroma | Local dev, prototyping | Self only | Basic | Low-Medium |
Qdrant — Production Deployment
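# Quick start (development): REST on 6333, gRPC on 6334
docker run -d \
  --name qdrant \
  -p 6333:6333 \
  -p 6334:6334 \
  -v "$(pwd)/qdrant-data:/qdrant/storage" \
  qdrant/qdrant:latest

# Production: mount a custom config. Pin a specific image tag instead of latest
# so that restarts do not silently upgrade the server.
docker run -d \
  --name qdrant \
  -p 6333:6333 \
  -v "$(pwd)/qdrant-data:/qdrant/storage" \
  -v "$(pwd)/qdrant-config.yaml:/qdrant/config/production.yaml" \
  qdrant/qdrant:latest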
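# qdrant-config.yaml
storage:
  storage_path: /qdrant/storage
  on_disk_payload: true
  # Default HNSW parameters; collections can override them at creation time
  hnsw_index:
    m: 16
    ef_construct: 100
    full_scan_threshold: 10000
  # Collection-level defaults. Key placement is assumed here; verify it against
  # the reference config shipped with your Qdrant version.
  collection:
    quantization:
      scalar:
        type: int8
        quantile: 0.99
        always_ram: true
service:
  max_request_size_mb: 32
telemetry_disabled: true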
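To get a feel for what int8 scalar quantization saves in RAM, a rough back-of-the-envelope sketch follows. It assumes 4 bytes per float32 dimension and 1 byte per int8 dimension, and it ignores HNSW graph links, payloads, and per-segment overhead. `estimate_vector_bytes` is a hypothetical helper, not part of Qdrant.

```python
def estimate_vector_bytes(num_vectors: int, dim: int, bytes_per_dim: int) -> int:
    """Raw vector storage only; graph links, payload, and overhead are ignored."""
    return num_vectors * dim * bytes_per_dim


GIB = 1024 ** 3
n, dim = 10_000_000, 1536  # illustrative dataset size and embedding dimension

float32_bytes = estimate_vector_bytes(n, dim, 4)  # full-precision vectors
int8_bytes = estimate_vector_bytes(n, dim, 1)     # scalar-quantized copies

print(f"float32: {float32_bytes / GIB:.1f} GiB")  # float32: 57.2 GiB
print(f"int8:    {int8_bytes / GIB:.1f} GiB")     # int8:    14.3 GiB
```

Under these assumptions, keeping only the quantized copies in RAM and the originals on disk cuts resident vector memory to about a quarter.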
Qdrant Collection Management
from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, VectorParams, HnswConfigDiff,
    ScalarQuantization, ScalarQuantizationConfig, ScalarType,
)

client = QdrantClient("http://localhost:6333")

# Full-precision vectors live on disk; the HNSW graph and int8 copies stay in RAM
client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(
        size=1536,  # must match the embedding model's output dimension
        distance=Distance.COSINE,
        on_disk=True,
    ),
    hnsw_config=HnswConfigDiff(
        m=32,
        ef_construct=200,
        on_disk=False,
    ),
    # QuantizationConfig is a union type; instantiate the ScalarQuantization variant
    quantization_config=ScalarQuantization(
        scalar=ScalarQuantizationConfig(
            type=ScalarType.INT8,
            quantile=0.99,
            always_ram=True,
        )
    ),
)

# Index the payload fields that appear in filters
client.create_payload_index(
    collection_name="documents",
    field_name="tenant_id",
    field_schema="keyword",
)
client.create_payload_index(
    collection_name="documents",
    field_name="created_at",
    field_schema="datetime",
)

# points_count is used because vectors_count is deprecated in recent qdrant-client releases
info = client.get_collection("documents")
print(f"Points: {info.points_count}, Status: {info.status}")
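Once the collection exists, data is usually loaded in batches so that each request stays under the server's request size limit. The following is a minimal sketch of a batching helper in plain Python. `batched` and the sample `points` dictionaries are hypothetical and only illustrate the chunking step, not a Qdrant API.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical points; real vectors would have 1536 dimensions
points = [
    {"id": i, "vector": [0.0] * 4, "payload": {"tenant_id": "acme-corp"}}
    for i in range(10)
]
batches = list(batched(points, 4))
print([len(b) for b in batches])  # [4, 4, 2]
```

Each batch could then be converted to `PointStruct` objects and passed to `client.upsert(collection_name="documents", points=...)`. A suitable batch size depends on vector dimension and payload size.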
Qdrant Filtered Search
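from datetime import datetime, timezone
from qdrant_client.models import Filter, FieldCondition, MatchValue, DatetimeRange

# query_embedding: list of 1536 floats produced by your embedding model
results = client.query_points(
    collection_name="documents",
    query=query_embedding,
    query_filter=Filter(
        must=[
            FieldCondition(key="tenant_id", match=MatchValue(value="acme-corp")),
            FieldCondition(key="doc_type", match=MatchValue(value="contract")),
        ],
        should=[
            # Range accepts only numbers; datetime payloads need DatetimeRange
            FieldCondition(
                key="created_at",
                range=DatetimeRange(gte=datetime(2024, 1, 1, tzinfo=timezone.utc)),
            ),
        ],
    ),
    limit=10,
    with_payload=True,
)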
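How `must` and `should` combine is easy to misread for anyone used to search engines where `should` only boosts scores. The sketch below is a plain-Python model of the semantics as described in Qdrant's filtering documentation: every `must` condition has to hold, and when `should` is non-empty, at least one of its conditions has to hold as well. `matches` and the sample payloads are hypothetical; this is not Qdrant's implementation.

```python
from typing import Any, Callable, Dict, Sequence

Payload = Dict[str, Any]
Condition = Callable[[Payload], bool]


def matches(payload: Payload,
            must: Sequence[Condition] = (),
            should: Sequence[Condition] = ()) -> bool:
    """All of `must`, plus at least one of `should` when any are given."""
    must_ok = all(cond(payload) for cond in must)
    should_ok = len(should) == 0 or any(cond(payload) for cond in should)
    return must_ok and should_ok


must = [
    lambda p: p.get("tenant_id") == "acme-corp",
    lambda p: p.get("doc_type") == "contract",
]
# ISO-8601 date strings compare correctly as plain strings
should = [lambda p: p.get("created_at", "") >= "2024-01-01"]

old_doc = {"tenant_id": "acme-corp", "doc_type": "contract", "created_at": "2023-06-30"}
new_doc = {"tenant_id": "acme-corp", "doc_type": "contract", "created_at": "2024-03-15"}

print(matches(new_doc, must, should))  # True
print(matches(old_doc, must, should))  # False: the should clause excludes it
print(matches(old_doc, must))          # True: without should, only must applies
```

Under this reading, a single `should` condition acts as a hard filter, so a date constraint that is always required reads more clearly inside `must`.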
pgvector — PostgreSQL Extension
-- Enable the extension (once per database)
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    content TEXT NOT NULL,
    embedding VECTOR(1536),
    metadata JSONB DEFAULT '{}',
    tenant_id TEXT NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- HNSW index for cosine distance
CREATE INDEX ON documents
    USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

-- Top 10 nearest neighbors for one tenant; $1 is the query embedding.
-- <=> is cosine distance, so 1 - distance is cosine similarity.
SELECT id, content, metadata,
       1 - (embedding <=> $1::vector) AS similarity
FROM documents
WHERE tenant_id = 'acme-corp'
  AND metadata->>'doc_type' = 'contract'
ORDER BY embedding <=> $1::vector
LIMIT 10;
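The `$1::vector` parameter in the query above expects pgvector's text representation, a bracketed, comma-separated list such as `[0.1,0.2,0.3]`. A minimal sketch of producing that literal from a Python list follows. `to_pgvector_literal` is a hypothetical helper; client libraries with a pgvector adapter may make it unnecessary.

```python
import math
from typing import Sequence


def to_pgvector_literal(embedding: Sequence[float]) -> str:
    """Format an embedding as pgvector text input, e.g. '[0.1,-0.25,3.0]'."""
    if not embedding:
        raise ValueError("embedding must not be empty")
    if any(not math.isfinite(x) for x in embedding):
        # pgvector rejects NaN and infinite components
        raise ValueError("embedding contains NaN or infinity")
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


print(to_pgvector_literal([0.1, -0.25, 3]))  # [0.1,-0.25,3.0]
```

The resulting string is passed as an ordinary bound parameter, so the SQL text never contains interpolated values.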

# Run PostgreSQL 16 with pgvector preinstalled (replace the password before real use)
docker run -d \
  --name pgvector \
  -e POSTGRES_PASSWORD=secret \
  -e POSTGRES_DB=vectordb \
  -p 5432:5432 \
  -v pgvector-data:/var/lib/postgresql/data \
  pgvector/pgvector:pg16
Weaviate Deployment
services:
  weaviate:
    image: semitechnologies/weaviate:latest
    ports:
      - "8080:8080"
      - "50051:50051"
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "false"
      AUTHENTICATION_APIKEY_ENABLED: "true"
      AUTHENTICATION_APIKEY_ALLOWED_KEYS: "${WEAVIATE_API_KEY}"
      AUTHENTICATION_APIKEY_USERS: "admin"
      PERSISTENCE_DATA_PATH: /var/lib/weaviate
      ENABLE_MODULES: text2vec-openai,generative-openai
      OPENAI_APIKEY: "${OPENAI_API_KEY}"
      CLUSTER_HOSTNAME: node1
    volumes:
      - weaviate-data:/var/lib/weaviate
    restart: unless-stopped
volumes:
  weaviate-data:
Backup and Restore
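# Qdrant: create a snapshot; the response contains the generated snapshot name
curl -X POST "http://localhost:6333/collections/documents/snapshots"

# Download it (replace documents-snapshot.snapshot with the name returned above)
curl -O "http://localhost:6333/collections/documents/snapshots/documents-snapshot.snapshot"

# Restore: the recover endpoint uses PUT, and location must be a URL or a file:// URI
curl -X PUT "http://localhost:6333/collections/documents/snapshots/recover" \
  -H "Content-Type: application/json" \
  -d '{"location": "file:///qdrant/snapshots/documents-snapshot.snapshot"}'

# pgvector: dump and restore the documents table
pg_dump -h localhost -U postgres -d vectordb \
  --table=documents --format=custom > documents-backup.dump

pg_restore -h localhost -U postgres -d vectordb documents-backup.dump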
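Snapshots accumulate on disk, so a backup routine usually needs a retention step. The sketch below selects which snapshots to delete while keeping the newest few. It assumes each entry carries `name` and `creation_time` fields, as in the snapshot listing returned by Qdrant, with ISO-8601 timestamps. `snapshots_to_prune` and the sample names are hypothetical.

```python
from typing import Dict, List


def snapshots_to_prune(snapshots: List[Dict[str, str]], keep: int) -> List[str]:
    """Return names of all snapshots except the `keep` most recent ones."""
    if keep < 0:
        raise ValueError("keep must be non-negative")
    # ISO-8601 timestamps in the same timezone sort correctly as strings
    newest_first = sorted(snapshots, key=lambda s: s["creation_time"], reverse=True)
    return [s["name"] for s in newest_first[keep:]]


listing = [
    {"name": "snap-a.snapshot", "creation_time": "2024-01-01T00:00:00"},
    {"name": "snap-c.snapshot", "creation_time": "2024-03-01T00:00:00"},
    {"name": "snap-b.snapshot", "creation_time": "2024-02-01T00:00:00"},
]
print(snapshots_to_prune(listing, keep=2))  # ['snap-a.snapshot']
```

Each returned name could then be removed with a DELETE request against the collection's snapshots endpoint, after confirming that an off-host copy exists.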
Performance Tuning
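from qdrant_client.models import OptimizersConfigDiff

# Before a bulk load: indexing_threshold=0 disables HNSW index building
client.update_collection(
    collection_name="documents",
    optimizers_config=OptimizersConfigDiff(indexing_threshold=0),
)
# After the load: a nonzero threshold (20000 here) re-enables indexing
client.update_collection(
    collection_name="documents",
    optimizers_config=OptimizersConfigDiff(indexing_threshold=20000),
)

import time
# Poll until the optimizer finishes and the collection reports green
while True:
    info = client.get_collection("documents")
    if info.status.value == "green":
        break
    time.sleep(5)
    print(f"Optimizing... segments: {info.segments_count}")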
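The polling loop above has no upper bound, which can hang a deployment script if the collection never turns green. A minimal sketch of a bounded variant follows. `wait_until` is a hypothetical helper that takes any zero-argument check, so it can be tried here with canned statuses instead of a live client.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool],
               timeout_s: float,
               interval_s: float = 5.0) -> bool:
    """Call check() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Canned statuses stand in for client.get_collection(...).status.value
statuses = iter(["yellow", "yellow", "green"])
ready = wait_until(lambda: next(statuses) == "green", timeout_s=1.0, interval_s=0.0)
print(ready)  # True
```

Against a real server, the check could be `lambda: client.get_collection("documents").status.value == "green"`, with a timeout chosen from observed indexing times.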
Common Issues
| Issue | Cause | Fix |
|---|---|---|
| Slow queries | No HNSW index built yet | Wait for indexing; check status == green |
| High RAM usage | Vectors in memory | Enable on_disk=True for vectors |
| Poor recall | Low ef search param | Increase hnsw_ef in the search params at query time (Qdrant) |
| pgvector slow | Using IVFFlat without vacuum | Run VACUUM ANALYZE documents |
| Weaviate OOM | Too many objects | Enable async indexing; increase heap |
Best Practices
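- Use cosine distance when only vector direction matters; for embeddings that are already unit-normalized, dot product gives the same ranking. Reserve dot product on unnormalized vectors for models trained so that magnitude carries meaning.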
- Always create payload indexes on filter fields (tenant_id, doc_type).
- For datasets >10M vectors, use on_disk vectors + always_ram quantization.
- Benchmark with your actual query patterns before choosing IVFFlat vs HNSW.
- Snapshot before any bulk delete or migration operation.
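
On the choice of distance metric, it helps to see how cosine similarity and dot product relate. The sketch below uses two small hand-picked vectors, an illustrative assumption rather than real embeddings. It shows that the two measures differ on raw vectors but coincide once both vectors are scaled to unit length.

```python
import math
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def norm(a: Sequence[float]) -> float:
    return math.sqrt(dot(a, a))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    return dot(a, b) / (norm(a) * norm(b))


def normalize(a: Sequence[float]) -> List[float]:
    n = norm(a)
    return [x / n for x in a]


a, b = [3.0, 4.0], [4.0, 3.0]  # both have length 5

print(dot(a, b))                                  # 24.0
print(round(cosine(a, b), 6))                     # 0.96
print(round(dot(normalize(a), normalize(b)), 6))  # 0.96
```

On unit-length vectors the two metrics produce the same ordering, so the choice between them mainly matters when vector magnitudes vary.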
Related Skills