mongodb-ai-features
Add AI capabilities to a MongoDB app including LLM summarization, structured generation, RAG pipeline with Atlas Vector Search, Voyage AI embeddings, and usage tracking with cost estimation
| name | mongodb-ai-features |
| description | Add AI capabilities to a MongoDB app including LLM summarization, structured generation, RAG pipeline with Atlas Vector Search, Voyage AI embeddings, and usage tracking with cost estimation |
| license | MIT |
| metadata | {"version":"1.0.0","author":"Michael Lynn [mlynn.org](https://mlynn.org)","category":"mongodb-devrel","domain":"artificial-intelligence","updated":"2026-03-01T00:00:00.000Z","python-tools":"embedding_cost_estimator.py, chunk_size_analyzer.py","tech-stack":"openai, voyage-ai, atlas-vector-search, mongoose"} |
Use this skill when adding AI capabilities to a MongoDB-backed app: LLM summarization, structured generation, RAG pipeline (ingest/embed/retrieve/chat), or AI usage tracking with cost estimation.
RAG with MongoDB Atlas Vector Search is my most-requested demo. This skill captures the full pipeline — chunking, embedding, retrieval, chat — so I'm not rebuilding it every time. — ML
Every DevRel demo and sample app includes AI features. This skill provides tested, MongoDB-native AI integration patterns — especially RAG with Atlas Vector Search. It includes a provider-agnostic usage logger, singleton client patterns, and a complete ingest-embed-retrieve-chat pipeline using Voyage AI for embeddings and OpenAI for generation.
Invoke with /mongodb-ai-features or let Claude auto-activate when adding AI/RAG capabilities.
- scripts/embedding_cost_estimator.py: Estimate Voyage/OpenAI costs for doc count + query volume
- scripts/chunk_size_analyzer.py: Analyze markdown files, recommend optimal chunk sizes
- references/model-comparison.md: Voyage vs OpenAI embedding models (cost, dimensions, quality)
- references/vector-search-setup.md: Step-by-step Atlas Vector Search index creation
- assets/vector-search-index.json: Copy-paste Atlas vector search index definition
- assets/sample_rag_document.json: Example RagDocument with embedding

voyage-4-large produces higher-quality embeddings for retrieval. OpenAI handles chat/generation. This is a deliberate two-provider strategy.

src/lib/
├── ai/
│ ├── usage-logger.ts # Fire-and-forget logging with cost estimation
│ ├── summary-service.ts # LLM summarization
│ ├── feedback-service.ts # Multi-source feedback synthesis
│ ├── project-suggestion.ts # Structured idea generation
│ └── embedding-service.ts # OpenAI embeddings (for non-RAG use cases)
├── rag/
│ ├── types.ts # IRagDocument, IRagIngestionRun, ChatMessage interfaces
│ ├── embeddings.ts # Voyage AI embeddings (document + query)
│ ├── ingestion.ts # Markdown → chunks → embeddings pipeline
│ ├── chunker.ts # Document parsing and chunking
│ ├── retrieval.ts # $vectorSearch + category boosting
│ ├── chat.ts # Streaming chat with context injection
│ └── rate-limit.ts # Request throttling
└── db/models/
├── AiUsageLog.ts # Usage tracking model
├── RagDocument.ts # Document + embedding storage
├── RagIngestionRun.ts # Ingestion run tracking
└── RagConversation.ts # Chat session history
// src/lib/ai/usage-logger.ts
import { AiUsageLogModel } from "@/lib/db/models/AiUsageLog";
import { connectToDatabase } from "@/lib/db/connection";
const MODEL_COST_PER_MILLION: Record<string, number> = {
"gpt-4o": 7.5,
"gpt-4-turbo": 20,
"text-embedding-3-small": 0.02,
"voyage-4-large": 0.12,
"voyage-4": 0.1,
"voyage-4-lite": 0.05,
};
function estimateCost(model: string, tokens: number): number {
const rate = MODEL_COST_PER_MILLION[model] ?? 5;
return (tokens / 1_000_000) * rate;
}
interface LogAiUsageParams {
category: string; // "project_summaries" | "judge_feedback" | "rag_chat" | "rag_embeddings" | etc.
provider: string; // "openai" | "voyage"
model: string;
operation: string; // "chat_completion" | "embedding" | "streaming"
tokensUsed: number;
promptTokens?: number;
completionTokens?: number;
durationMs: number;
userId?: string;
eventId?: string;
metadata?: Record<string, unknown>;
error?: boolean;
}
/**
* Fire-and-forget. NEVER throws.
*/
export function logAiUsage(params: LogAiUsageParams): void {
const cost = estimateCost(params.model, params.tokensUsed);
connectToDatabase()
.then(() => AiUsageLogModel.create({
...params,
estimatedCost: cost,
error: params.error ?? false,
}))
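// A rejection is not guaranteed to be an Error instance, so guard before reading .message
.catch((err: unknown) => console.error("[AI Usage Logger] Failed:", err instanceof Error ? err.message : err));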
}
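To make the cost arithmetic concrete, here is a minimal standalone sketch of the same per-million-token calculation. The two rates are copied from the table above; the token counts, the model name `some-new-model`, and the `formatUsd` helper are made up for illustration.

```typescript
// Standalone sketch of the per-million-token estimate used by the logger.
// Rates are copied from MODEL_COST_PER_MILLION above; token counts are hypothetical.
const RATES_PER_MILLION: Record<string, number> = {
  "voyage-4-large": 0.12,
  "gpt-4-turbo": 20,
};
const FALLBACK_RATE = 5; // same default the logger applies to unknown models

function estimateCostUsd(model: string, tokens: number): number {
  const rate = RATES_PER_MILLION[model] ?? FALLBACK_RATE;
  return (tokens / 1_000_000) * rate;
}

// Hypothetical display helper: fixed four decimals.
function formatUsd(amount: number): string {
  return `$${amount.toFixed(4)}`;
}

const embeddingCost = estimateCostUsd("voyage-4-large", 250_000); // 0.25M tokens * 0.12
const summaryCost = estimateCostUsd("gpt-4-turbo", 1_200); // 0.0012M tokens * 20
const unknownCost = estimateCostUsd("some-new-model", 1_000_000); // falls back to 5 per million

console.log(formatUsd(embeddingCost), formatUsd(summaryCost), formatUsd(unknownCost));
// prints: $0.0300 $0.0240 $5.0000
```

Note that the fallback rate silently prices any model missing from the table at 5 per million tokens, so new models should be added to the table before their logged costs are trusted.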
// src/lib/ai/summary-service.ts
import OpenAI from "openai";
import { logAiUsage } from "./usage-logger";
let openai: OpenAI | null = null;
function getOpenAIClient(): OpenAI {
if (!openai) {
openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
}
return openai;
}
export async function generateProjectSummary(project: {
name: string; description: string; technologies: string[]; innovations?: string;
}): Promise<string> {
const client = getOpenAIClient();
const startTime = Date.now();
const response = await client.chat.completions.create({
model: "gpt-4-turbo",
messages: [
{
role: "system",
content: "You are summarizing hackathon projects for judges. Write exactly 2-3 sentences. Focus on what the project does, key technology, and what makes it novel.",
},
{
role: "user",
content: `Project: ${project.name}\nDescription: ${project.description}\nTech: ${project.technologies.join(", ")}${project.innovations ? `\nInnovations: ${project.innovations}` : ""}`,
},
],
max_tokens: 150,
temperature: 0.6,
});
logAiUsage({
category: "project_summaries", provider: "openai", model: response.model,
operation: "chat_completion", tokensUsed: response.usage?.total_tokens || 0,
promptTokens: response.usage?.prompt_tokens, completionTokens: response.usage?.completion_tokens,
durationMs: Date.now() - startTime,
});
// Guard against an empty choices array before reading the message
return response.choices[0]?.message?.content?.trim() || "";
}
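The `getOpenAIClient` function above is a lazy singleton: the client is constructed on first use and reused afterwards. A generic version of that pattern might look like the sketch below. `lazySingleton` and `FakeClient` are hypothetical names, and the fake client stands in for a real SDK client so the example runs without an API key.

```typescript
// Generic lazy-singleton helper: the factory runs at most once, on first use.
function lazySingleton<T>(factory: () => T): () => T {
  let instance: T | undefined;
  let created = false;
  return () => {
    if (!created) {
      instance = factory();
      created = true;
    }
    return instance as T;
  };
}

// Hypothetical stand-in for an SDK client; counts how often it is constructed.
class FakeClient {
  static constructed = 0;
  readonly apiKey: string;
  constructor(apiKey: string) {
    this.apiKey = apiKey;
    FakeClient.constructed += 1;
  }
}

const getClient = lazySingleton(() => new FakeClient("sk-placeholder"));

const first = getClient();
const second = getClient();
console.log(first === second, FakeClient.constructed); // true 1
```

Deferring construction to the first call means that merely importing the module does not require the API key to be present, which can matter for build steps and tests that never call the service.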
// src/lib/rag/embeddings.ts
import { logAiUsage } from "@/lib/ai/usage-logger";
const VOYAGE_API_URL = "https://api.voyageai.com/v1/embeddings";
const VOYAGE_BATCH_SIZE = 128;
const DOCUMENT_MODEL = "voyage-4-large";
const QUERY_MODEL = "voyage-4-large"; // Can downgrade to voyage-4 or voyage-4-lite (same embedding space)
async function callVoyageAPI(texts: string[], inputType: "document" | "query", model: string) {
const response = await fetch(VOYAGE_API_URL, {
method: "POST",
headers: {
"Content-Type": "application/json",
Authorization: `Bearer ${process.env.VOYAGE_API_KEY}`,
},
body: JSON.stringify({ input: texts, model, input_type: inputType }),
});
if (!response.ok) throw new Error(`Voyage API error (${response.status}): ${await response.text()}`);
return response.json();
}
/** Embed document chunks for storage. Handles batching automatically. */
export async function embedDocuments(texts: string[]): Promise<{ embeddings: number[][]; totalTokens: number }> {
if (texts.length === 0) return { embeddings: [], totalTokens: 0 };
const allEmbeddings: number[][] = [];
let totalTokens = 0;
const startTime = Date.now();
for (let i = 0; i < texts.length; i += VOYAGE_BATCH_SIZE) {
const batch = texts.slice(i, i + VOYAGE_BATCH_SIZE);
const response = await callVoyageAPI(batch, "document", DOCUMENT_MODEL);
const sorted = response.data.sort((a: { index: number }, b: { index: number }) => a.index - b.index);
allEmbeddings.push(...sorted.map((d: { embedding: number[] }) => d.embedding));
totalTokens += response.usage.total_tokens;
}
logAiUsage({
category: "rag_embeddings", provider: "voyage", model: DOCUMENT_MODEL,
operation: "embedding", tokensUsed: totalTokens, durationMs: Date.now() - startTime,
metadata: { batchSize: texts.length, inputType: "document" },
});
return { embeddings: allEmbeddings, totalTokens };
}
/** Embed a user query for search. */
export async function embedQuery(text: string): Promise<number[]> {
const startTime = Date.now();
const response = await callVoyageAPI([text], "query", QUERY_MODEL);
logAiUsage({
category: "rag_embeddings", provider: "voyage", model: QUERY_MODEL,
operation: "embedding", tokensUsed: response.usage.total_tokens,
durationMs: Date.now() - startTime, metadata: { inputType: "query" },
});
return response.data[0].embedding;
}
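The loop in `embedDocuments` slices the input into groups of `VOYAGE_BATCH_SIZE` and sends one request per group, in order. The slicing step on its own can be sketched as follows; `toBatches` is a hypothetical helper and the 300 chunk strings are placeholders.

```typescript
// Split an array into consecutive batches of at most `size` items, preserving order.
function toBatches<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// 300 hypothetical chunks with the batch size of 128 used above.
const chunks = Array.from({ length: 300 }, (_, i) => `chunk-${i}`);
const batches = toBatches(chunks, 128);
const batchSizes = batches.map((b) => b.length);
console.log(batchSizes); // [ 128, 128, 44 ]
```

Under these numbers an ingestion of 300 chunks costs three sequential API calls, and the concatenated results line up index-for-index with the input because each batch keeps its original order.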
// src/lib/rag/retrieval.ts
import { connectToDatabase } from "@/lib/db/connection";
import { RagDocumentModel } from "@/lib/db/models/RagDocument";
import { embedQuery } from "./embeddings";
const CATEGORY_BOOST: Record<string, number> = {
events: 1.6, admin: 1.5, "getting-started": 1.4,
features: 1.2, ai: 1.1, docs: 1.0, api: 0.3,
};
export async function retrieveContext(
query: string,
options: { isAuthenticated: boolean; topK?: number; scoreThreshold?: number }
) {
await connectToDatabase();
const topK = options.topK ?? 5;
const queryEmbedding = await embedQuery(query);
const filter: Record<string, unknown> = {};
if (!options.isAuthenticated) filter["accessLevel"] = "public";
// $vectorSearch is an Atlas-specific aggregation stage
const pipeline = [
{
$vectorSearch: {
index: "rag_document_vector",
path: "embedding",
queryVector: queryEmbedding,
numCandidates: topK * 30,
limit: topK * 3,
...(Object.keys(filter).length > 0 ? { filter } : {}),
},
},
{ $project: { content: 1, source: 1, score: { $meta: "vectorSearchScore" } } },
];
const results = await RagDocumentModel.aggregate(pipeline);
// Apply category boosting and re-sort
const boosted = results.map((chunk) => ({
...chunk,
// Chunks with a missing or unknown category keep their raw score (boost 1.0)
score: chunk.score * (CATEGORY_BOOST[chunk.source?.category?.toLowerCase() ?? ""] ?? 1.0),
}));
boosted.sort((a, b) => b.score - a.score);
const topResults = boosted.slice(0, topK).filter((r) => r.score >= (options.scoreThreshold ?? 0.7));
return {
content: topResults.map((c, i) => `[Source ${i + 1}: ${c.source.title}]\n${c.content}`).join("\n\n---\n\n"),
sources: topResults.map((c) => ({ title: c.source.title, url: c.source.url, section: c.source.section, relevanceScore: c.score })),
};
}
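The re-ranking step in `retrieveContext` can be tried in isolation. The sketch below applies the same multiply, sort, slice, and threshold sequence to a few invented results; `boostAndRank`, `ScoredChunk`, and the sample scores are assumptions for illustration, and the boost table is a subset of `CATEGORY_BOOST`.

```typescript
interface ScoredChunk {
  title: string;
  category: string;
  score: number; // raw vectorSearchScore
}

// Subset of the CATEGORY_BOOST table above.
const BOOST: Record<string, number> = { events: 1.6, docs: 1.0, api: 0.3 };

// Multiply by the category boost, sort descending, keep topK, then apply the threshold.
function boostAndRank(chunks: ScoredChunk[], topK: number, threshold: number): ScoredChunk[] {
  return chunks
    .map((c) => ({ ...c, score: c.score * (BOOST[c.category.toLowerCase()] ?? 1.0) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .filter((c) => c.score >= threshold);
}

// Hypothetical raw results: the API reference chunk has the best raw score.
const raw: ScoredChunk[] = [
  { title: "API reference", category: "api", score: 0.82 },
  { title: "Event schedule", category: "events", score: 0.75 },
  { title: "User guide", category: "docs", score: 0.78 },
];

const ranked = boostAndRank(raw, 5, 0.7);
console.log(ranked.map((c) => c.title)); // [ 'Event schedule', 'User guide' ]
```

Because the threshold is applied to the boosted score, a category with a 0.3 multiplier cannot reach 0.7 as long as raw scores stay at or below 1. If `api` chunks should ever surface, the multiplier or the threshold would need adjusting.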
// src/lib/rag/types.ts
import { Types } from "mongoose";
export interface IRagDocument {
content: string;
contentHash: string;
accessLevel: "public" | "authenticated";
source: { filePath: string; title: string; section: string; category: string; url: string; type: "docs" | "event" | "project" | "platform" };
chunk: { index: number; totalChunks: number; tokens: number };
embedding: number[];
ingestion: { runId: string; ingestedAt: Date; ingestedBy: Types.ObjectId; version: number };
}
export interface IRagIngestionRun {
runId: string;
status: "running" | "completed" | "failed" | "cancelled";
stats: {
filesProcessed: number; filesSkipped: number;
chunksCreated: number; chunksDeleted: number;
embeddingsGenerated: number; totalTokens: number;
errors: Array<{ file: string; error: string }>;
};
startedAt: Date; completedAt: Date | null; durationMs: number | null;
triggeredBy: Types.ObjectId;
}
export interface ChatMessage {
role: "user" | "assistant";
content: string;
sources?: { title: string; url: string; section: string; relevanceScore: number }[];
feedback?: "up" | "down";
createdAt: Date;
}
export interface IRagConversation {
sessionId: string;
userId?: Types.ObjectId;
messages: ChatMessage[];
metadata: { page: string; userAgent: string };
}
export type VoyageInputType = "document" | "query";
export interface EmbeddingResult { embeddings: number[][]; totalTokens: number }
export interface IngestionOptions { forceReindex?: boolean; docsPath?: string; triggeredBy: Types.ObjectId }
Create this index on the ragdocuments collection in Atlas:
{
"name": "rag_document_vector",
"type": "vectorSearch",
"definition": {
"fields": [
{
"type": "vector",
"path": "embedding",
"numDimensions": 1024,
"similarity": "cosine"
},
{
"type": "filter",
"path": "accessLevel"
},
{
"type": "filter",
"path": "source.category"
}
]
}
}
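The `numDimensions` value in the index has to agree with the length of the vectors stored in `embedding`. A guard like the one below, run before a chunk is written, is one way to catch a mismatch early (for example after switching embedding models). It is a sketch only: `assertEmbeddingDimensions` is a hypothetical helper, and it assumes the embedding model returns 1024-dimensional vectors as the index definition implies.

```typescript
const INDEX_DIMENSIONS = 1024; // must equal numDimensions in the index definition

// Hypothetical guard to run before writing a chunk to the collection.
function assertEmbeddingDimensions(embedding: number[], expected: number = INDEX_DIMENSIONS): void {
  if (embedding.length !== expected) {
    throw new Error(`Embedding has ${embedding.length} dimensions, index expects ${expected}`);
  }
  if (!embedding.every((v) => Number.isFinite(v))) {
    throw new Error("Embedding contains a non-finite value");
  }
}

const good = new Array<number>(1024).fill(0.01);
assertEmbeddingDimensions(good); // passes silently

let message = "";
try {
  assertEmbeddingDimensions(new Array<number>(512).fill(0.01));
} catch (err) {
  message = err instanceof Error ? err.message : String(err);
}
console.log(message); // Embedding has 512 dimensions, index expects 1024
```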
OPENAI_API_KEY=sk-...
VOYAGE_API_KEY=pa-...
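If `VOYAGE_API_KEY` is unset, the template literal in `callVoyageAPI` sends `Bearer undefined` and the problem only shows up as an API error at request time. A small startup check can fail earlier with a clearer message. This is a sketch under assumptions: `requireEnv` is a hypothetical helper, and a literal object stands in for `process.env` so the example runs anywhere.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: return the trimmed value or fail with a clear message.
function requireEnv(name: string, env: Env): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// In the app this would be process.env; placeholder values are used here.
const fakeEnv: Env = { OPENAI_API_KEY: "sk-placeholder", VOYAGE_API_KEY: "" };

const openaiKey = requireEnv("OPENAI_API_KEY", fakeEnv);
let voyageError = "";
try {
  requireEnv("VOYAGE_API_KEY", fakeEnv);
} catch (err) {
  voyageError = err instanceof Error ? err.message : String(err);
}
console.log(openaiKey, "|", voyageError);
// prints: sk-placeholder | Missing required environment variable: VOYAGE_API_KEY
```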
npm install openai
- Use input_type: "document" for ingestion and input_type: "query" for search. Mixing these degrades retrieval quality.
- Usage logging is fire-and-forget and must never throw: always end the promise chain with .catch().
- Create the vector search index before querying: $vectorSearch fails silently with zero results if the index doesn't exist.
- Use contentHash to avoid re-embedding unchanged documents.
- Use numCandidates: topK * 30 and limit: topK * 3 to get enough candidates for category boosting to be effective.
- Don't mark embedding fields select: false if you need them for search. But DO exclude them from regular queries with select('-embedding') to avoid transferring large vectors.
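The `contentHash` field on `IRagDocument` makes it possible to skip chunks whose text has not changed since the last ingestion run. The source does not show how the hash is computed, so the following is only a sketch: it assumes SHA-256 over whitespace-normalized text, and `contentHash` and `chunksNeedingEmbedding` are hypothetical function names.

```typescript
import { createHash } from "node:crypto";

// SHA-256 of the chunk text. Collapsing whitespace first is an assumption,
// made so that pure formatting changes do not trigger a re-embed.
function contentHash(content: string): string {
  const normalized = content.replace(/\s+/g, " ").trim();
  return createHash("sha256").update(normalized, "utf8").digest("hex");
}

// Return only the chunks whose hash is not already stored.
function chunksNeedingEmbedding(chunks: string[], storedHashes: Set<string>): string[] {
  return chunks.filter((c) => !storedHashes.has(contentHash(c)));
}

// Hypothetical state: one chunk was ingested earlier, with different spacing.
const stored = new Set([contentHash("Atlas Vector Search setup")]);
const incoming = ["Atlas  Vector Search setup\n", "New section on rate limits"];

const toEmbed = chunksNeedingEmbedding(incoming, stored);
console.log(toEmbed); // [ 'New section on rate limits' ]
```

In a real run the set of stored hashes would come from a query over existing `RagDocument` records, and only `toEmbed` would be passed to `embedDocuments`.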