openai-assistants

Name: openai-assistants
Author: jezweb

Complete guide for OpenAI's Assistants API v2: stateful conversational AI with built-in tools (Code Interpreter, File Search, Function Calling), vector stores for RAG (up to 10,000 files), thread/run lifecycle management, and streaming patterns. Covers both the Node.js SDK and fetch-based approaches. ⚠️ DEPRECATION NOTICE: OpenAI plans to sunset the Assistants API in H1 2026 in favor of the Responses API. This skill remains valuable for existing apps and migration planning. Use when: building stateful chatbots with OpenAI, implementing RAG with vector stores, executing Python code with Code Interpreter, using file search for document Q&A, managing conversation threads, streaming assistant responses, or encountering errors like "thread already has active run", vector store indexing delays, run polling timeouts, or file upload issues. Keywords: openai assistants, assistants api, openai threads, openai runs, code interpreter assistant, file search openai, vector store openai, openai rag, assistant streaming, thread persistence, stateful chatbot, thread already has active run, run status polling, vector store error
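
The "run polling timeouts" case mentioned above can be illustrated with a minimal sketch. It is not taken from the skill itself: `shouldStopPolling`, `nextDelayMs`, `pollRun`, and the injected `fetchStatus` callback are hypothetical names, and the backoff values are arbitrary assumptions. The status strings are the run statuses the Assistants API documents.

```typescript
// Run statuses documented for Assistants API v2 runs.
type RunStatus =
  | "queued" | "in_progress" | "requires_action" | "cancelling"
  | "cancelled" | "failed" | "completed" | "incomplete" | "expired";

// Statuses at which polling should stop: the run has ended, or it is
// waiting on the caller to submit tool outputs (requires_action).
const STOP_STATUSES: ReadonlySet<RunStatus> = new Set<RunStatus>([
  "requires_action", "cancelled", "failed", "completed", "incomplete", "expired",
]);

function shouldStopPolling(status: RunStatus): boolean {
  return STOP_STATUSES.has(status);
}

// Hypothetical backoff schedule (assumption): double the delay on each
// attempt, starting at baseMs and never exceeding capMs.
function nextDelayMs(attempt: number, baseMs = 500, capMs = 5000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Polls until the run reaches a stop status or the timeout elapses.
// fetchStatus is injected, so this sketch does not depend on any
// particular SDK version or HTTP client.
async function pollRun(
  fetchStatus: () => Promise<RunStatus>,
  timeoutMs = 60000,
  sleep: (ms: number) => Promise<void> =
    (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<RunStatus> {
  const deadline = Date.now() + timeoutMs;
  for (let attempt = 0; Date.now() < deadline; attempt++) {
    const status = await fetchStatus();
    if (shouldStopPolling(status)) return status;
    await sleep(nextDelayMs(attempt));
  }
  throw new Error(`Run did not reach a stop status within ${timeoutMs} ms`);
}
```

In a real application, `fetchStatus` would wrap whatever call retrieves the run (SDK or fetch) and return its `status` field. Waiting for the previous run to reach a stop status before creating a new one on the same thread is one way to avoid the "thread already has active run" error.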

Updated: October 29, 2025, 12:54

Repository: jezweb/claude-skills

Related skills

openai-api

jezweb

Complete guide for OpenAI's traditional/stateless APIs: Chat Completions (GPT-5, GPT-4o), Embeddings, Images (DALL-E 3), Audio (Whisper + TTS), and Moderation. Includes both Node.js SDK and fetch-based approaches for maximum compatibility. Use when: integrating OpenAI APIs, implementing chat completions with GPT-5/GPT-4o, generating text with streaming, using function calling/tools, creating structured outputs with JSON schemas, implementing embeddings for RAG, generating images with DALL-E 3, transcribing audio with Whisper, synthesizing speech with TTS, moderating content, deploying to Cloudflare Workers, or encountering errors like rate limits (429), invalid API keys (401), function calling failures, streaming parse errors, embeddings dimension mismatches, or token limit exceeded. Keywords: openai api, chat completions, gpt-5, gpt-5-mini, gpt-5-nano, gpt-4o, gpt-4-turbo, openai sdk, openai streaming, function calling, structured output, json schema, openai embeddings, text-embedding-3, dall-e-3, image generation, whisper api, openai tts, text-to-speech, moderation api, openai fetch, cloudflare workers openai, openai rate limit, openai 429, reasoning_effort, verbosity

14•ai-ml

google-gemini-embeddings

jezweb

This skill provides complete coverage of Google Gemini embeddings API (gemini-embedding-001) for building RAG systems, semantic search, document clustering, and similarity matching. Use when implementing vector search with Google's embedding models, integrating with Cloudflare Vectorize, or building retrieval-augmented generation systems. Covers SDK usage (@google/genai), fetch-based Workers implementation, batch processing, 8 task types (RETRIEVAL_QUERY, RETRIEVAL_DOCUMENT, SEMANTIC_SIMILARITY, etc.), dimension optimization (128-3072), and cosine similarity calculations. Prevents 8+ embedding-specific errors including dimension mismatches, incorrect task types, rate limiting issues (100 RPM free tier), vector normalization mistakes, text truncation (2,048 token limit), and model version confusion. Includes production-ready RAG patterns with Cloudflare Vectorize integration, chunking strategies, and caching patterns. Token savings: ~60%. Production tested. Keywords: gemini embeddings, gemini-embedding-001, google embeddings, semantic search, RAG, vector search, document clustering, similarity search, retrieval augmented generation, vectorize integration, cloudflare vectorize embeddings, 768 dimensions, embed content gemini, batch embeddings, embeddings api, cosine similarity, vector normalization, retrieval query, retrieval document, task types, dimension mismatch, embeddings rate limit, text truncation, @google/genai

14•ai-ml

cloudflare-workers-ai

jezweb

Complete knowledge domain for Cloudflare Workers AI - Run AI models on serverless GPUs across Cloudflare's global network. Use when: implementing AI inference on Workers, running LLM models, generating text/images with AI, configuring Workers AI bindings, implementing AI streaming, using AI Gateway, integrating with embeddings/RAG systems, or encountering "AI_ERROR", rate limit errors, model not found, token limit exceeded, or neurons exceeded errors. Keywords: workers ai, cloudflare ai, ai bindings, llm workers, @cf/meta/llama, workers ai models, ai inference, cloudflare llm, ai streaming, text generation ai, ai embeddings, image generation ai, workers ai rag, ai gateway, llama workers, flux image generation, stable diffusion workers, vision models ai, ai chat completion, AI_ERROR, rate limit ai, model not found, token limit exceeded, neurons exceeded, ai quota exceeded, streaming failed, model unavailable, workers ai hono, ai gateway workers, vercel ai sdk workers, openai compatible workers, workers ai vectorize

14•ai-ml

cloudflare-vectorize

jezweb

Complete knowledge domain for Cloudflare Vectorize - globally distributed vector database for building semantic search, RAG (Retrieval Augmented Generation), and AI-powered applications. Use when: creating vector indexes, inserting embeddings, querying vectors, implementing semantic search, building RAG systems, configuring metadata filtering, working with Workers AI embeddings, integrating with OpenAI embeddings, or encountering metadata index timing errors, dimension mismatches, filter syntax issues, or insert vs upsert confusion. Keywords: vectorize, vector database, vector index, vector search, similarity search, semantic search, nearest neighbor, knn search, ann search, RAG, retrieval augmented generation, chat with data, document search, semantic Q&A, context retrieval, bge-base, @cf/baai/bge-base-en-v1.5, text-embedding-3-small, text-embedding-3-large, Workers AI embeddings, openai embeddings, insert vectors, upsert vectors, query vectors, delete vectors, metadata filtering, namespace filtering, topK search, cosine similarity, euclidean distance, dot product, wrangler vectorize, metadata index, create vectorize index, vectorize dimensions, vectorize metric, vectorize binding

14•ai-ml

openai-responses

jezweb

This skill provides comprehensive knowledge for working with OpenAI's Responses API, the unified stateful API for building agentic applications. It should be used when building AI agents that preserve reasoning across turns, integrating MCP servers for external tools, using built-in tools (Code Interpreter, File Search, Web Search, Image Generation), managing stateful conversations, implementing background processing, or migrating from Chat Completions API. Use when building agentic workflows, conversational AI with memory, tools-based applications, RAG systems, data analysis agents, or any application requiring OpenAI's reasoning models with persistent state. Covers both Node.js SDK and Cloudflare Workers implementations. Keywords: responses api, openai responses, stateful openai, openai mcp, code interpreter openai, file search openai, web search openai, image generation openai, reasoning preservation, agentic workflows, conversation state, background mode, chat completions migration, gpt-5, polymorphic outputs

14•ai-ml

flow-nexus-neural

proffesor-for-testing

Train and deploy neural networks in distributed E2B sandboxes with Flow Nexus

20•ai-ml

AgentDB Vector Search

proffesor-for-testing

Implement semantic vector search with AgentDB for intelligent document retrieval, similarity matching, and context-aware querying. Use when building RAG systems, semantic search engines, or intelligent knowledge bases.

20•ai-ml

context-optimizer

anton-abyzov

Second-pass context optimization that analyzes user prompts and removes irrelevant specs, agents, and skills from loaded context. Achieves 80%+ token reduction through smart cleanup. Activates for optimize context, reduce tokens, clean context, smart context, precision loading.

3•ai-ml