vectorize-search

Stars: 20

Forks: 4

Updated: February 7, 2026 at 18:32

Vectorize semantic search patterns for agent memory. Use when implementing embedding generation, semantic search, memory retrieval, or working with the Vectorize API. Triggers on Vectorize, embeddings, semantic search, vector database, similarity search.

Source

joelhooks

joelhooks/atproto-agent-network

Vectorize Semantic Search

Vectorize provides semantic search over encrypted memories. Embed and index the plaintext before it is encrypted, and store each embedding under its record ID so that search results can be resolved back to the encrypted record in D1.

Security Model

┌─────────────────────────────────────────────────────────────────┐
│                      SEARCH FLOW                                 │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  1. STORE                                                        │
│     Content → Embed (plaintext) → Store vector + record ID       │
│     Content → Encrypt → Store ciphertext in D1                   │
│                                                                  │
│  2. SEARCH                                                       │
│     Query → Embed → Vectorize search → Record IDs                │
│     Record IDs → Fetch from D1 → Decrypt → Return plaintext      │
│                                                                  │
│  ⚠️  Vectorize stores embeddings, NOT content                    │
│  ⚠️  Embeddings can leak semantic information                    │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
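
The store step in the diagram can be sketched as a single function. This is a minimal sketch under stated assumptions, not the repository's implementation: the `Embedder`, `Encryptor`, `VectorStore`, and `RecordStore` interfaces and the `storeMemory` name are hypothetical stand-ins for Workers AI, the encryption layer, Vectorize, and D1.

```typescript
// Hypothetical interfaces standing in for Workers AI, the encryption layer,
// Vectorize, and D1. None of these names come from the Cloudflare APIs.
interface Embedder { embed(text: string): Promise<number[]> }
interface Encryptor { encrypt(plaintext: string): Promise<string> }
interface VectorStore { upsert(id: string, values: number[]): Promise<void> }
interface RecordStore { put(id: string, ciphertext: string): Promise<void> }

interface StoreDeps {
  embedder: Embedder
  encryptor: Encryptor
  vectors: VectorStore
  records: RecordStore
}

// Store step: the embedding is computed from plaintext, while only
// ciphertext reaches the record store. Both are keyed by the same ID.
async function storeMemory(
  deps: StoreDeps,
  id: string,
  plaintext: string
): Promise<void> {
  const [values, ciphertext] = await Promise.all([
    deps.embedder.embed(plaintext),
    deps.encryptor.encrypt(plaintext)
  ])
  // Write the ciphertext first so a vector never points at a missing record.
  await deps.records.put(id, ciphertext)
  await deps.vectors.upsert(id, values)
}
```

Writing the record before the vector is a design choice of this sketch: if the second write fails, the result is an unindexed record rather than a search hit that cannot be fetched.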

Index Configuration

# wrangler.toml
[[vectorize]]
binding = "VECTORIZE"
index_name = "agent-memory"

Create index via CLI:

wrangler vectorize create agent-memory \
  --dimensions=768 \
  --metric=cosine

The index dimensions must match the output size of the embedding model:

@cf/baai/bge-base-en-v1.5: 768
@cf/baai/bge-large-en-v1.5: 1024
OpenAI text-embedding-3-small: 1536
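
Because a mismatch between model and index dimensions only surfaces when vectors are written, it can help to check the length up front. The following is a small sketch; `MODEL_DIMENSIONS` and `assertDimensions` are hypothetical names, and the table only covers the three models listed above.

```typescript
// Output sizes for the models listed above.
const MODEL_DIMENSIONS: Record<string, number> = {
  '@cf/baai/bge-base-en-v1.5': 768,
  '@cf/baai/bge-large-en-v1.5': 1024,
  'text-embedding-3-small': 1536
}

// Throws if an embedding does not match the size expected for the model.
function assertDimensions(model: string, embedding: number[]): void {
  const expected = MODEL_DIMENSIONS[model]
  if (expected === undefined) {
    throw new Error(`Unknown embedding model: ${model}`)
  }
  if (embedding.length !== expected) {
    throw new Error(
      `Expected ${expected} dimensions for ${model}, got ${embedding.length}`
    )
  }
}
```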

Generate Embeddings

Using Cloudflare AI:

async function embed(
  ai: Ai,
  text: string
): Promise<number[]> {
  const result = await ai.run('@cf/baai/bge-base-en-v1.5', {
    text: [text]
  })
  return result.data[0]
}

async function embedBatch(
  ai: Ai,
  texts: string[]
): Promise<number[][]> {
  const result = await ai.run('@cf/baai/bge-base-en-v1.5', {
    text: texts
  })
  return result.data
}
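
For unit tests or local sanity checks on vectors returned by `embed`, the cosine metric chosen for the index can be reproduced in a few lines. This is a plain reference implementation of cosine similarity, not code from Vectorize; `cosineSimilarity` is a hypothetical helper name.

```typescript
// Cosine similarity between two vectors of equal length.
// Returns 0 when either vector has zero magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have equal length')
  }
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}
```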

Index Memory

async function indexMemory(
  vectorize: VectorizeIndex,
  ai: Ai,
  record: {
    id: string
    did: string
    collection: string
    content: MemoryContent  // Plaintext before encryption
  }
): Promise<void> {
  // Create searchable text from content
  const text = extractSearchableText(record.content)
  
  // Generate embedding
  const embedding = await embed(ai, text)
  
  // Upsert to Vectorize
  await vectorize.upsert([{
    id: record.id,
    values: embedding,
    metadata: {
      did: record.did,
      collection: record.collection,
      tags: record.content.tags?.join(','),
      createdAt: record.content.createdAt
    }
  }])
}

function extractSearchableText(content: MemoryContent): string {
  const parts: string[] = []
  
  if (content.summary) parts.push(content.summary)
  if (content.text) parts.push(content.text)
  if (content.tags) parts.push(content.tags.join(' '))
  
  return parts.join('\n')
}
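
In `indexMemory`, the `tags` field is `undefined` whenever a record has no tags. One way to keep the stored metadata tidy is to build it with a helper that omits missing fields. This is a sketch only: `MetadataInput` and `buildMetadata` are hypothetical names, and `createdAt` is assumed to be a string because the source does not show the `MemoryContent` type.

```typescript
// Hypothetical input shape; mirrors the fields used by indexMemory above.
interface MetadataInput {
  did: string
  collection: string
  tags?: string[]
  createdAt?: string // assumed to be a string such as an ISO timestamp
}

// Builds a flat string-valued metadata object, omitting fields that are
// missing or empty instead of storing undefined.
function buildMetadata(input: MetadataInput): Record<string, string> {
  const metadata: Record<string, string> = {
    did: input.did,
    collection: input.collection
  }
  if (input.tags && input.tags.length > 0) {
    metadata.tags = input.tags.join(',')
  }
  if (input.createdAt) {
    metadata.createdAt = input.createdAt
  }
  return metadata
}
```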

Semantic Search

interface SearchOptions {
  did?: string           // Filter by agent
  collection?: string    // Filter by collection
  limit?: number         // Max results (default 10)
  minScore?: number      // Minimum similarity (default 0.5)
}

async function searchMemory(
  vectorize: VectorizeIndex,
  ai: Ai,
  query: string,
  options: SearchOptions = {}
): Promise<VectorizeMatch[]> {
  // Embed query
  const queryEmbedding = await embed(ai, query)
  
  // Build filter
  const filter: VectorizeVectorMetadataFilter = {}
  if (options.did) filter.did = options.did
  if (options.collection) filter.collection = options.collection
  
  // Search
  const results = await vectorize.query(queryEmbedding, {
    topK: options.limit || 10,
    filter,
    returnMetadata: true
  })
  
  // Filter by score
  const minScore = options.minScore ?? 0.5
  return results.matches.filter(m => m.score >= minScore)
}
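
The defaults documented in `SearchOptions` can also be resolved in one place before querying. The sketch below is an illustration, not part of the skill: `resolveSearchOptions` and `maxLimit` are hypothetical, and the default cap of 100 is an assumption to be checked against the current Vectorize limits.

```typescript
interface ResolvedSearchOptions {
  limit: number
  minScore: number
}

// Applies the defaults from SearchOptions and clamps limit to [1, maxLimit].
// maxLimit is a caller-supplied assumption, not a documented Vectorize value.
function resolveSearchOptions(
  options: { limit?: number; minScore?: number },
  maxLimit = 100
): ResolvedSearchOptions {
  const requested = options.limit ?? 10
  const limit = Math.min(Math.max(Math.trunc(requested), 1), maxLimit)
  // ?? keeps an explicit minScore of 0, which || would replace with 0.5.
  const minScore = options.minScore ?? 0.5
  return { limit, minScore }
}
```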

Full Search Flow

async function recallMemories(
  env: Env,
  identity: AgentIdentity,
  query: string,
  options: SearchOptions = {}
): Promise<DecryptedMemory[]> {
  // 1. Semantic search to get record IDs
  const matches = await searchMemory(
    env.VECTORIZE,
    env.AI,
    query,
    { ...options, did: identity.did }
  )
  
  if (matches.length === 0) return []
  
  // 2. Fetch encrypted records from D1
  const ids = matches.map(m => m.id)
  const placeholders = ids.map(() => '?').join(',')
  const rows = await env.DB.prepare(
    `SELECT * FROM records WHERE id IN (${placeholders})`
  ).bind(...ids).all()
  
  // 3. Decrypt each record
  const memories: DecryptedMemory[] = []
  for (const row of rows.results) {
    const record = rowToRecord(row)
    const content = await decryptRecord(record, identity)
    
    const match = matches.find(m => m.id === row.id)
    memories.push({
      id: row.id as string,
      content,
      score: match?.score || 0,
      metadata: match?.metadata
    })
  }
  
  // 4. Sort by score (highest first)
  return memories.sort((a, b) => b.score - a.score)
}
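
The join between D1 rows and Vectorize matches in step 3 can be isolated into a pure function, which makes it easy to test without any bindings. This sketch uses a `Map` lookup instead of `matches.find` and, unlike the code above, drops rows that have no match rather than scoring them 0. The `rankRows` name and the `Match` and `Row` shapes are hypothetical.

```typescript
interface Match { id: string; score: number }
interface Row { id: string }

// Joins rows to their match scores with a Map lookup, then sorts by
// descending score. Rows without a match are dropped.
function rankRows<R extends Row>(
  rows: R[],
  matches: Match[]
): Array<{ row: R; score: number }> {
  const scoreById = new Map(matches.map(m => [m.id, m.score] as const))
  const ranked: Array<{ row: R; score: number }> = []
  for (const row of rows) {
    const score = scoreById.get(row.id)
    if (score !== undefined) ranked.push({ row, score })
  }
  return ranked.sort((a, b) => b.score - a.score)
}
```

Re-sorting after the fetch matters because an `IN (...)` query does not return rows in the order of the bound IDs.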

Batch Indexing

async function batchIndex(
  vectorize: VectorizeIndex,
  ai: Ai,
  records: Array<{ id: string; text: string; metadata: Record<string, string> }>
): Promise<void> {
  // Batch embed (max 100 at a time)
  const batchSize = 100
  
  for (let i = 0; i < records.length; i += batchSize) {
    const batch = records.slice(i, i + batchSize)
    const texts = batch.map(r => r.text)
    
    const embeddings = await embedBatch(ai, texts)
    
    const vectors = batch.map((r, j) => ({
      id: r.id,
      values: embeddings[j],
      metadata: r.metadata
    }))
    
    await vectorize.upsert(vectors)
  }
}
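
The slicing loop in `batchIndex` can be factored into a generic helper and reused for other batched calls. A minimal sketch, with `chunk` as a hypothetical helper name:

```typescript
// Splits items into consecutive arrays of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer')
  }
  const out: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size))
  }
  return out
}
```

With this helper, the outer loop becomes `for (const batch of chunk(records, 100))`.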

Delete from Index

async function deleteFromIndex(
  vectorize: VectorizeIndex,
  ids: string[]
): Promise<void> {
  await vectorize.deleteByIds(ids)
}

Metadata Filtering

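Vectorize supports filtering on metadata. On current indexes, a property is only filterable if a metadata index was created for it (for example with `wrangler vectorize create-metadata-index`) before the vectors were inserted: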

// Filter by multiple conditions
const results = await vectorize.query(embedding, {
  topK: 10,
  filter: {
    did: 'did:cf:abc123',
    collection: 'agent.memory.note'
  }
})

// Note: filterable metadata values must be strings, numbers, or booleans
// Store tags as a comma-separated string: "tag1,tag2,tag3"
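
Since an equality filter compares the whole stored string, a comma-separated `tags` value most likely has to be matched per tag in application code after the query returns. The helpers below sketch that round trip; `encodeTags`, `decodeTags`, and `hasTag` are hypothetical names, not part of the Vectorize API.

```typescript
// Encodes tags as a comma-separated string. Tags containing a comma are
// rejected because they could not be decoded unambiguously.
function encodeTags(tags: string[]): string {
  for (const tag of tags) {
    if (tag.includes(',')) throw new Error(`Tag contains a comma: ${tag}`)
  }
  return tags.join(',')
}

// Decodes a stored tags string; missing or empty values yield [].
function decodeTags(encoded: string | undefined): string[] {
  if (!encoded) return []
  return encoded
    .split(',')
    .map(t => t.trim())
    .filter(t => t.length > 0)
}

// Exact per-tag match, so "ab" does not match the tag "a".
function hasTag(encoded: string | undefined, tag: string): boolean {
  return decodeTags(encoded).includes(tag)
}
```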

Wrangler Configuration

[[vectorize]]
binding = "VECTORIZE"
index_name = "agent-memory"

[ai]
binding = "AI"
