vector-hybrid-search

Guide for building vector search, hybrid search, and using Elasticsearch as a vector database. Covers semantic_text, dense_vector, embedding strategies, hybrid BM25+kNN via RRF, reranking, and production optimization. Use when a developer wants semantic search, hybrid search, kNN, embeddings, or Elasticsearch as a vector store.

Stars: 21,145

Forks: 8,595

Updated: April 10, 2026, 22:19

Source: elastic/kibana

Vector & Hybrid Search Guide

Covers the full lifecycle of vector or hybrid search with Elasticsearch — planning, data modeling, search implementation, and optimization. All API examples use SENSE syntax for Kibana Dev Tools.

Conversation flow — return to onboarding

This skill provides deep implementation detail for vector and hybrid search. It is not the main conversation driver.

After applying the guidance here, re-read /elasticsearch-onboarding to resume the structured onboarding playbook (Steps 1–7: intent → data → mapping → build → test → iterate). That playbook controls sequencing, the one-question-at-a-time rule, and the Dev Tools API-snippet workflow. If /elasticsearch-onboarding has not been loaded yet in this conversation, load it now — it is the primary conversation flow for all Elasticsearch search onboarding.

Decision: Embedding Strategy

Ask these routing questions first:

"Are you already generating embeddings?" → Yes → dense_vector path. Briefly offer semantic_text as a simpler alternative.
"What version of Elasticsearch?" → Below 8.15 → semantic_text unavailable, use dense_vector.

| Option | When to Use |
| --- | --- |
| Built-in via EIS | Default for Cloud (Serverless or ECH) on 8.15+. No ML node cost. Jina v3 is the current default dense model for `semantic_text`. |
| Third-party (OpenAI, Cohere) | Existing model contract or specific model requirement. |
| Self-hosted | Custom fine-tuned models deployed on ML nodes. |

Decision: Vector Field Type

| Option | Field Type | When to Use |
| --- | --- | --- |
| `semantic_text` | `semantic_text` | 8.15+, no existing vectors. Default recommendation: automatic chunking and automatic embedding. |
| `dense_vector` | `dense_vector` | Bringing your own vectors, need dims/similarity control, or pre-8.15. |

`semantic_text` Mapping (Default)

Minimal — works out of the box on Serverless (uses the platform default model, currently Jina):

PUT /my-index
{
  "mappings": {
    "properties": {
      "content": { "type": "semantic_text" },
      "title": { "type": "text" },
      "category": { "type": "keyword" }
    }
  }
}
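
To see the automatic embedding at work, a document can be indexed as plain JSON without supplying any vector. This is a minimal sketch against the mapping above; the field values are made up for illustration:

```console
POST /my-index/_doc
{
  "content": "Index mappings define how documents and their fields are stored and indexed.",
  "title": "Configuring index mappings",
  "category": "docs"
}
```

Elasticsearch chunks and embeds `content` at index time through the inference endpoint attached to the field.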

With a specific inference endpoint:

PUT /my-index
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "inference_id": "my-inference-endpoint"
      }
    }
  }
}
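
The `inference_id` has to point at an inference endpoint that already exists. As a hedged sketch, an OpenAI-backed endpoint could be created like this; the endpoint name matches the mapping above, while the API key placeholder and the model choice are assumptions to replace with your own:

```console
PUT _inference/text_embedding/my-inference-endpoint
{
  "service": "openai",
  "service_settings": {
    "api_key": "<your-api-key>",
    "model_id": "text-embedding-3-small"
  }
}
```

Create the endpoint before creating the index, since the mapping refers to it by ID.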

`dense_vector` Mapping

PUT /my-index
{
  "mappings": {
    "properties": {
      "content": { "type": "text" },
      "content_embedding": {
        "type": "dense_vector",
        "dims": 1536,
        "index": true,
        "similarity": "cosine"
      },
      "category": { "type": "keyword" }
    }
  }
}

Set dims to match the embedding model output (OpenAI text-embedding-3-small = 1536, Jina v3 = 1024, E5-small = 384).

Before generating inference endpoint config, check EIS docs for current model IDs. Jina v3 is the current default dense model for semantic_text; Jina v5-small is available for cost-sensitive workloads. Model IDs change regularly.
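
One way to confirm the right value for `dims` is to embed a sample string and count the values that come back. The sketch below assumes a text embedding endpoint named `my-inference-endpoint` already exists (a hypothetical name reused from the mapping example):

```console
POST _inference/text_embedding/my-inference-endpoint
{
  "input": "how do I configure index mappings"
}
```

The length of the returned embedding array should equal the `dims` value in the `dense_vector` mapping; if it does not, indexing that vector will be rejected.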

Decision: Search Type

| Option | When to Use |
| --- | --- |
| Pure kNN | All queries are semantic/meaning-based, no exact term matching needed. |
| Hybrid (BM25 + kNN via RRF) | Users search with both keywords AND natural language. Default recommendation. |
| Semantic via `semantic_text` | Using `semantic_text` field. Simplest semantic search. |

Semantic Search (`semantic_text`)

POST /my-index/_search
{
  "retriever": {
    "standard": {
      "query": {
        "semantic": {
          "field": "content",
          "query": "how do I configure index mappings"
        }
      }
    }
  }
}

Pure kNN (`dense_vector`)

POST /my-index/_search
{
  "retriever": {
    "knn": {
      "field": "content_embedding",
      "query_vector": [0.1, 0.2, 0.3],
      "k": 10,
      "num_candidates": 100
    }
  }
}
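
The three-value `query_vector` above is only a placeholder; a real query vector must have the same number of dimensions as the field. If the embedding model is reachable from Elasticsearch, the query text can be embedded at search time instead. This is a sketch under that assumption; `my-embedding-model` is a hypothetical model ID, and whether an inference endpoint ID is accepted here depends on your version:

```console
POST /my-index/_search
{
  "retriever": {
    "knn": {
      "field": "content_embedding",
      "query_vector_builder": {
        "text_embedding": {
          "model_id": "my-embedding-model",
          "model_text": "how do I configure index mappings"
        }
      },
      "k": 10,
      "num_candidates": 100
    }
  }
}
```

The model used here must be the same one that produced the indexed vectors, otherwise the similarity scores are meaningless.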

Hybrid Search with RRF

POST /my-index/_search
{
  "retriever": {
    "rrf": {
      "retrievers": [
        {
          "standard": {
            "query": {
              "multi_match": {
                "query": "elasticsearch index mapping",
                "fields": ["title^2", "content"]
              }
            }
          }
        },
        {
          "knn": {
            "field": "content_embedding",
            "query_vector": [0.1, 0.2, 0.3],
            "k": 50,
            "num_candidates": 100
          }
        }
      ],
      "rank_window_size": 100,
      "rank_constant": 60
    }
  }
}
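
The example above assumes a `dense_vector` field. With the `semantic_text` mapping shown earlier, the same RRF pattern could combine a lexical query on `title` with a `semantic` query on `content`, so no query vector is needed. A minimal sketch, assuming that mapping:

```console
POST /my-index/_search
{
  "retriever": {
    "rrf": {
      "retrievers": [
        {
          "standard": {
            "query": {
              "match": { "title": "elasticsearch index mapping" }
            }
          }
        },
        {
          "standard": {
            "query": {
              "semantic": {
                "field": "content",
                "query": "elasticsearch index mapping"
              }
            }
          }
        }
      ],
      "rank_window_size": 100,
      "rank_constant": 60
    }
  }
}
```

RRF only looks at each document's rank in the two result lists, so the BM25 and semantic scores do not need to be on the same scale.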

Add filter clauses to both retrievers for filtered hybrid search.
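
As a sketch of what that looks like, the request below applies the same `term` filter on `category` to both retrievers. The filter value `docs` is an assumed example, and the query vector is still a placeholder:

```console
POST /my-index/_search
{
  "retriever": {
    "rrf": {
      "retrievers": [
        {
          "standard": {
            "query": {
              "multi_match": {
                "query": "elasticsearch index mapping",
                "fields": ["title^2", "content"]
              }
            },
            "filter": { "term": { "category": "docs" } }
          }
        },
        {
          "knn": {
            "field": "content_embedding",
            "query_vector": [0.1, 0.2, 0.3],
            "k": 50,
            "num_candidates": 100,
            "filter": { "term": { "category": "docs" } }
          }
        }
      ],
      "rank_window_size": 100,
      "rank_constant": 60
    }
  }
}
```

Filtering inside the kNN retriever matters: the filter is applied during the vector search, so the retriever can still return up to `k` matching documents rather than filtering a fixed candidate set afterwards.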

Reranking

Start without reranking. Add text_similarity_reranker if relevance isn't good enough after tuning:

POST /my-index/_search
{
  "retriever": {
    "text_similarity_reranker": {
      "retriever": { "rrf": { "retrievers": [ ... ] } },
      "field": "content",
      "inference_id": "my-reranker-endpoint",
      "inference_text": "your query",
      "rank_window_size": 50
    }
  }
}

EIS provides managed rerankers (currently Jina Reranker v2 and v3). Check reranker docs for current setup.
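
If a managed reranker is not available, a third-party one can back the `inference_id` instead. The following is a hedged sketch of a Cohere rerank endpoint; the endpoint name matches the example above, and the API key placeholder and model ID are assumptions to verify against current provider docs:

```console
PUT _inference/rerank/my-reranker-endpoint
{
  "service": "cohere",
  "service_settings": {
    "api_key": "<your-api-key>",
    "model_id": "rerank-english-v3.0"
  }
}
```

Each search sends up to `rank_window_size` documents to the reranker, so that value sets both the latency and the per-query cost.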

Production Optimization

Quantization — reduces vector memory footprint:

| Type | Memory Reduction | Recall Impact |
| --- | --- | --- |
| `hnsw` | Baseline | Baseline |
| `int8_hnsw` | ~4x | Minimal |
| `int4_hnsw` | ~8x | Small |
| `bbq_hnsw` | ~32x | Moderate |
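
The quantization type is set through `index_options` on the `dense_vector` field, so moving an existing index to it means creating a new index and reindexing. A minimal sketch, assuming the earlier `dense_vector` mapping; the target index name `my-index-int8` is hypothetical:

```console
PUT /my-index-int8
{
  "mappings": {
    "properties": {
      "content": { "type": "text" },
      "content_embedding": {
        "type": "dense_vector",
        "dims": 1536,
        "index": true,
        "similarity": "cosine",
        "index_options": { "type": "int8_hnsw" }
      },
      "category": { "type": "keyword" }
    }
  }
}

POST _reindex
{
  "source": { "index": "my-index" },
  "dest": { "index": "my-index-int8" }
}
```

For scale: a 1536-dimension float vector takes 1536 × 4 = 6144 bytes, while its `int8` form takes about 1536 bytes, which is where the roughly 4x figure comes from.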

Shard sizing — target 10-50 GB per shard, max 200M docs per shard.
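
To check an index against these targets, the cat shards API lists document count and store size per shard. A small sketch, assuming the index is named `my-index`:

```console
GET _cat/shards/my-index?v=true&h=index,shard,prirep,docs,store&s=store:desc
```

Rows with `prirep` set to `p` are primaries; compare their `store` and `docs` columns with the per-shard limits above.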

Common Follow-ups

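| Question | Answer |
| --- | --- |
| "Results aren't relevant" | Tune `rank_window_size` and `rank_constant` in RRF. Run `_rank_eval` with test queries. |
| "Memory is too high" | Add `int8_hnsw` quantization to the `dense_vector` mapping and reindex. |
| "How do I weight keyword vs semantic?" | RRF's `rank_window_size` only controls how many results each retriever contributes; it does not favor one retriever over the other. To weight them, use the `linear` retriever with a per-retriever `weight` (available in recent Elasticsearch versions; check the retriever docs for yours). |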

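For the relevance question, a `_rank_eval` request needs a set of test queries with rated documents. The sketch below shows one such query with a precision metric; the request ID, document IDs, and ratings are hypothetical and must come from your own judgments:

```console
GET /my-index/_rank_eval
{
  "requests": [
    {
      "id": "index_mapping_query",
      "request": {
        "query": {
          "match": { "content": "elasticsearch index mapping" }
        }
      },
      "ratings": [
        { "_index": "my-index", "_id": "doc-1", "rating": 3 },
        { "_index": "my-index", "_id": "doc-2", "rating": 0 }
      ]
    }
  ],
  "metric": {
    "precision": {
      "k": 10,
      "relevant_rating_threshold": 1
    }
  }
}
```

Run the same rated set before and after each tuning change so the metric values are comparable.
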