| name | querying-from-seekdb |
| description | Query and export data from seekdb vector database. Supports two search modes: (1) Scalar search - metadata filtering only, (2) Hybrid search - fulltext + semantic search combined. The --query-text parameter is used for BOTH fulltext ($contains) and semantic (query_texts) search simultaneously. Can export results to CSV/Excel. |
| license | MIT |
Query and Export Data from seekdb
Query data from seekdb vector database with support for scalar search, hybrid search (fulltext + semantic), and export to CSV/Excel files.
Path Convention
Note: All paths in this document (e.g., scripts/) are relative to THIS skill directory, not the project root.
Prerequisites
- Python 3.10+ installed
- Data imported into seekdb collection
- Required packages:
pip install pyseekdb pandas openpyxl
⚠️ CRITICAL: Execution Workflow
MUST FOLLOW this workflow when handling user search requests:
Step 1: Get Collection Information (If Not Already Known)
Before constructing any query, you MUST understand the data structure. However, you should cache this information within the conversation.
Caching Rules:
- First query for a collection: Execute --info to get the metadata structure
- Subsequent queries for the SAME collection: Use the cached info from earlier in the conversation and skip --info
- Query for a DIFFERENT collection: Execute --info for the new collection
- User explicitly asks for collection info: Execute --info
python scripts/query_from_seekdb.py <collection_name> --info
This shows:
- Total record count
- Available metadata field names (e.g., source, year, category)
- Sample documents
Example conversation flow:
User: "Find the 2023 tutorials in seekdb_demo"
→ Claude Code: Runs --info (first query for this collection)
→ Finds that the metadata has source and year fields
→ Runs the search
User: "Now find the ones from notion"
→ Claude Code: Does not run --info again (same collection, structure already known)
→ Runs the search directly
User: "Look up the data in another_collection"
→ Claude Code: Runs --info (different collection)
→ Learns the structure of the new collection
→ Runs the search
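The caching rules above can be sketched as a small per-conversation cache. This is only an illustration: `CollectionInfoCache` and the `fetch_info` callable are hypothetical names, and the stand-in fetcher below does not call the real script.

```python
from typing import Callable, Dict


class CollectionInfoCache:
    """Remember --info results per collection for one conversation (hypothetical helper)."""

    def __init__(self, fetch_info: Callable[[str], dict]):
        self._fetch_info = fetch_info      # assumed to wrap a run of `--info`
        self._cache: Dict[str, dict] = {}

    def get(self, collection: str, force_refresh: bool = False) -> dict:
        # Fetch on the first query for a collection, or when the user
        # explicitly asks for collection info again.
        if force_refresh or collection not in self._cache:
            self._cache[collection] = self._fetch_info(collection)
        return self._cache[collection]


calls = []


def fake_fetch(collection: str) -> dict:
    # Stand-in for running `--info`; records each call so the caching is visible.
    calls.append(collection)
    return {"collection": collection, "fields": ["source", "year"]}


cache = CollectionInfoCache(fake_fetch)
cache.get("seekdb_demo")                      # first query: fetches
cache.get("seekdb_demo")                      # same collection: served from cache
cache.get("another_collection")               # different collection: fetches
cache.get("seekdb_demo", force_refresh=True)  # explicit request: fetches again
print(calls)
```

Only three of the four lookups reach the fetcher, which mirrors the rule that `--info` is skipped for repeat queries on the same collection.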
Step 2: Analyze User Request
Parse the user's natural language request to identify:
| Component | Look For | Maps To |
|---|---|---|
| Metadata conditions | Field-value pairs like "from 2023", "from notion", "price < 100" | --where filter |
| Content/Semantic search | Keywords, concepts, descriptions, questions | --query-text (used for BOTH fulltext and semantic) |
Important: --query-text is used for BOTH fulltext search ($contains) and semantic search (query_texts) simultaneously. The same text is used for both.
Step 3: Choose Search Method
User Request Analysis
        │
        ▼
┌───────────────────────────────────────────────────────────┐
│ Does the request involve ONLY metadata field conditions?  │
│ (e.g., "year=2023", "source=notion", no content search)   │
└───────────────────────────────────────────────────────────┘
        │
        ├── YES ──► Scalar Search: --where only
        │
        └── NO ───► Does it involve content/semantic search?
                    │
                    ├── YES (no metadata) ──► Hybrid Search: --query-text only
                    │
                    └── YES (with metadata) ──► Scalar + Hybrid: --where + --query-text
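The decision tree can also be written as a small function. This is a minimal sketch: `choose_search_mode` and the returned mode labels are hypothetical names used for illustration, not part of the script.

```python
from typing import Optional


def choose_search_mode(where: Optional[dict], query_text: Optional[str]) -> str:
    """Pick a search mode from the parsed request (hypothetical helper)."""
    has_where = bool(where)
    has_text = bool(query_text and query_text.strip())
    if has_where and has_text:
        return "scalar+hybrid"   # --where + --query-text
    if has_text:
        return "hybrid"          # --query-text only
    if has_where:
        return "scalar"          # --where only
    raise ValueError("Request has neither metadata conditions nor search text")


print(choose_search_mode({"year": 2023}, None))
print(choose_search_mode(None, "seekdb tutorial"))
print(choose_search_mode({"year": 2023}, "seekdb tutorial"))
```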
Search Modes
Mode 1: Scalar Search (Metadata Only)
When to use: User wants to filter by metadata fields ONLY, no content/semantic search needed.
python scripts/query_from_seekdb.py seekdb_demo --where '{"source": "notion", "year": 2023}'
Example requests:
- "Find all documents from notion"
- "Show the records from 2023"
- "Data whose source is google-docs"
Mode 2: Hybrid Search (Fulltext + Semantic)
When to use: User wants to search by content - the query text is used for BOTH fulltext matching AND semantic similarity.
python scripts/query_from_seekdb.py seekdb_demo --query-text "seekdb tutorial"
How it works:
--query-text "seekdb tutorial" → Fulltext: where_document: {"$contains": "seekdb tutorial"} + Semantic: query_texts: "seekdb tutorial"
- Results are ranked using RRF (Reciprocal Rank Fusion)
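For readers unfamiliar with RRF, the following sketch shows the general technique of fusing a fulltext ranking and a semantic ranking. It illustrates the standard formula only, not seekdb's internal implementation; the function name, the document ids, and the constant `k=60` (a commonly used default) are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked id lists: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


fulltext_ranking = ["doc_a", "doc_b", "doc_c"]   # hypothetical fulltext hits
semantic_ranking = ["doc_b", "doc_d", "doc_a"]   # hypothetical semantic hits
fused = reciprocal_rank_fusion([fulltext_ranking, semantic_ranking])
print(fused)
```

Documents that appear in both rankings accumulate two terms, so they tend to rise above documents found by only one of the two searches.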
Example requests:
- "Find seekdb tutorials" → --query-text "seekdb tutorial"
- "Search for python technical docs" → --query-text "python technical docs"
Mode 3: Scalar + Hybrid Search
When to use: User wants metadata filtering + content/semantic search.
python scripts/query_from_seekdb.py seekdb_demo --query-text "seekdb tutorial" --where '{"year": 2023}'
Example requests:
- "Find the seekdb tutorials written in 2023 in the seekdb_demo collection" → --query-text "seekdb tutorial" --where '{"year": 2023}'
- "Find programming guides from notion" → --query-text "programming guide" --where '{"source": "notion"}'
🎯 Real-World Example Analysis
User request: "Find the seekdb tutorials written in 2023 in the seekdb_demo collection"
Step 1: Run --info to get metadata structure:
python scripts/query_from_seekdb.py seekdb_demo --info
Step 2: Analyze request:
| Part | Type | Filter |
|---|---|---|
| "in 2023" | Metadata field year | --where '{"year": 2023}' |
| "seekdb tutorials" | Content/Semantic search | --query-text "seekdb tutorial" |
Step 3: Execute:
python scripts/query_from_seekdb.py seekdb_demo --query-text "seekdb tutorial" --where '{"year": 2023}'
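When the command is assembled programmatically, serializing the filter with `json.dumps` avoids hand-written quoting mistakes. The sketch below is an assumption about how a caller might do this; `build_query_command` is a hypothetical helper, and it uses only the flags documented in this skill.

```python
import json
import shlex
from typing import List, Optional


def build_query_command(
    collection: str,
    where: Optional[dict] = None,
    query_text: Optional[str] = None,
    n_results: Optional[int] = None,
    output: Optional[str] = None,
) -> List[str]:
    """Build an argv list for scripts/query_from_seekdb.py (hypothetical helper)."""
    argv = ["python", "scripts/query_from_seekdb.py", collection]
    if query_text:
        argv += ["--query-text", query_text]
    if where:
        # ensure_ascii=False keeps non-ASCII filter values readable.
        argv += ["--where", json.dumps(where, ensure_ascii=False)]
    if n_results is not None:
        argv += ["--n-results", str(n_results)]
    if output:
        argv += ["--output", output]
    return argv


argv = build_query_command("seekdb_demo", where={"year": 2023}, query_text="seekdb tutorial")
# shlex.join produces a shell-safe string for display or logging.
print(shlex.join(argv))
```

Passing the list form to `subprocess.run(argv)` sidesteps shell quoting entirely; the joined string is mainly useful for showing the user what will be executed.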
CLI Reference
Commands
python scripts/query_from_seekdb.py --list-collections
python scripts/query_from_seekdb.py <collection_name> --info
python scripts/query_from_seekdb.py <collection_name> --where '<json_filter>'
python scripts/query_from_seekdb.py <collection_name> --query-text "<text>" [-n <count>]
python scripts/query_from_seekdb.py <collection_name> --query-text "<text>" --where '<json>'
python scripts/query_from_seekdb.py <collection_name> <search_options> --output results.csv
python scripts/query_from_seekdb.py <collection_name> <search_options> --output results.xlsx
Options
| Option | Short | Description |
|---|---|---|
| --query-text | -q | Text for hybrid search (fulltext + semantic) |
| --where | -w | Metadata filter as JSON string |
| --n-results | -n | Number of results (default: 5) |
| --output | -o | Export to file (.csv or .xlsx) |
| --json | -j | Output as JSON |
| --info | | Show collection info |
| --list-collections | -l | List all collections |
| --include | | Fields to include: documents,metadatas,embeddings |
| --sheet-name | -s | Sheet name for Excel export |
Filter Operators
How to Construct --where Parameter
Step 1: Run --info to see available metadata fields:
python scripts/query_from_seekdb.py seekdb_demo --info
Step 2: Use the metadata field names to construct --where:
--where '{"source": "notion"}'
--where '{"year": 2023}'
--where '{"source": "notion", "year": 2023}'
Step 3: Match user request to metadata fields:
| User says | Metadata field | --where value |
|---|---|---|
| "from 2023" | year | '{"year": 2023}' |
| "from notion" | source | '{"source": "notion"}' |
| "priced below 100" | price | '{"price": {"$lt": 100}}' |
| "brand is Samsung or Apple" | brand | '{"brand": {"$in": ["Samsung", "Apple"]}}' |
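Before running a query, it can help to check that every field used in a filter actually appears in the `--info` output. The following is a minimal sketch of such a check; `fields_in_filter` and `unknown_fields` are hypothetical helpers, and the known-field list is assumed to come from an earlier `--info` run.

```python
from typing import Iterable, Set

LOGICAL_OPERATORS = {"$and", "$or"}


def fields_in_filter(where: dict) -> Set[str]:
    """Collect the metadata field names referenced anywhere in a --where filter."""
    fields: Set[str] = set()
    for key, value in where.items():
        if key in LOGICAL_OPERATORS:
            # $and / $or hold a list of nested filters.
            for clause in value:
                fields |= fields_in_filter(clause)
        else:
            fields.add(key)
    return fields


def unknown_fields(where: dict, known_fields: Iterable[str]) -> Set[str]:
    """Return the fields in the filter that the collection does not have."""
    return fields_in_filter(where) - set(known_fields)


known = ["source", "year"]  # assumed to be taken from --info output
candidate = {"$and": [{"year": 2023}, {"$or": [{"source": "notion"}, {"brand": "Apple"}]}]}
print(unknown_fields(candidate, known))
```

A non-empty result suggests the request mentions a field that the collection does not store, so the filter should be revised before querying.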
Metadata Filter Operators
| Operator | Description | Example |
|---|---|---|
| $eq | Equal to | {"year": {"$eq": 2023}} or {"year": 2023} |
| $ne | Not equal to | {"status": {"$ne": "deleted"}} |
| $gt | Greater than | {"score": {"$gt": 90}} |
| $gte | Greater than or equal | {"score": {"$gte": 90}} |
| $lt | Less than | {"score": {"$lt": 50}} |
| $lte | Less than or equal | {"score": {"$lte": 50}} |
| $in | In list | {"tag": {"$in": ["ml", "ai"]}} |
| $nin | Not in list | {"tag": {"$nin": ["old"]}} |
| $and | Logical AND | {"$and": [{"year": 2023}, {"source": "notion"}]} |
| $or | Logical OR | {"$or": [{"year": 2023}, {"year": 2024}]} |
Complex Filter Examples
--where '{"source": "notion", "year": 2023}'
--where '{"$and": [{"source": "notion"}, {"year": {"$gte": 2023}}]}'
--where '{"$or": [{"source": "notion"}, {"source": "google-docs"}]}'
--where '{"$and": [{"year": {"$gte": 2022}}, {"year": {"$lte": 2024}}]}'
--where '{"$and": [{"year": 2023}, {"$or": [{"source": "notion"}, {"source": "obsidian"}]}]}'
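To make the operator semantics concrete, the sketch below evaluates a filter against a single metadata dictionary in plain Python. It mirrors the operator table as documented here and is not seekdb's implementation; `matches` is a hypothetical helper, and treating a missing field as a non-match is an assumption.

```python
import operator
from typing import Any, Callable, Dict

COMPARATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, options: value in options,
    "$nin": lambda value, options: value not in options,
}


def matches(metadata: dict, where: dict) -> bool:
    """Return True if the metadata satisfies every condition in the filter."""
    for key, condition in where.items():
        if key == "$and":
            if not all(matches(metadata, clause) for clause in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, clause) for clause in condition):
                return False
        elif isinstance(condition, dict):
            if key not in metadata:
                return False  # assumption: a missing field never matches
            value = metadata[key]
            if not all(COMPARATORS[op](value, operand) for op, operand in condition.items()):
                return False
        elif metadata.get(key) != condition:
            return False  # bare value is shorthand for $eq
    return True


record = {"source": "notion", "year": 2023}
nested = {"$and": [{"year": 2023}, {"$or": [{"source": "notion"}, {"source": "obsidian"}]}]}
print(matches(record, nested))
print(matches(record, {"year": {"$gte": 2024}}))
```

Multiple top-level keys, as in `{"source": "notion", "year": 2023}`, are treated as an implicit AND, which matches how the first complex example is described.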
Export to CSV/Excel
python scripts/query_from_seekdb.py mobiles --where '{"Brand": "SAMSUNG"}' --output samsung.csv
python scripts/query_from_seekdb.py mobiles --query-text "good camera" --output results.xlsx
python scripts/query_from_seekdb.py mobiles --query-text "phone" --output phones.xlsx --sheet-name "Search Results"
Supported Export Formats
| Format | Extension | Description |
|---|---|---|
| CSV | .csv | Comma-separated values, UTF-8 encoded with BOM |
| Excel | .xlsx | Excel workbook format |
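The BOM matters because spreadsheet applications such as Excel use it to detect UTF-8 when opening a CSV file. The sketch below shows one way to produce such a file with the standard library's `utf-8-sig` codec; `export_rows_to_csv` and the sample rows are hypothetical, and the actual script may export differently (for example via pandas).

```python
import csv
import os
import tempfile
from typing import Dict, List


def export_rows_to_csv(rows: List[Dict[str, object]], path: str) -> None:
    """Write result rows to a UTF-8 CSV with BOM (hypothetical helper)."""
    # Build the header from all keys, preserving first-seen order.
    fieldnames: List[str] = []
    for row in rows:
        for name in row:
            if name not in fieldnames:
                fieldnames.append(name)
    # "utf-8-sig" prepends the byte order mark EF BB BF.
    with open(path, "w", newline="", encoding="utf-8-sig") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)  # missing keys are written as empty cells


rows = [
    {"id": "1", "document": "Galaxy phone", "Brand": "SAMSUNG"},
    {"id": "2", "document": "Budget phone", "Brand": "OTHER", "price": 99},
]
csv_path = os.path.join(tempfile.mkdtemp(), "results.csv")
export_rows_to_csv(rows, csv_path)

with open(csv_path, "rb") as handle:
    starts_with_bom = handle.read(3) == b"\xef\xbb\xbf"
print(starts_with_bom)
```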
Data Structure in seekdb
seekdb stores data in two distinct locations:
| Storage | Description | Filter Method | Example |
|---|---|---|---|
| Metadata | Structured key-value fields | --where | {"source": "notion", "year": 2023} |
| Document | Text content | --query-text (hybrid search) | Fulltext + Semantic search |
Connection Configuration
Set environment variables for server mode:
| Variable | Description | Default |
|---|---|---|
| SEEKDB_HOST | Server host (if set, uses server mode) | - |
| SEEKDB_PORT | Server port | 2881 |
| SEEKDB_DATABASE | Database name | test |
| SEEKDB_USER | Username | root |
| SEEKDB_PASSWORD | Password | - |
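The table can be read as the following lookup logic. This is a sketch under stated assumptions: `load_connection_config` and the keys of the returned dictionary are hypothetical, and only the variable names and defaults come from the table above.

```python
import os
from typing import Mapping, Optional


def load_connection_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve connection settings from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    host = env.get("SEEKDB_HOST")
    return {
        "server_mode": bool(host),                      # server mode only when a host is set
        "host": host,
        "port": int(env.get("SEEKDB_PORT", "2881")),
        "database": env.get("SEEKDB_DATABASE", "test"),
        "user": env.get("SEEKDB_USER", "root"),
        "password": env.get("SEEKDB_PASSWORD"),         # no default
    }


# Passing an explicit mapping keeps the example independent of the real environment.
config = load_connection_config({"SEEKDB_HOST": "127.0.0.1", "SEEKDB_PASSWORD": "secret"})
print(config["server_mode"], config["port"], config["database"], config["user"])
```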