doc-to-vector-dataset-generator

Converts documents into clean, chunked datasets for embeddings and vector search. It produces JSONL files of chunks with metadata, applies deduplication, and runs quality checks. Use it when preparing "training data", "vector datasets", "document processing", or "embedding data".
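
As a rough illustration of the pipeline described above (chunking, metadata, deduplication, a quality check, JSONL output), here is a minimal Python sketch. It is not the skill's actual implementation: the function names (`chunk_text`, `build_records`, `to_jsonl`), the record fields, the fixed-size character windows, the SHA-256 exact-match deduplication, and the minimum-length filter are all assumptions chosen for the example.

```python
import hashlib
import json


def chunk_text(text, max_chars=200, overlap=20):
    """Split text into overlapping character windows (assumed strategy)."""
    if overlap >= max_chars:
        raise ValueError("overlap must be smaller than max_chars")
    step = max_chars - overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + max_chars].strip()
        if piece:
            chunks.append(piece)
        # Stop once this window has reached the end of the text.
        if start + max_chars >= len(text):
            break
    return chunks


def build_records(docs, max_chars=200, overlap=20, min_chars=10):
    """Turn {source_name: raw_text} into a list of JSONL-ready dicts."""
    seen = set()
    records = []
    for source, text in docs.items():
        for index, chunk in enumerate(chunk_text(text, max_chars, overlap)):
            # Normalize whitespace and case so near-identical chunks collide.
            normalized = " ".join(chunk.split()).lower()
            # Quality check (hypothetical): drop chunks that are too short.
            if len(normalized) < min_chars:
                continue
            digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
            # Deduplication: skip chunks whose normalized text was seen before.
            if digest in seen:
                continue
            seen.add(digest)
            records.append({
                "id": digest[:16],
                "source": source,
                "chunk_index": index,
                "text": chunk,
                "char_count": len(chunk),
            })
    return records


def to_jsonl(records):
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

Calling `to_jsonl(build_records({"a.txt": some_text}))` yields one line per surviving chunk. A real pipeline would more likely split on sentence or token boundaries and might use fuzzy deduplication; fixed character windows and exact hashing are used here only to keep the sketch short.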


Install command
npx skills add https://github.com/patricio0312rev/skillset --skill doc-to-vector-dataset-generator

Copy and paste this command into Claude Code to install the skill

Stars: 5
Forks: 0
Updated: December 31, 2025 at 05:05