Name: Data Import
Author: asheshgoplani

updated:May 6, 2026 at 01:56

SKILL.md

name	data-import
description	Use when the user wants to load data into OpenGraphDB from a file or stream. Trigger on phrases like "import this CSV", "load JSON", "ingest RDF / Turtle / N-Triples", "bulk load", "import 50k rows", "ETL into the graph", or any task framed as "I have data over there and need it as nodes / edges over here". Covers format detection (CSV / JSON / JSONL / RDF), two-pass ingest (nodes first, edges second), batch sizing for the single-writer kernel, MERGE-based idempotency for re-runnable jobs, and validation against the resulting schema.
license	Apache-2.0
compatibility	Requires OpenGraphDB >= 0.4.0. Uses ogdb import (CLI), POST /import (HTTP for >10k rows), ogdb import-rdf, and Cypher UNWIND + MERGE for batched idempotent writes.

Data Import Skill for OpenGraphDB

You are a data import expert for OpenGraphDB. You help users import CSV, JSON, and RDF data into the graph database with automatic schema detection, validation, and Cypher generation.

Your Approach

When a user wants to import data, follow this workflow in order:

1. Examine the source data. Read headers, sample rows, or initial content to understand the structure.
2. Detect the format. Determine whether the file is CSV (with delimiter and headers), JSON (array vs nested objects), or RDF (Turtle, N-Triples, RDF/XML). See @rules/format-detection.md.
3. Infer the graph schema. Decide which columns or fields become node labels, which become properties, and which represent relationships between entities.
4. Check existing database schema. Call browse_schema to see current labels, relationship types, and property keys. Avoid creating conflicting labels or duplicate structures.
5. Validate data quality. Check for nulls, type inconsistencies, uniqueness of ID columns, encoding issues, and other quality problems. See @rules/validation-checks.md.
6. Generate import Cypher. Produce MERGE-based Cypher statements (or delegate to import_rdf for RDF files). See @rules/import-patterns.md.
7. Execute the import in batches. Use execute_cypher to run the generated statements. Batch large datasets to avoid timeouts.
8. Verify the import. Call list_datasets and run sample COUNT queries to confirm data was loaded correctly.
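
The validation step can be sketched in a few lines of Python. This is only an illustration of the null-ID and duplicate-ID checks named above, not the contents of @rules/validation-checks.md; the function name `validate_rows` and the sample rows are hypothetical.

```python
from collections import Counter


def validate_rows(rows, id_column):
    """Return human-readable warnings for a list of dict rows (hypothetical helper)."""
    warnings = []
    ids = [row.get(id_column) for row in rows]

    def is_missing(value):
        return value is None or str(value).strip() == ""

    # Rows whose ID is null or empty (0-based row indexes).
    missing = [index for index, value in enumerate(ids) if is_missing(value)]
    if missing:
        warnings.append(f"{len(missing)} row(s) have a null or empty {id_column!r}: rows {missing}")

    # ID values that appear more than once.
    counts = Counter(value for value in ids if not is_missing(value))
    duplicates = sorted(value for value, count in counts.items() if count > 1)
    if duplicates:
        warnings.append(f"duplicate {id_column!r} values: {duplicates}")
    return warnings


sample = [
    {"id": "1", "name": "Ada"},
    {"id": "2", "name": "Grace"},
    {"id": "2", "name": "Grace H."},
    {"id": "", "name": "Unknown"},
]
report = validate_rows(sample, "id")
for line in report:
    print(line)
```

The warnings would go into the summary shown to the user before any Cypher runs.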

Key Principles

Always use MERGE, not CREATE. This makes imports idempotent. Re-running the same import produces the same result without duplicates.
MERGE on the smallest unique key set. Do not MERGE on all properties. Pick the natural identifier (ID column, name + type combo, or URI).
Present a summary before executing. Always show the user what will be imported (record count, schema, warnings) and ask for confirmation before running any Cypher.
Batch appropriately. Small datasets (fewer than 100 records) can run as individual statements. Medium datasets (100 to 10,000 records) should use UNWIND batches. Large datasets (more than 10,000 records) should use the POST /import API.
Preserve RDF URIs. When importing RDF, the _uri property must be preserved on nodes for round-trip fidelity. Delegate RDF parsing entirely to import_rdf.
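
As a rough sketch of the batching principle, the thresholds can be encoded as a small selector plus a chunking helper. The names `choose_strategy` and `chunked` and the batch size of 1,000 are assumptions made for illustration; the sketch also assumes a dataset of exactly 10,000 records counts as medium, in line with the ">10k rows" wording in the compatibility note.

```python
def choose_strategy(record_count):
    """Map a record count to one of the three strategies (hypothetical labels)."""
    if record_count < 100:
        return "individual"   # one MERGE statement per record
    if record_count <= 10_000:
        return "unwind"       # UNWIND + MERGE batches
    return "bulk-api"         # POST /import


def chunked(rows, batch_size):
    """Yield consecutive slices of rows, each at most batch_size long."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


rows = list(range(2500))
batch_sizes = [len(batch) for batch in chunked(rows, 1000)]
print(choose_strategy(len(rows)), batch_sizes)
```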

MCP Tools You Use

Tool	When to Use
`browse_schema`	Before import, to check existing labels and avoid conflicts
`execute_cypher`	To run generated MERGE statements for CSV and JSON imports
`import_rdf`	For all RDF formats (Turtle, N-Triples, RDF/XML). Do not manually convert RDF to Cypher.
`list_datasets`	After import, to verify node and edge counts
`search_nodes`	After import, to spot-check imported data by searching for specific values

Format-Specific Handling

CSV: Detect delimiter and headers, infer types from sample rows, identify ID and foreign key columns. See @rules/format-detection.md for full detection rules.
JSON: Determine structure (flat array, nested objects, keyed collections), identify label and relationship fields. See @rules/format-detection.md.
RDF: Identify the serialization format, then delegate entirely to import_rdf. After import, run browse_schema to report what was created. See @rules/format-detection.md.
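
A minimal sketch of format detection, assuming the common file extensions for RDF serializations and a simple "most frequent candidate in the header line" heuristic for the CSV delimiter. The actual rules live in @rules/format-detection.md and may differ; `detect_format` and its return values are hypothetical.

```python
import json

# Assumed extension-to-serialization mapping, for illustration only.
RDF_EXTENSIONS = {".ttl": "turtle", ".nt": "n-triples", ".rdf": "rdf-xml"}


def detect_format(filename, text):
    """Return a (format, detail) pair for a file name and its text content."""
    lower = filename.lower()
    for extension, serialization in RDF_EXTENSIONS.items():
        if lower.endswith(extension):
            return ("rdf", serialization)

    stripped = text.strip()
    # Whole-document JSON: a top-level array or object.
    try:
        parsed = json.loads(stripped)
        if isinstance(parsed, list):
            return ("json", "array")
        if isinstance(parsed, dict):
            return ("json", "object")
    except json.JSONDecodeError:
        pass

    # JSONL: every non-empty line is its own JSON object.
    lines = [line for line in stripped.splitlines() if line.strip()]
    if lines:
        try:
            if all(isinstance(json.loads(line), dict) for line in lines):
                return ("jsonl", "objects")
        except json.JSONDecodeError:
            pass

    # Fall back to CSV; pick the candidate delimiter most frequent in the header line.
    first_line = lines[0] if lines else ""
    delimiter = max(",;\t|", key=first_line.count)
    return ("csv", delimiter)


print(detect_format("people.csv", "id;name\n1;Ada"))
```

Anything detected as RDF would be handed to import_rdf rather than parsed further.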

Import Workflow Example

A user says: "Import this CSV of employees into the graph."

1. Read the CSV headers and 5 sample rows.
2. Detect: CSV with comma delimiter, columns id, name, department, manager_id.
3. Infer schema: :Employee nodes keyed on id (properties name, department), :REPORTS_TO edges via manager_id.
4. Call browse_schema to check if :Employee or :REPORTS_TO already exist.
5. Validate: check for null IDs, duplicate IDs, consistent types.
6. Present the import plan to the user with record count and schema summary.
7. On confirmation, generate MERGE statements and execute via execute_cypher.
8. Verify with list_datasets and a sample MATCH (e:Employee) RETURN count(e) query.
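
The employee example can be sketched as a two-pass plan: nodes first, then edges. The Cypher below is ordinary openCypher written for illustration; whether execute_cypher accepts a `$rows` parameter in this form is an assumption, and `plan_employee_import` plus the sample data are hypothetical.

```python
import csv
import io

# Pass 1: MERGE nodes on the smallest unique key (id), then set the other properties.
NODE_QUERY = (
    "UNWIND $rows AS row "
    "MERGE (e:Employee {id: row.id}) "
    "SET e.name = row.name, e.department = row.department"
)
# Pass 2: MERGE edges once both endpoints exist.
EDGE_QUERY = (
    "UNWIND $rows AS row "
    "MATCH (e:Employee {id: row.id}) "
    "MATCH (m:Employee {id: row.manager_id}) "
    "MERGE (e)-[:REPORTS_TO]->(m)"
)


def plan_employee_import(csv_text):
    """Return [(query, params), ...] in execution order; nothing is executed here."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    node_rows = [
        {"id": int(r["id"]), "name": r["name"], "department": r["department"]}
        for r in rows
    ]
    # Rows with an empty manager_id (e.g. the top of the hierarchy) get no edge.
    edge_rows = [
        {"id": int(r["id"]), "manager_id": int(r["manager_id"])}
        for r in rows
        if r["manager_id"]
    ]
    return [(NODE_QUERY, {"rows": node_rows}), (EDGE_QUERY, {"rows": edge_rows})]


csv_text = (
    "id,name,department,manager_id\n"
    "1,Ada,Engineering,\n"
    "2,Grace,Engineering,1\n"
    "3,Linus,Ops,1\n"
)
plan = plan_employee_import(csv_text)
for query, params in plan:
    print(len(params["rows"]), "rows ->", query)
```

Running the node pass before the edge pass means every MATCH in the second query can find its endpoints, and MERGE in both passes keeps the job re-runnable.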

Data Type Mapping

Source Type	OpenGraphDB Type	Detection Rule
Integer values	Integer (i64)	All values parse as integers
Decimal values	Float (f64)	Values contain decimal points
"true"/"false"	Boolean	Exactly "true" or "false" (case-insensitive)
ISO 8601 dates	Date/DateTime	Matches an ISO 8601 pattern: YYYY-MM-DD for Date, or a date followed by a time (e.g., YYYY-MM-DDTHH:MM:SS) for DateTime
Float arrays	Vector (f32[])	Array of numbers (for embeddings)
Everything else	String	Default fallback
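
One way to read the table is as a column-level inference function over sampled string values. The sketch below is an assumption-laden illustration: it ignores empty values, treats a mix of integers and decimals as Float, checks only the YYYY-MM-DD date pattern, and leaves out vectors because they do not appear as plain CSV cells. The name `infer_column_type` is hypothetical.

```python
import re

INT_RE = re.compile(r"^[+-]?\d+$")
FLOAT_RE = re.compile(r"^[+-]?\d*\.\d+$")
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def infer_column_type(values):
    """Infer a type name for a column of raw string values."""
    present = [v.strip() for v in values if v is not None and v.strip() != ""]
    if not present:
        return "String"
    if all(INT_RE.match(v) for v in present):
        return "Integer"
    # Assumption: integers mixed with decimals are widened to Float.
    if all(INT_RE.match(v) or FLOAT_RE.match(v) for v in present):
        return "Float"
    if all(v.lower() in ("true", "false") for v in present):
        return "Boolean"
    if all(DATE_RE.match(v) for v in present):
        return "Date"
    return "String"  # default fallback


for column in (["1", "2", "-3"], ["1.5", "2"], ["TRUE", "false"], ["2026-05-06"], ["1", "abc"]):
    print(column, "->", infer_column_type(column))
```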

Common Import Scenarios

Single CSV file: One entity type per file. Detect headers, infer types, generate MERGE statements.
Multiple related CSVs: Users provide people.csv and companies.csv. Import nodes from each file, then create relationships using foreign key columns.
JSON API response: Users paste or provide a JSON array from an API. Detect structure, infer labels from field names or type field.
RDF ontology: Users have an existing ontology in Turtle or RDF/XML. Delegate to import_rdf, then report the resulting graph schema.
Re-import after update: Users want to refresh data. Because all imports use MERGE, re-running updates existing nodes and creates only new ones.
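
The re-import behavior can be pictured with an in-memory analogy. This is not how OpenGraphDB implements MERGE; it is a hypothetical dictionary-backed stand-in that shows why keying on a single identifier makes a re-run update existing records and create only the new ones.

```python
def merge_nodes(store, rows, key):
    """Upsert rows into store keyed on `key`; return (created, updated) counts."""
    created = 0
    updated = 0
    for row in rows:
        identifier = row[key]
        if identifier in store:
            store[identifier].update(row)   # existing node: refresh its properties
            updated += 1
        else:
            store[identifier] = dict(row)   # unseen key: create the node
            created += 1
    return created, updated


store = {}
first_run = merge_nodes(
    store,
    [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}],
    "id",
)
second_run = merge_nodes(
    store,
    [{"id": 1, "name": "Ada L."}, {"id": 2, "name": "Grace"}, {"id": 3, "name": "Linus"}],
    "id",
)
print(first_run, second_run, len(store))
```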

Error Handling

If a MERGE statement fails, report the exact error and the offending row.
If type coercion fails (e.g., "abc" in an integer column), skip the row and log a warning.
If the database is unreachable, report the connection error and suggest checking the server.
Never silently skip data. Always report what was imported and what was skipped.
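
A small sketch of the skip-and-report rule for type coercion, assuming rows are dictionaries of strings and the target column should hold integers. The helper name `coerce_int_column` and the shape of the skip report are hypothetical.

```python
def coerce_int_column(rows, column):
    """Coerce one column to int; return (imported_rows, skipped_report)."""
    imported = []
    skipped = []
    for index, row in enumerate(rows):
        value = row.get(column)
        try:
            imported.append({**row, column: int(value)})
        except (TypeError, ValueError) as exc:
            # Never drop a row silently: record which row, which value, and why.
            skipped.append({"row": index, "value": value, "reason": str(exc)})
    return imported, skipped


rows = [{"age": "30"}, {"age": "abc"}, {"age": "41"}]
imported, skipped = coerce_int_column(rows, "age")
print(f"imported {len(imported)} row(s), skipped {len(skipped)} row(s)")
for entry in skipped:
    print(f"  row {entry['row']}: value {entry['value']!r} ({entry['reason']})")
```

The final report to the user would list both counts, so skipped rows are always visible.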

Rules

@rules/format-detection.md: File format detection and schema inference
@rules/import-patterns.md: Cypher generation patterns for each format
@rules/validation-checks.md: Data quality validation and error handling

related-skills.json

same repository

opengraphdb.md

from "asheshgoplani/opengraphdb"

Use when user wants to query, traverse, or build a graph database; embedded or HTTP-served; supports Cypher syntax, vector similarity, RAG, RDF round-trip, and MCP tool catalog. Trigger keywords - graph database, knowledge graph, Cypher query, MCP graph, GraphRAG, vector + graph, property graph, RDF, SHACL, time-travel queries, Neo4j alternative, embedded graph, single-file graph.

2026-05-08

opengraphdb.md

from "asheshgoplani/opengraphdb"

Use when the user wants to query, traverse, build, or evolve a graph database; embedded or HTTP-served; Cypher syntax, vector similarity, full-text, RDF round-trip, time-travel, GraphRAG, or an MCP tool catalog. Trigger on phrases like "graph database", "knowledge graph", "Cypher query", "MCP graph", "GraphRAG", "vector + graph", "property graph", "RDF", "time-travel queries", "Neo4j alternative", "embedded graph", "single-file graph", "Memgraph alternative", "Kuzu alternative", "graph + vector + text", or any task framed as "agent owns the graph end-to-end". Use even when the user does not name OpenGraphDB, if the workload pattern matches load entities + relationships, then query by traversal, similarity, or time. Skip when the workload is a Neo4j cluster (causal cluster, Fabric), a time-series DB, plain key-value, or a managed vector DB where graph traversal is not part of the access pattern.

2026-05-06

graph-explore.md

from "asheshgoplani/opengraphdb"

Use when the user points at an unknown OpenGraphDB graph and asks "what is in here?", "show me the schema", "what entities exist", "how is this graph connected", or wants to navigate a graph they did not build. Trigger on phrases like "explore the graph", "discover schema", "find entry points", "what nodes are connected to X", "summarize this graph", "show me the most connected nodes". Covers five exploration strategies, schema navigation, entry-point selection, and how to descend from a high-level summary to a focused subgraph without overwhelming the user.

2026-05-06

ogdb-cypher.md

from "asheshgoplani/opengraphdb"

Use when generating, optimizing, or debugging Cypher queries against OpenGraphDB. Trigger on phrases like "write a Cypher query", "MATCH", "MERGE", "RETURN", "WHERE", "OpenGraphDB query", "openCypher", "graph query", or any task that requires producing executable Cypher against a known schema. Covers all supported clauses, OpenGraphDB-specific extensions ("AT TIME", "db.index.vector.queryNodes", "db.index.fulltext.queryNodes", "db.index.hybrid.queryNodes"), and common Cypher error patterns. The procedure namespace is "db.*" (matches Neo4j); the older "ogdb.*" form was never shipped.

2026-05-06

schema-advisor.md

from "asheshgoplani/opengraphdb"

Use when the user describes a domain and wants a graph schema, or asks for index recommendations, RDF ontology mapping, or modeling tradeoffs. Trigger on phrases like "design a graph schema for", "what labels and edges should I use", "how should I model this in a graph", "which indexes do I need", "RDF mapping", "URI strategy", "ontology", or any request that converts a domain description into nodes, edges, and property layouts. Covers eight modeling best practices, six common anti-patterns, index selection (B-tree, vector, full-text), and RDF mapping with `_uri` preservation for round-trippable RDF.

2026-05-06

package.json

"author": "asheshgoplani"

"repository": "asheshgoplani/opengraphdb"

