deduplicate
// Detect duplicate entities, duplicate groups, and relationship duplicates in Semantica using fuzzy matching, schema heuristics, and graph similarity.
| name | deduplicate |
| description | Detect duplicate entities, duplicate groups, and relationship duplicates in Semantica using fuzzy matching, schema heuristics, and graph similarity. |
Remove duplicates from the knowledge graph. Usage: /semantica:deduplicate <strategy> [args]
$ARGUMENTS = deduplication strategy (entities or relations) + optional flags such as a threshold or field name.
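As a rough illustration of how a `$ARGUMENTS` string of this shape could be split into a strategy and its flags, here is a minimal sketch using Python's standard `argparse` and `shlex`. The helper name `parse_dedup_args` and the default value of 0.8 are assumptions made for the example; the source does not state defaults or how the skill parses its arguments.

```python
import argparse
import shlex


def parse_dedup_args(raw: str) -> argparse.Namespace:
    """Split a raw $ARGUMENTS string into a strategy and its options."""
    parser = argparse.ArgumentParser(prog="/semantica:deduplicate")
    sub = parser.add_subparsers(dest="strategy", required=True)

    entities = sub.add_parser("entities")
    # The 0.8 defaults are illustrative assumptions, not documented values.
    entities.add_argument("--threshold", type=float, default=0.8)
    entities.add_argument("--field", default=None)

    relations = sub.add_parser("relations")
    relations.add_argument("--similarity", type=float, default=0.8)

    # shlex.split honours shell-style quoting, e.g. --field "full name".
    return parser.parse_args(shlex.split(raw))


args = parse_dedup_args("entities --threshold 0.9 --field name")
print(args.strategy, args.threshold, args.field)
```

Using one subparser per strategy keeps each strategy's flags separate, so `--similarity` is rejected for `entities` and `--threshold` is rejected for `relations`.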
entities [--threshold <score>] [--field <name>]

Detect duplicate entities and group them by similarity.
from semantica.deduplication import DuplicateDetector
finder = DuplicateDetector()
candidates = finder.detect_duplicates(entities, threshold=threshold)
groups = finder.detect_duplicate_groups(entities, threshold=threshold)
Output: duplicate candidate list, duplicate groups, and representative merge recommendations.
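To make the idea of threshold-based candidates and groups more concrete, the following is a standalone sketch that uses only the Python standard library. It is not Semantica's implementation: the similarity measure (`difflib.SequenceMatcher`), the union-find grouping, and the function names `find_candidates` and `group_candidates` are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_candidates(names, threshold):
    """Return (i, j, score) for every pair scoring at or above threshold."""
    pairs = []
    for i, j in combinations(range(len(names)), 2):
        score = similarity(names[i], names[j])
        if score >= threshold:
            pairs.append((i, j, score))
    return pairs


def group_candidates(n, pairs):
    """Merge candidate pairs into groups with a small union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j, _ in pairs:
        parent[find(i)] = find(j)

    groups = {}
    for idx in range(n):
        groups.setdefault(find(idx), []).append(idx)
    # Keep only groups that actually contain duplicates.
    return [members for members in groups.values() if len(members) > 1]


names = ["Acme Corp", "ACME Corp.", "Globex", "Acme Corporation"]
pairs = find_candidates(names, threshold=0.7)
groups = group_candidates(len(names), pairs)
print(pairs)
print(groups)
```

Grouping is transitive in this sketch: "ACME Corp." and "Acme Corporation" score below 0.7 against each other, but both match "Acme Corp", so all three end up in one group. Raising the threshold is the simplest way to limit that kind of chaining.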
relations [--similarity <score>]

Detect duplicate relationships and normalize edge representations.
from semantica.deduplication import DuplicateDetector
finder = DuplicateDetector()
relations = finder.detect_duplicates(relation_list, threshold=similarity)
Output: duplicate relation candidates, normalized relationship groups, and cleanup summary.
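One way to picture "normalizing edge representations" is to reduce each relation to a canonical key before comparing. The sketch below is a standalone illustration, not Semantica's behavior: the dict layout (`source`, `type`, `target`), the `SYMMETRIC_TYPES` set, and both function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical set of relation types treated as direction-free.
SYMMETRIC_TYPES = {"related_to", "sibling_of"}


def canonical_key(relation: dict) -> tuple:
    """Build a comparison key from a {'source', 'type', 'target'} dict."""
    source = relation["source"].strip().lower()
    target = relation["target"].strip().lower()
    rel_type = relation["type"].strip().lower().replace(" ", "_")
    if rel_type in SYMMETRIC_TYPES and target < source:
        source, target = target, source  # order endpoints for symmetric edges
    return (source, rel_type, target)


def group_exact_duplicates(relations):
    """Group relations whose canonical keys collide."""
    buckets = defaultdict(list)
    for rel in relations:
        buckets[canonical_key(rel)].append(rel)
    return {key: rels for key, rels in buckets.items() if len(rels) > 1}


relations = [
    {"source": "Alice", "type": "works_at", "target": "Acme"},
    {"source": "alice ", "type": "Works At", "target": "ACME"},
    {"source": "Bob", "type": "sibling_of", "target": "Alice"},
    {"source": "Alice", "type": "sibling_of", "target": "Bob"},
    {"source": "Acme", "type": "works_at", "target": "Alice"},
]
duplicates = group_exact_duplicates(relations)
for key, rels in duplicates.items():
    print(key, len(rels))
```

Directed types such as `works_at` keep their orientation, so the reversed Acme-to-Alice edge is not grouped with the Alice-to-Acme pair. Exact key matching like this only catches duplicates that differ in case, whitespace, or direction; a similarity score would still be needed for near-matches.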