dynamo-geneid-convert

Name: Dynamo Geneid Convert
Author: aristoteleo

Convert Ensembl-style gene IDs to gene symbols in `dynamo` with `dynamo.preprocessing.convert2gene_symbol` or `dynamo.preprocessing.convert2symbol`, including human and zebrafish IDs, version-suffix stripping, `AnnData.var_names` updates, and optional preprocessing handoff. Use when adapting `docs/tutorials/notebooks/110_geneid_convert_tutorial.ipynb`, standardizing `adata.var_names`, mapping Ensembl IDs to symbols, or doing identifier cleanup before a `Preprocessor` recipe.

stars:500

forks:60

updated: March 19, 2026, 06:17

Dynamo Gene ID Convert

Goal

Convert Ensembl-style identifiers to gene symbols in a way that another agent can actually rerun: choose the correct conversion path, preserve traceability columns, handle species and version-suffix edge cases, and only hand off to Preprocessor after IDs are standardized.

Quick Workflow

Inspect adata.var_names or the raw ID list and determine whether the user needs a batch table or an in-place AnnData update.
Strip version suffixes such as .1 into a separate query column, and judge mapping quality only on the stripped IDs.
Use convert2gene_symbol(...) when you need an explicit mapping table, reproducible merge logic, or non-human species control.
Use convert2symbol(adata, ...) only when in-place AnnData mutation is the right abstraction and the prefix / scopes rules are satisfied.
Keep the original identifier in adata.var, set adata.var_names from symbol only after validating duplicates and missing mappings, and run preprocessing afterward, not before.
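
The suffix-stripping step in the workflow above can be sketched in plain Python, without `dynamo` installed. `strip_version` and `build_query_table` are hypothetical helper names, not part of `dynamo`, and the sketch assumes a version suffix is a final dot followed by digits. The input IDs are illustrative.

```python
import re

# Hypothetical helpers, not part of dynamo. Assumption: a version suffix is a
# trailing "." followed only by digits, e.g. "ENSDARG00000035558.1".
_VERSION_SUFFIX = re.compile(r"\.\d+$")


def strip_version(gene_id: str) -> str:
    """Return the identifier without a trailing numeric version suffix."""
    return _VERSION_SUFFIX.sub("", gene_id.strip())


def build_query_table(gene_ids):
    """Pair each original ID with the stripped query used for lookup."""
    return [{"ensembl_id": g, "query": strip_version(g)} for g in gene_ids]


# Illustrative inputs: one versioned zebrafish ID, one unversioned human ID.
table = build_query_table(["ENSDARG00000035558.1", "ENSG00000141510"])
for row in table:
    print(row["ensembl_id"], "->", row["query"])
```

Unlike a plain `split(".")[0]`, the anchored pattern only removes a purely numeric final component, so an identifier that contains a dot elsewhere is left alone. Whether that stricter rule suits a given dataset is a judgment call.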

Interface Summary

convert2gene_symbol(input_names, scopes='ensembl.gene', ensembl_release=None, species=None, force_rebuild=False) returns a DataFrame indexed by query with at least symbol and _score.
The live source uses vendored pyensembl data access, not a remote MyGene batch API.
convert2symbol(adata, scopes=None, subset=True) updates adata.var, adds query plus symbol, and can rewrite adata.var_names in place.
Preprocessor.preprocess_adata(adata, recipe='monocle', tkey=None, experiment_type=None) is only the downstream handoff. Current source exposes five recipe branches: monocle, seurat, sctransform, pearson_residuals, monocle_pearson_residuals.

Read references/source-grounding.md before documenting parameters more narrowly than the notebook does.

Conversion Path Selection

Use convert2gene_symbol(...) as the default for notebook conversion work. It is the safest path when you need explicit species, ensembl_release, manual merge logic, or per-ID validation.
Use convert2symbol(adata) for human or mouse Ensembl-style adata.var_names when direct in-place conversion is acceptable.
Use convert2symbol(adata, scopes='ensembl.gene') for zebrafish or other non-human prefixes if you still want the in-place helper.
Do not trust the notebook's generic scopes explanation alone. In current source, convert2gene_symbol keeps scopes only for compatibility, while convert2symbol still branches on scopes and prefix heuristics.
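
The selection rules above can be written down as a small decision helper. This is a sketch under stated assumptions: `choose_conversion_call` is a hypothetical name, the prefix pattern covers only the standard human and mouse gene and transcript stable-ID prefixes, and it does not reproduce the prefix heuristics inside `convert2symbol`. It returns a label for the suggested call rather than invoking `dynamo`.

```python
import re

# Hypothetical helper that mirrors the selection rules in the text; it is not
# dynamo's internal logic. The digit after the prefix matters: without it,
# IDs from other species whose prefix merely starts with "ENSG" or "ENST"
# would be misread as human.
_HUMAN_MOUSE = re.compile(r"^(ENSG|ENST|ENSMUSG|ENSMUST)\d")
_ANY_ENSEMBL = re.compile(r"^ENS[A-Z]*\d")

TABLE_PATH = "convert2gene_symbol(ids, species=..., ensembl_release=...)"
IN_PLACE_DEFAULT = "convert2symbol(adata)"
IN_PLACE_SCOPED = "convert2symbol(adata, scopes='ensembl.gene')"


def choose_conversion_call(var_names, need_mapping_table=False):
    """Return a label for the conversion call suggested by the rules above."""
    names = list(var_names)
    if not names:
        raise ValueError("var_names is empty")
    if need_mapping_table:
        return TABLE_PATH
    if all(_HUMAN_MOUSE.match(n) for n in names):
        return IN_PLACE_DEFAULT
    if all(_ANY_ENSEMBL.match(n) for n in names):
        return IN_PLACE_SCOPED
    # Assumption: mixed or non-Ensembl names fall back to the explicit table
    # path so every ID can be inspected individually.
    return TABLE_PATH


print(choose_conversion_call(["ENSG00000141510"]))
print(choose_conversion_call(["ENSDARG00000035558"]))
```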

Minimal Execution Patterns

For an explicit mapping table and controlled merge:

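import dynamo as dyn

adata = dyn.sample_data.hematopoiesis_raw()

# Keep the original identifier, then strip any version suffix for lookup.
adata.var["ensembl_id"] = adata.var_names
adata.var["query"] = adata.var_names.str.split(".").str[0]

mapping = dyn.preprocessing.convert2gene_symbol(
    adata.var["query"].tolist(),
    species="human",
)
# Keep one row per query so the left merge cannot add rows.
mapping = mapping[~mapping.index.duplicated(keep="first")]

# Left-merge on the stripped ID, then restore the original var index.
adata.var = (
    adata.var
    .merge(mapping, left_on="query", right_index=True, how="left")
    .set_index(adata.var.index)
)

# Drop unmapped genes, assign symbols, and make duplicate symbols unique.
mapped = adata.var["symbol"].notna()
adata = adata[:, mapped].copy()
adata.var_names = adata.var["symbol"].astype(str)
adata.var_names_make_unique()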

For in-place conversion on supported prefixes:

import dynamo as dyn

adata = dyn.sample_data.hematopoiesis_raw()
# Preserve the original IDs before the helper rewrites adata.var_names in place.
adata.var["ensembl_id"] = adata.var_names
dyn.preprocessing.convert2symbol(adata, subset=True)

For zebrafish IDs with version suffixes:

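import dynamo as dyn

# Batch lookup with the release passed explicitly rather than left to the default.
result = dyn.preprocessing.convert2gene_symbol(
    ["ENSDARG00000035558.1"],
    ensembl_release=77,
)

# Or, if the user wants in-place AnnData mutation. Here adata is an existing
# AnnData whose var_names are zebrafish Ensembl IDs.
dyn.preprocessing.convert2symbol(adata, scopes="ensembl.gene")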

Optional Preprocess Integration

Perform gene-ID conversion before preprocessing unless the user explicitly wants to preserve notebook timing.
If the user proceeds into preprocessing, default to recipe='monocle' unless they ask for a different branch.
If the user requests Pearson residuals but still cares about downstream velocity-safe layers, prefer monocle_pearson_residuals.

Read references/preprocess-handoff.md before choosing a non-default recipe.
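
The recipe rules above fit in a few lines. The five recipe names are taken from the Interface Summary; `choose_recipe` and its `needs_velocity_layers` flag are hypothetical and only encode the defaults stated in this section.

```python
# Recipe names as listed in the Interface Summary. The helper is hypothetical
# and not part of dynamo.
RECIPES = (
    "monocle",
    "seurat",
    "sctransform",
    "pearson_residuals",
    "monocle_pearson_residuals",
)


def choose_recipe(requested=None, needs_velocity_layers=False):
    """Pick a recipe name following the defaults described above."""
    if requested is None:
        return "monocle"
    if requested not in RECIPES:
        raise ValueError(f"unknown recipe {requested!r}; expected one of {RECIPES}")
    if requested == "pearson_residuals" and needs_velocity_layers:
        # Prefer the combined branch when downstream velocity layers matter.
        return "monocle_pearson_residuals"
    return requested


print(choose_recipe())
print(choose_recipe("pearson_residuals", needs_velocity_layers=True))
```

The returned name would then be passed as `recipe=` to `Preprocessor.preprocess_adata(...)`, after symbol assignment is settled.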

Validation

After conversion, check these items:

adata.var["query"] stores the version-stripped identifier actually used for lookup.
adata.var["symbol"] exists and the mapping rate is acceptable for the dataset.
The original ID remains preserved, for example in adata.var["ensembl_id"].
adata.var_names contains symbols only after duplicate and null handling is explicit.
Representative conversions still match live source behavior: ENSG00000141510 -> TP53 and ENSDARG00000035558(.1) -> gps2 with ensembl_release=77.
If preprocessing follows, Preprocessor.preprocess_adata(..., recipe=...) should run only after symbol assignment is settled.
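
Several of the checks above reduce to counting. The sketch below works on two plain lists, for example `adata.var["query"].tolist()` and `adata.var["symbol"].tolist()`. `summarize_mapping` is a hypothetical helper, and what counts as an acceptable mapping rate is left to the caller. The second ID in the example is a made-up placeholder standing in for an unmapped gene.

```python
from collections import Counter


def summarize_mapping(queries, symbols):
    """Report mapping rate, unmapped queries, and duplicated symbols.

    A symbol counts as missing unless it is a non-empty string, which also
    covers None and the float NaN that pandas uses for missing values.
    """
    if len(queries) != len(symbols):
        raise ValueError("queries and symbols must have the same length")

    def is_mapped(symbol):
        return isinstance(symbol, str) and symbol != ""

    mapped = [s for s in symbols if is_mapped(s)]
    counts = Counter(mapped)
    return {
        "mapping_rate": len(mapped) / len(symbols) if symbols else 0.0,
        "unmapped": [q for q, s in zip(queries, symbols) if not is_mapped(s)],
        "duplicate_symbols": sorted(s for s, n in counts.items() if n > 1),
    }


report = summarize_mapping(
    ["ENSG00000141510", "ENSG00000000000"],  # second ID is a made-up placeholder
    ["TP53", None],
)
print(report)
```

Running this before assigning `adata.var_names` makes the duplicate and null handling an explicit decision instead of a side effect of the merge.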

Constraints

Do not describe conversion as MyGene-backed just because older prose or notebook wording suggests that pattern; current source uses vendored pyensembl.
Do not assume convert2symbol(adata) auto-detects every species. In current source it auto-classifies some human / mouse gene or transcript prefixes, but zebrafish IDs raise an error unless scopes is passed explicitly.
Do not assume the docstring's ensembl_release default is authoritative. Current code assigns 77 when the argument is omitted.
The first conversion run may install polars or download / index Ensembl data, so cold-start execution can be slower than the notebook suggests.

Resource Map

Read references/source-grounding.md for inspected signatures, live-source behavior, and branch coverage.
Read references/conversion-paths.md for human vs zebrafish decision rules and scopes handling.
Read references/preprocess-handoff.md for the downstream recipe branches.
Read references/source-notebook-map.md to see which notebook sections were preserved or intentionally dropped.
Read references/compatibility.md when notebook wording and current source behavior appear to disagree.

related-skills.json

same repository

dynamo-in-silico-perturbation.md

from "aristoteleo/dynamo-release"

Perform in silico gene perturbation on a dynamo vector-field AnnData to predict cell fate diversion after single or multi-gene activation or suppression, then visualize the results with streamline or quiver plots. Use when running dyn.pd.perturbation, predicting transcription factor perturbation effects, simulating gene knockdown or overexpression in scRNA-seq data, reproducing 502_perturbation_tutorial.ipynb, or choosing among pertubation_method, perturb_mode, and emb_basis branches.

updated: 2026-03-19, stars: 500

dynamo-lap-cell-fate-transition.md

from "aristoteleo/dynamo-release"

Compute least action paths (LAP) between hematopoietic or general cell types in a dynamo vector-field AnnData, then rank transcription factors by MSD along each path and evaluate predictions via ROC analysis. Use when running dyn.pd.compute_cell_type_transitions, predicting optimal cell fate conversion trajectories, prioritizing transcription factor cocktails for cell reprogramming, or reproducing the 501_lap_tutorial.ipynb workflow on scNT-seq or metabolic-labeling data.

updated: 2026-03-19, stars: 500

dynamo-conventional-rna-velocity.md

from "aristoteleo/dynamo-release"

Run or adapt a conventional spliced/unspliced RNA velocity workflow in `dynamo`, including `Preprocessor` preprocessing, `dynamics`, low-dimensional `cell_velocities`, `VectorField`, topology / potential analysis, confidence-based correction, fate prediction, and optional animation. Use when analyzing conventional scRNA-seq `AnnData`, reproducing or adapting tutorial notebooks such as `200_zebrafish.ipynb`, or selecting between preprocessing, kinetics, vector-field, and fate stages for a reusable velocity pipeline.

updated: 2026-03-19, stars: 500

dynamo-differential-geometry-analysis.md

from "aristoteleo/dynamo-release"

Run downstream differential-geometry analysis on a `dynamo` vector-field `AnnData`, including velocity, acceleration, curvature, Jacobian, regulatory-network, ddhodge pseudotime, and state-graph branches. Use when adapting the `403_Differential_geometry.ipynb` tutorial, extending a conventional spliced/unspliced RNA velocity workflow into vector calculus, or choosing among `method`, `mode`, `sampling`, `formula`, `adjmethod`, or `gene_order_method` branches.

updated: 2026-03-19, stars: 500

dynamo-lineage-appearance-analysis.md

from "aristoteleo/dynamo-release"

Compare lineage appearance timing and its regulators on a precomputed `dynamo` vector-field `AnnData` using topography, graph potentials, Jacobian, and vector-calculus outputs. Use when checking whether one lineage appears earlier than its peers, curating fixed points, analyzing regulator pairs on a downstream-ready vector field, or adapting `400_tutorial_hsc_dynamo_megakaryocytes_appearance.ipynb`.

updated: 2026-03-19, stars: 500

dynamo-one-shot-total-rna-velocity.md

from "aristoteleo/dynamo-release"

Run or adapt a one-shot total RNA velocity workflow in `dynamo` for metabolic-labeling or scNT-seq `AnnData`, including monocle preprocessing with an optional curated gene list, grouped moments by labeling time, Model-2 `dynamics`, `calculate_velocity_alpha_minus_gamma_s`, low-dimensional projection with `cell_velocities`, and optional streamline or phase-portrait plotting. Use when converting tutorials such as `301_tutorial_hsc_velocity.ipynb`, or when choosing between `one_shot_method` branches like `sci_fate` and `combined` and projection `method` branches like `cosine` and `pearson`.

updated: 2026-03-19, stars: 500

package.json

"author": "aristoteleo"

"repository": "aristoteleo/dynamo-release"
