biomarker-database-analysis

Name: Biomarker Database Analysis
Author: aws-samples

Description: Use when a researcher needs to query biomedical databases for biomarker discovery, build target profiles from UniProt/Open Targets/STRING, rank biomarker candidates by evidence strength, or generate SQL queries against clinical genomic databases.

Stars: 260
Forks: 101
Updated: May 19, 2026 at 16:11

name	biomarker-database-analysis
description	Use when a researcher needs to query biomedical databases for biomarker discovery, build target profiles from UniProt/Open Targets/STRING, rank biomarker candidates by evidence strength, or generate SQL queries against clinical genomic databases.

Biomarker Database Analysis

When to use this skill

Researcher asks to find biomarkers associated with a disease or cancer type
Query clinical genomic databases for survival, gene expression, or mutation data
Build protein/target profiles from biomedical databases
Rank biomarker candidates by statistical evidence (p-value, effect size)
Generate or optimize SQL for biomarker data retrieval

MCP Servers Used

biomni-research — for external biomedical database queries (UniProt, Open Targets, STRING, ClinVar)
Clinical genomic databases may use separate tools (Redshift/Athena) depending on deployment

Workflow: Query Clinical Genomic Database

Step 1: Understand the schema before querying

Always retrieve the database schema first to understand available tables and columns.

Tool: get_schema
Purpose: Retrieve table names, column names, data types, and descriptions

Key columns in a typical clinical genomic table:

case_id -- patient identifier
survival_status -- alive/dead (boolean or 0/1)
survival_duration -- follow-up time to death or last contact, in days or years (confirm the unit in the schema)
Gene expression columns (e.g., gdf15, lrig1, cdh2, postn, vcan)
Clinical metadata: age_at_histological_diagnosis, smoking_status, chemotherapy, histology
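
As a minimal sketch of why the schema comes first: assuming the get_schema result can be reduced to a mapping from table name to column names (the tool's actual return shape is not specified here, so `schema` and `missing_columns` below are hypothetical), a small helper can flag column names that the schema does not contain before any SQL is written.

```python
# Hypothetical stand-in for a get_schema result; the real tool's output format may differ.
schema = {
    "clinical_genomic": [
        "case_id", "survival_status", "survival_duration",
        "gdf15", "lrig1", "cdh2", "postn", "vcan",
        "age_at_histological_diagnosis", "smoking_status",
        "chemotherapy", "histology",
    ]
}


def missing_columns(schema, table, requested):
    """Return the requested column names that the schema does not list for `table`."""
    known = set(schema.get(table, []))
    return [col for col in requested if col not in known]


# "survival_time" is not a schema column, so it is reported back.
print(missing_columns(schema, "clinical_genomic", ["gdf15", "survival_time"]))
```

An empty result means every requested column exists exactly as named in the schema.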

Step 2: Formulate and refine the SQL query

Decision tree for query type:

Patient demographics -> Simple SELECT with WHERE/GROUP BY
Biomarker expression -> SELECT gene columns with clinical filters
Survival correlation -> SELECT survival_status, survival_duration, biomarker columns
Cohort comparison -> GROUP BY with aggregation (COUNT, AVG)
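
The decision tree above could be sketched as a small query builder. This is an illustration under assumptions, not the skill's implementation: the function name `build_query`, its parameters, and the `num_rows`/`avg_` aliases are hypothetical, and the table name `clinical_genomic` is taken from the query patterns in this document.

```python
def build_query(query_type, genes=(), where=None, group_by=None, table="clinical_genomic"):
    """Return a single-line SQL string for one of the four query types."""
    if query_type in ("demographics", "cohort") and not group_by:
        raise ValueError(f"{query_type} queries need a group_by column")

    if query_type == "demographics":
        columns = [group_by, "COUNT(DISTINCT case_id) AS num_patients"]
    elif query_type == "expression":
        columns = list(genes)
    elif query_type == "survival":
        columns = ["survival_status", "survival_duration", *genes]
    elif query_type == "cohort":
        columns = [group_by, "COUNT(*) AS num_rows",
                   *(f"AVG({g}) AS avg_{g}" for g in genes)]
    else:
        raise ValueError(f"unknown query type: {query_type}")
    if not columns:
        raise ValueError("no columns selected")

    # `where` is interpolated as-is, so it must come from trusted code, not user input.
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if where:
        sql += f" WHERE {where}"
    if query_type in ("demographics", "cohort"):
        sql += f" GROUP BY {group_by}"
    return sql


print(build_query("survival", genes=("gdf15", "lrig1"), where="chemotherapy = 'Yes'"))
```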

Rules:

Write queries as single lines (no newlines)
Never modify column names from the schema
Use aggregation (COUNT, AVG, GROUP BY) to reduce output size
Always validate with refine_sql before execution

Tool: refine_sql
Input: sql (the query), question (rationale for this step -- not the user's original question)
Purpose: Optimize for efficiency, add aggregation, fix column references
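
A minimal sketch of preparing a refine_sql call, assuming the tool accepts its two inputs as a plain mapping. The parameter names `sql` and `question` come from the description above; the helper names and the dictionary shape are assumptions.

```python
def to_single_line(sql):
    """Collapse newlines and repeated whitespace so the query fits on one line.

    Note: this also collapses whitespace inside string literals, which is
    acceptable for simple filters but not for values with significant spacing.
    """
    return " ".join(sql.split())


def refine_sql_args(sql, rationale):
    """Build the arguments for refine_sql; `question` carries the step rationale."""
    return {"sql": to_single_line(sql), "question": rationale}


draft = """
SELECT smoking_status, COUNT(DISTINCT case_id) AS num_patients
FROM clinical_genomic
GROUP BY smoking_status
"""
print(refine_sql_args(draft, "Count patients per smoking status to size each cohort"))
```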

Step 3: Execute and interpret results

Tool: query_redshift (or query_database)
Input: The refined SQL query
Output: Row-level results from the clinical database
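
One way to start interpreting row-level output is a descriptive check: split patients at the median expression of a single gene and compare the fraction of deaths in each half. The sketch below assumes rows arrive as (survival_status, expression) pairs with status already coded 0/1; the function name and the synthetic rows are invented for illustration. It is a crude first look, not a substitute for the Cox regression named in Step 5.

```python
from statistics import median


def event_rate_by_expression(rows):
    """rows: iterable of (survival_status, expression), with 1 = dead and 0 = alive."""
    rows = list(rows)
    cutoff = median(expr for _, expr in rows)
    high = [status for status, expr in rows if expr > cutoff]
    low = [status for status, expr in rows if expr <= cutoff]

    def rate(group):
        # Fraction of deaths in the group; NaN when the group is empty.
        return sum(group) / len(group) if group else float("nan")

    return {"cutoff": cutoff, "high": rate(high), "low": rate(low)}


# Synthetic rows, invented for illustration only.
rows = [(1, 8.2), (1, 7.9), (0, 3.1), (0, 2.8), (1, 6.5), (0, 4.0)]
print(event_rate_by_expression(rows))
```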

Step 4: Build target profiles from external databases

For deeper biomarker validation, use the biomni-research MCP server with natural language queries:

Database	Query approach
UniProt	"CDK4 protein function, domains, post-translational modifications"
Open Targets	"CDK4 disease associations and genetic evidence scores"
STRING	"CDK4 protein-protein interaction network"
ClinVar	"CDK4 pathogenic variants and clinical significance"
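
The query phrasings in the table generalize to any gene symbol. A minimal sketch, assuming the natural-language strings are simply passed on to the biomni-research server (how that call is made is deployment-specific and not shown); `QUERY_TEMPLATES` and `profile_queries` are hypothetical names.

```python
# Templates copied from the table above, with the gene symbol made a placeholder.
QUERY_TEMPLATES = {
    "UniProt": "{gene} protein function, domains, post-translational modifications",
    "Open Targets": "{gene} disease associations and genetic evidence scores",
    "STRING": "{gene} protein-protein interaction network",
    "ClinVar": "{gene} pathogenic variants and clinical significance",
}


def profile_queries(gene):
    """Return one natural-language query per database for the given gene symbol."""
    return {db: template.format(gene=gene) for db, template in QUERY_TEMPLATES.items()}


for database, query in profile_queries("GDF15").items():
    print(f"{database}: {query}")
```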

Step 5: Rank candidates by evidence strength

Scoring framework for biomarker prioritization:

Evidence type	Weight	Source
Statistical significance (p < 0.05)	High	Cox regression from clinical data
Known pathogenic association	High	ClinVar, Open Targets
Protein interaction in disease network	Medium	STRING (confidence > 0.7)
Literature support (3+ publications)	Medium	PubMed
Gene expression differential	Medium	Clinical database
Functional annotation match	Low	UniProt
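
The table gives qualitative weights only. As a sketch of how they might be turned into a ranking, the code below assumes a numeric mapping of High = 3, Medium = 2, Low = 1 and a simple additive score; both are assumptions, not part of the skill. The evidence keys and the candidate names (`gene_a`, `gene_b`, `gene_c`) are placeholders.

```python
# Assumed numeric mapping for the qualitative weights in the table above.
WEIGHT_POINTS = {"High": 3, "Medium": 2, "Low": 1}

EVIDENCE_WEIGHTS = {
    "statistical_significance": "High",   # p < 0.05, Cox regression
    "pathogenic_association": "High",     # ClinVar, Open Targets
    "protein_interaction": "Medium",      # STRING, confidence > 0.7
    "literature_support": "Medium",       # 3+ publications
    "expression_differential": "Medium",  # clinical database
    "functional_annotation": "Low",       # UniProt
}


def evidence_score(evidence):
    """Sum the points of every evidence type present for one candidate."""
    return sum(WEIGHT_POINTS[EVIDENCE_WEIGHTS[item]] for item in evidence)


def rank_candidates(candidates):
    """Sort candidate names by descending score, breaking ties alphabetically."""
    return sorted(candidates, key=lambda name: (-evidence_score(candidates[name]), name))


# Placeholder candidates; the evidence sets are invented for illustration.
candidates = {
    "gene_a": {"statistical_significance", "literature_support"},
    "gene_b": {"functional_annotation"},
    "gene_c": {"protein_interaction", "expression_differential", "functional_annotation"},
}
print(rank_candidates(candidates))
```

An additive score treats evidence types as independent; a deployment may prefer a different combination rule.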

Query Patterns

Retrieve survival outcomes and candidate biomarker expression for chemotherapy-treated patients:

SELECT survival_status, survival_duration, gdf15, lrig1, cdh2, postn, vcan FROM clinical_genomic WHERE chemotherapy = 'Yes'

Cohort demographics:

SELECT smoking_status, COUNT(DISTINCT case_id) AS num_patients FROM clinical_genomic WHERE age_at_histological_diagnosis > 50 GROUP BY smoking_status

Disease-specific survival counts:

SELECT survival_status, COUNT(*) AS count FROM clinical_genomic WHERE histology = 'Adenocarcinoma' GROUP BY survival_status
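
To try the aggregate patterns above without access to a clinical database, a local stand-in is enough. The sketch below uses Python's built-in sqlite3 with an in-memory table and synthetic rows; SQLite is only a substitute for Redshift or Athena, whose dialects differ in places, and the histology filter is written with a `?` placeholder instead of a literal.

```python
import sqlite3

# In-memory SQLite stands in for the clinical database; all rows are synthetic.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE clinical_genomic ("
    "case_id TEXT, smoking_status TEXT, age_at_histological_diagnosis INTEGER, "
    "histology TEXT, survival_status INTEGER)"
)
conn.executemany(
    "INSERT INTO clinical_genomic VALUES (?, ?, ?, ?, ?)",
    [
        ("c1", "Current", 62, "Adenocarcinoma", 1),
        ("c2", "Former", 55, "Adenocarcinoma", 0),
        ("c3", "Former", 71, "Squamous cell carcinoma", 1),
        ("c4", "Never", 45, "Adenocarcinoma", 0),
    ],
)

demographics_sql = "SELECT smoking_status, COUNT(DISTINCT case_id) AS num_patients FROM clinical_genomic WHERE age_at_histological_diagnosis > 50 GROUP BY smoking_status"
histology_sql = "SELECT survival_status, COUNT(*) AS count FROM clinical_genomic WHERE histology = ? GROUP BY survival_status"

# GROUP BY does not guarantee row order, so sort before comparing or printing.
demographics = sorted(conn.execute(demographics_sql).fetchall())
by_status = sorted(conn.execute(histology_sql, ("Adenocarcinoma",)).fetchall())
print(demographics)
print(by_status)
```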

Key Conventions

Map survival_status: False/Alive = 0, True/Dead = 1
Expression values are continuous (higher = more expressed in tumor)
Always include quality filters (for example, excluding rows with missing survival or expression values) and use parameterized queries when available
Store query results in shared memory for downstream agents (statistician, pathway analyst)
When results exceed 100 rows, summarize with aggregation before presenting to user
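
Two of these conventions are easy to encode. The sketch below normalizes survival_status to 0/1 and flags result sets that exceed the 100-row threshold; the helper names are hypothetical, and the accepted spellings are an assumption about how the column may be stored.

```python
ROW_LIMIT = 100  # threshold from the convention above

# Accepted spellings are an assumption; extend to match the actual schema.
_STATUS_MAP = {"false": 0, "alive": 0, "0": 0, "true": 1, "dead": 1, "1": 1}


def encode_survival_status(value):
    """Map False/Alive to 0 and True/Dead to 1, rejecting anything else."""
    key = str(value).strip().lower()
    if key not in _STATUS_MAP:
        raise ValueError(f"unrecognized survival_status: {value!r}")
    return _STATUS_MAP[key]


def needs_summary(rows, limit=ROW_LIMIT):
    """True when a result set is large enough that it should be aggregated first."""
    return len(rows) > limit


print([encode_survival_status(v) for v in (True, "Alive", "Dead", 0)])
print(needs_summary(list(range(101))))
```

Raising on unknown values, instead of defaulting to 0 or 1, keeps miscoded rows from silently entering a survival analysis.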

related-skills.json

same repository

biomarker-multi-agent-discovery.md

from "aws-samples/amazon-bedrock-agents-healthcare-lifesciences"

Use when orchestrating a multi-agent biomarker discovery workflow that requires coordinating database queries, pathway analysis, literature review, statistical modeling, and clinical evidence synthesis to produce ranked biomarker panels.

Updated 2026-05-19, 260 stars

biomarker-pathway-analysis.md

from "aws-samples/amazon-bedrock-agents-healthcare-lifesciences"

Use when a researcher needs to analyze biological pathways for biomarker discovery, map disease mechanisms to druggable targets using Reactome/KEGG, identify pathway enrichment from gene sets, or understand mechanism-of-action for candidate biomarkers.

Updated 2026-05-19, 260 stars

genomics-variant-interpretation.md

from "aws-samples/amazon-bedrock-agents-healthcare-lifesciences"

Use when interpreting genomic variants from VCF files, performing clinical variant classification using ClinVar/VEP annotations, analyzing allele frequencies against population data (1000 Genomes), or generating clinical reports for genetic counseling.

Updated 2026-05-19, 260 stars

hcls-build-agent.md

from "aws-samples/amazon-bedrock-agents-healthcare-lifesciences"

Use when a developer wants to build a new healthcare or life sciences agent, structure tools and system prompts for an HCLS workflow, or create a Strands agent with domain-specific capabilities. Also use when someone asks about agent architecture, tool design, or system prompt patterns for clinical, genomics, or drug discovery use cases.

Updated 2026-05-19, 260 stars

hcls-deploy-agent.md

from "aws-samples/amazon-bedrock-agents-healthcare-lifesciences"

Use when a developer wants to deploy an HCLS agent to Amazon Bedrock AgentCore, configure Gateway tools as MCP endpoints, set up authentication with Cognito, configure memory, or register an agent in the AgentCore Registry. Also use for deployment troubleshooting.

Updated 2026-05-19, 260 stars

hcls-get-started.md

from "aws-samples/amazon-bedrock-agents-healthcare-lifesciences"

Use when a developer asks how to get started building healthcare or life sciences agents, wants to understand the HCLS Agents Toolkit, or asks what's available in this repository. Also use when someone asks about genomics, drug discovery, clinical trials, or biomarker agents on AWS.

Updated 2026-05-19, 260 stars

package.json

"author": "aws-samples"

"repository": "aws-samples/amazon-bedrock-agents-healthcare-lifesciences"
