name	multimodal-medical-imaging
description	Analyzes medical images (X-ray, MRI, CT) using multimodal LLMs to identify anomalies and generate reports.

Multimodal Medical Imaging Analysis

The Multimodal Medical Imaging Analysis Skill uses Vision-Language Models (VLMs) such as Gemini 1.5 Pro and GPT-4o to interpret medical images alongside clinical text.

When to Use This Skill

To run a preliminary screening of medical images.
To correlate visual findings with textual clinical notes.
To generate structured reports (DICOM-SR-like) from raw images.

Core Capabilities

Anomaly Detection: Identify potential pathologies in X-ray, CT, and MRI images.
Report Generation: Draft radiology reports in standard formats.
VQA (Visual Question Answering): Answer specific questions about an image (e.g., "Is there a fracture in the left femur?").

Workflow

Input: Provide an image file path (JPG, PNG) and a specific clinical question or "generate report" instruction.
Analyze: The agent sends the image and prompt to the VLM.
Output: The agent returns a JSON object containing findings, confidence scores, and reasoning.
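
The output step can be consumed along these lines. This is a minimal sketch, not the skill's actual schema: the skill only says the JSON holds findings, confidence scores, and reasoning, so the key names `findings`, `confidence`, `label`, and `reasoning`, the `parse_result` helper, and the 0.5 threshold are all assumptions made for illustration.

```python
import json

# Assumed key names; the skill does not document its exact JSON schema.
REQUIRED_KEYS = ("findings", "reasoning")


def parse_result(raw: str, min_confidence: float = 0.5) -> dict:
    """Parse the agent's JSON output and keep findings at or above a threshold."""
    result = json.loads(raw)
    if not isinstance(result, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in result]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    # A finding without a confidence score is treated as confidence 0.0.
    kept = [
        finding
        for finding in result["findings"]
        if float(finding.get("confidence", 0.0)) >= min_confidence
    ]
    return {"findings": kept, "reasoning": result["reasoning"]}


if __name__ == "__main__":
    # Made-up payload for illustration only, not real model output.
    sample = json.dumps({
        "findings": [
            {"label": "consolidation", "confidence": 0.82},
            {"label": "pleural effusion", "confidence": 0.31},
        ],
        "reasoning": "placeholder reasoning text",
    })
    print(parse_result(sample))
```

Filtering on a threshold keeps low-confidence findings out of a draft report while the reasoning text is passed through unchanged for human review.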

Example Usage

User: "Analyze this chest X-ray for pneumonia."

Agent Action:

python3 Skills/Clinical/Medical_Imaging/Multimodal_Analysis/multimodal_agent.py \
    --image "/path/to/cxr.jpg" \
    --prompt "Check for signs of pneumonia and consolidation."
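
The same invocation can be wrapped from Python. The sketch below reuses the script path and the `--image` and `--prompt` flags from the command above; the `build_command` and `run_agent` helpers, the acceptance of `.jpeg` alongside JPG and PNG, and the assumption that the script prints its JSON result to standard output are illustrative and not confirmed by the skill.

```python
import subprocess
import sys
from pathlib import Path

# Script path and flags are taken from the example command.
SCRIPT = "Skills/Clinical/Medical_Imaging/Multimodal_Analysis/multimodal_agent.py"
# The workflow lists JPG and PNG; accepting ".jpeg" as well is an assumption.
ALLOWED_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_command(image_path: str, prompt: str) -> list:
    """Validate the inputs and return the argument list for the agent script."""
    suffix = Path(image_path).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"unsupported image type: {suffix or '(none)'}")
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    # sys.executable stands in for "python3" so the current interpreter is used.
    return [sys.executable, SCRIPT, "--image", image_path, "--prompt", prompt]


def run_agent(image_path: str, prompt: str) -> str:
    """Run the script and return its standard output (assumed to be the JSON result)."""
    completed = subprocess.run(
        build_command(image_path, prompt),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout
```

Passing the arguments as a list rather than a shell string avoids quoting problems when the prompt contains spaces or quotation marks.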
