Skip to main content
Run any Skill in Manus
with one click

blip-2-vision-language

Stars9,996
Forks745
UpdatedNovember 25, 2025 at 23:28

Vision-language pre-training framework bridging frozen image encoders and LLMs. Use when you need image captioning, visual question answering, image-text retrieval, or multimodal chat with state-of-the-art zero-shot performance.

Installation

Install with Codex or Claude Copy this prompt, paste it into Codex, Claude, or another assistant, and let it review the skill page and install it for you.

File Explorer
3 files
SKILL.md
readonly