linkfox-multimodal-recognize-image

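Image recognition and analysis based on multimodal AI. Trigger this skill when the user wants to analyze, describe, or extract information from an image URL (image recognition, image analysis, image description, image content understanding, OCR text recognition, visual Q&A). Also trigger it when the user mentions recognizing or analyzing images, describing an image, identifying image content, analyzing product images, reading text from an image, extracting visual content, or understanding the content of a photo. When the user provides an image URL and asks about its visual content, trigger this skill even if they do not explicitly say "image recognition".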

Install command

npx skills add https://github.com/linkfox-ai/linkfox-skills --skill linkfox-multimodal-recognize-image

Copy and paste this command into Claude Code to install the skill

Source

linkfox-ai/linkfox-skills

Stars: 6

Forks: 1

Updated: May 27, 2026 at 05:33

Image Recognition

This skill guides you on how to use the multimodal image recognition API to analyze images from URLs and extract meaningful information based on user intent.

Core Concepts

The Image Recognition tool accepts an image URL and an optional natural-language requirement describing what the user wants to know about the image. The backend uses a multimodal AI model to interpret the visual content and return a textual description or analysis.

Supported formats: JPG, JPEG, PNG, GIF, WebP, BMP.

How it works: You provide a publicly accessible image URL and a requirement (what you want to learn from the image). The service downloads the image, runs multimodal analysis, and returns a text-based result.

Parameter Guide

Parameter	Required	Description
imageUrl	Yes	A publicly accessible URL pointing to the image. Must be JPG, JPEG, PNG, GIF, WebP, or BMP. Maximum 1000 characters.
requirement	No	A natural-language description of what to identify or analyze in the image. Defaults to "Describe the content of this image" when omitted. Maximum 1000 characters.
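
The limits in the table can be checked before any call is made. The following is a minimal sketch, not the skill's own implementation: the helper name `build_params` is hypothetical, and checking the format by the URL's file extension is an assumption (the service may inspect the downloaded content instead).

```python
import os
from urllib.parse import urlparse

SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp"}
MAX_LEN = 1000  # documented limit for both imageUrl and requirement


def build_params(image_url, requirement=None):
    """Validate inputs against the documented limits and return a params dict.

    Hypothetical helper: the name and the extension-based check are
    assumptions made for this sketch.
    """
    if not image_url:
        raise ValueError("imageUrl is required")
    if len(image_url) > MAX_LEN:
        raise ValueError("imageUrl exceeds 1000 characters")
    parsed = urlparse(image_url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError("imageUrl must be an http(s) URL")
    # Look at the path only, so query strings do not hide the extension.
    ext = os.path.splitext(parsed.path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported format: {ext or '(none)'}")
    params = {"imageUrl": image_url}
    # Omit requirement when not given so the service default applies.
    if requirement is not None:
        if len(requirement) > MAX_LEN:
            raise ValueError("requirement exceeds 1000 characters")
        params["requirement"] = requirement
    return params
```

URLs that serve a supported image without a file extension would be rejected by this check, so treat it as a convenience filter rather than a rule of the API.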

Tips for Writing the requirement Parameter

Be specific: Instead of "analyze this image", say "List all products visible on the shelf and estimate their category."
State the goal: If you need text extraction, say "Extract all visible text from the image." If you need object identification, say "Identify the main objects and their colors."
Provide context when helpful: For product images, mention "This is an e-commerce product listing image" so the model can tailor its analysis.
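
These tips amount to "optional context first, then one specific goal". A small sketch of that composition follows; `compose_requirement` is a hypothetical helper, not part of the skill's scripts.

```python
def compose_requirement(goal, context=None, max_len=1000):
    """Join optional context and a specific goal into one requirement string.

    Hypothetical helper for illustration; max_len mirrors the documented
    1000-character limit on the requirement parameter.
    """
    goal = goal.strip()
    if not goal:
        raise ValueError("goal must not be empty")
    parts = []
    if context and context.strip():
        parts.append(context.strip())
    parts.append(goal)
    text = " ".join(parts)
    if len(text) > max_len:
        raise ValueError(f"requirement exceeds {max_len} characters")
    return text
```

For example, `compose_requirement("Extract all visible text from the image.", "This is an e-commerce product listing image.")` yields the context sentence followed by the goal sentence.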

Local Image Upload

This tool requires a publicly accessible image URL. If the user provides a local image file path (e.g., C:\Users\...\photo.png, /home/.../image.jpg), you must upload it first to obtain a public URL.

Run the upload script:

python scripts/upload_image.py /path/to/local/image.png

The script returns a public URL (valid for 24 hours) that can be used as the imageUrl parameter.
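
The decision between "use the URL as-is" and "upload first" can be sketched as below. The function names are hypothetical, and the sketch assumes the upload script prints the public URL as the last non-empty line of its standard output; verify that against the script before relying on it.

```python
import subprocess
import sys
from urllib.parse import urlparse


def is_remote_url(value):
    """Return True for http(s) URLs, False for anything else (e.g. local paths)."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def resolve_image_url(value, upload_script="scripts/upload_image.py"):
    """Return a public URL, running the upload script first for local paths.

    Assumption: the script prints the public URL as the last non-empty
    line of stdout. Its real output format is not described here.
    """
    if is_remote_url(value):
        return value
    completed = subprocess.run(
        [sys.executable, upload_script, value],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the upload fails
    )
    lines = [ln.strip() for ln in completed.stdout.splitlines() if ln.strip()]
    if not lines:
        raise RuntimeError("upload script produced no output")
    return lines[-1]
```

Because the returned URL is valid for 24 hours, it is safer to upload again than to reuse a URL saved from an earlier session.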

Usage Examples

1. General Image Description

User says: "What is in this picture?"
Set imageUrl to the provided URL, leave requirement as default.

2. Product Image Analysis

User says: "Analyze this Amazon product image and list the key selling points shown."
Set requirement to: "This is an Amazon product listing image. Identify the product, key features, and selling points visible in the image."

3. Text Extraction from an Image

User says: "Read the text in this screenshot."
Set requirement to: "Extract all visible text from this image, preserving layout where possible."

4. A+ Page Image Review

User says: "Describe what this A+ content image communicates."
Set requirement to: "This is an Amazon A+ product description image. Describe the visual content, key messaging, and branding elements."

5. Comparison / Detail Inspection

User says: "What differences can you spot between the product and its packaging?"
Set requirement to: "Identify and describe any differences between the product and its packaging shown in the image."
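
The five examples above can be kept as a small lookup table. In this sketch the requirement strings are copied from the examples, while the scenario keys and the `params_for` helper are hypothetical names chosen for illustration.

```python
# Requirement strings come from the usage examples; the keys are
# hypothetical labels for this sketch.
REQUIREMENTS = {
    "general": None,  # omit requirement so the service default applies
    "product": (
        "This is an Amazon product listing image. Identify the product, "
        "key features, and selling points visible in the image."
    ),
    "ocr": "Extract all visible text from this image, preserving layout where possible.",
    "a_plus": (
        "This is an Amazon A+ product description image. Describe the "
        "visual content, key messaging, and branding elements."
    ),
    "comparison": (
        "Identify and describe any differences between the product and "
        "its packaging shown in the image."
    ),
}


def params_for(scenario, image_url):
    """Build the params dict for one of the example scenarios."""
    requirement = REQUIREMENTS[scenario]
    params = {"imageUrl": image_url}
    if requirement is not None:
        params["requirement"] = requirement
    return params
```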

API Usage

This tool calls the LinkFox tool gateway API. See references/api.md for calling conventions, request parameters, and response structure. You can also execute scripts/multimodal_recognize_image.py directly to run queries.

Display Rules

Show the analysis result clearly: Present the returned text analysis in a readable format. Use bullet points or paragraphs as appropriate for the content.
No fabrication: Only relay information that the API actually returned. Do not add visual details that were not in the response.
Format support: If the image URL is invalid or the format is unsupported, explain the limitation and list the supported formats (JPG, JPEG, PNG, GIF, WebP, BMP).
Error handling: When the API returns an error status, explain the issue based on the response and suggest corrective actions (e.g., check that the URL is publicly accessible, verify the image format).
Token usage: If the user asks about cost, you may mention the costToken value from the response.
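
A rough sketch of how these rules might be applied to a response follows. Only `costToken` is named in this skill; the other keys (`success`, `result`, `message`) and the `render` helper are assumptions made for illustration, and references/api.md remains the authority on the real response structure.

```python
SUPPORTED_FORMATS = "JPG, JPEG, PNG, GIF, WebP, BMP"


def render(response, asked_about_cost=False):
    """Turn a response dict into user-facing text without adding details.

    Assumed shape (hypothetical): {"success": bool, "result": str,
    "message": str, "costToken": number}. Only costToken is named in the
    skill text.
    """
    if not response.get("success"):
        reason = response.get("message") or "unknown error"
        return (
            f"The image could not be analyzed: {reason}\n"
            "Check that the URL is publicly accessible and that the image "
            f"is one of the supported formats ({SUPPORTED_FORMATS})."
        )
    # Relay the returned text as-is: no invented visual details.
    text = response.get("result", "")
    # Mention cost only when the user asked about it.
    if asked_about_cost and "costToken" in response:
        text += f"\n\nToken usage: {response['costToken']}"
    return text
```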

User Expression & Scenario Quick Reference

Applicable -- Image analysis tasks:

User Says	Scenario
"What's in this image/picture/photo"	General image description
"Analyze this product image"	Product visual analysis
"Read the text in this image"	OCR / text extraction
"Describe the A+ page images"	E-commerce content review
"What does this screenshot show"	Screenshot interpretation
"Identify objects in this photo"	Object detection / listing

Not applicable -- Needs beyond image recognition:

Generating or editing images
Video analysis
Analyzing images directly from local file paths (only URLs are supported; upload a local file first to obtain a public URL)
Image search or reverse image lookup

Feedback:

Auto-detect and report feedback via the Feedback API when any of the following apply:

The functionality or purpose described in this skill does not match actual behavior
The skill's results do not match the user's intent
The user expresses dissatisfaction or praise about this skill
Anything you believe could be improved

Call the feedback API as specified in references/api.md. Do not interrupt the user's flow.

Handling Large Responses

To avoid overflowing the agent context, persist the response to disk and extract only the fields you need:

python scripts/response_io.py run --script scripts/multimodal_recognize_image.py --out-dir <DIR> '<params>'
python scripts/response_io.py read <file> --fields "<paths>"   # or --path "<JMESPath>"

Pick --out-dir outside any git working tree (e.g. /tmp/... on Unix, %TEMP%/... on Windows). Persisted responses may contain PII, pricing, or auth-sensitive data — do not commit them. Files are not auto-deleted; clean up when the task is done.

This skill exposes multiple entry scripts: multimodal_recognize_image.py, upload_image.py. Pass --script scripts/<name>.py to choose the one you need.

run writes the full response to a file and emits only a schema preview + file path. read projects specific fields, with --limit/--offset for slicing and --format json|jsonl|csv|table for output.
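
The two commands can also be assembled from Python. This sketch only builds the argument lists, using the subcommands and flags shown above; it assumes that '<params>' is a JSON-encoded object and that --fields accepts a single string of paths, neither of which is confirmed here. The helper names are hypothetical.

```python
import json
import sys


def run_command(params, out_dir, script="scripts/multimodal_recognize_image.py"):
    """Build argv for the `run` subcommand.

    Assumption: the trailing '<params>' argument is a JSON string.
    """
    return [
        sys.executable, "scripts/response_io.py", "run",
        "--script", script,
        "--out-dir", out_dir,
        json.dumps(params),
    ]


def read_command(path, fields):
    """Build argv for the `read` subcommand with a --fields projection.

    Assumption: `fields` is passed through unchanged as one string.
    """
    return [
        sys.executable, "scripts/response_io.py", "read",
        path, "--fields", fields,
    ]
```

Either list can be passed to `subprocess.run(..., check=True)`. A directory from `tempfile.mkdtemp()` is one way to get an --out-dir in the system temp location, which is normally outside any git working tree; remove it when the task is done, since files are not auto-deleted.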

When to prefer this pattern — apply your judgment based on the response characteristics, e.g.:

High field count per record, or fields you don't need
Batch/paginated results (multiple items per call)
Long-text fields (descriptions, reviews, HTML, time series)
Output reused across later steps rather than consumed immediately

For small, single-use responses, calling the main script directly is fine.

⚠️ The preview is a truncated schema + sample, not the full data. Any field-level decision must read from the persisted file via read.

For more high-quality, professional cross-border e-commerce skills, see LinkFox Skills.
