| name | blog-collector |
| description | Collect AI Agent Skills blog articles from WeChat, Zhihu, Medium and other platforms into the awesome-skills repository. Use when: (1) adding new blog articles to blogs/ directory, (2) collecting skill-related articles, (3) expanding the knowledge base. Triggers on: "collect blog", "add article", "scrape article", "blog collector", "collect article". The collected article goes into blogs/YYYY-MM-DD-article-slug.md with images in blogs/images/. |
Blog Collector
Collects blog articles from various platforms and saves them to the blogs/ directory.
Workflow
1. Receive Article URL
echo "Article URL to collect:"
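Before extraction it can help to check which platform a URL belongs to, since WeChat articles take a different path in step 2. The following is a minimal sketch; the function name `detect_platform` and the labels it prints are hypothetical and not part of the skill itself.

```shell
# Hypothetical helper: map an article URL to a platform label.
# The labels (wechat, zhihu, medium, juejin, other) are illustrative names.
detect_platform() {
  case "$1" in
    https://mp.weixin.qq.com/*) echo "wechat" ;;
    https://zhihu.com/*|https://*.zhihu.com/*) echo "zhihu" ;;
    https://medium.com/*|https://*.medium.com/*) echo "medium" ;;
    https://juejin.cn/*) echo "juejin" ;;
    http://*|https://*) echo "other" ;;
    # Anything that is not an http(s) URL is rejected with a non-zero status.
    *) echo "invalid"; return 1 ;;
  esac
}
```

For example, `detect_platform "https://mp.weixin.qq.com/s/xxxxx"` prints `wechat`, which can then be used to choose between the browser route and the Jina Reader route.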
2. Extract Content
For WeChat Articles:
openclaw browser open targetUrl:"https://mp.weixin.qq.com/s/xxxxx"
curl -s "https://r.jina.ai/https://mp.weixin.qq.com/s/xxxxx"
For Other Platforms:
curl -s "https://r.jina.ai/https://zhihu.com/xxxxx"
curl -s "https://r.jina.ai/https://medium.com/xxxxx"
curl -s "https://r.jina.ai/https://juejin.cn/xxxxx"
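The commands above all follow one pattern: the article URL is appended to `https://r.jina.ai/`. A small wrapper makes that explicit. This is a sketch only; `jina_url` and `fetch_article` are hypothetical names, and the `-f` flag is an added assumption so that HTTP errors do not get saved as article content.

```shell
# Build the Jina Reader URL by prefixing the reader host,
# the same pattern used by the curl commands in this step.
jina_url() {
  printf 'https://r.jina.ai/%s\n' "$1"
}

# Hypothetical helper: fetch an article through Jina Reader into a file.
# $1 = article URL, $2 = output file.
# -f fails on HTTP errors, -sS stays quiet but still reports errors,
# -L follows redirects.
fetch_article() {
  curl -fsSL "$(jina_url "$1")" -o "$2"
}
```

Usage would look like `fetch_article "https://juejin.cn/xxxxx" /tmp/article.md`, after which the file can be reviewed before it is turned into the final Markdown.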
3. Extract and Download Images
openclaw browser act targetId:<tab_id> request:'{"kind": "evaluate", "fn": "(() => { return Array.from(document.querySelectorAll(\"img\")).filter(img => img.src && img.src.includes(\"mmbiz\")).map(img => ({src: img.src, alt: img.alt || \"\", width: img.naturalWidth})).slice(0, 20); })()"}'
mkdir -p blogs/images/YYYY-MM-DD-article-slug/
curl -s -L -o "blogs/images/YYYY-MM-DD-article-slug/cover.jpg" "<cover_url>"
4. Generate Markdown
cat > "blogs/YYYY-MM-DD-article-slug.md" << 'EOF'
> **Author**: Author Name | **Source**: Source Name | **Published**: YYYY-MM-DD
> **Original**: https://original-url.com
---

---
[Content here...]
- [Original link](https://original-url.com)
EOF
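The `YYYY-MM-DD-article-slug.md` filename used above can be derived from the publication date and the title. The sketch below assumes an ASCII title; `slugify` and `article_path` are hypothetical helpers, and a title written entirely in Chinese would produce an empty slug, so it would need a hand-written one.

```shell
# Hypothetical helper: lowercase the title and collapse every run of
# characters other than a-z and 0-9 into a single hyphen.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

# Build the target path: $1 = date (YYYY-MM-DD), $2 = article title.
article_path() {
  printf 'blogs/%s-%s.md\n' "$1" "$(slugify "$2")"
}
```

The resulting path can be passed to the `cat >` command in this step in place of the literal placeholder.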
5. Determine Article Type & Destination
| Type | Destination |
|---|---|
| Paper / academic | papers/ |
| Technical blog | blogs/ |
| Industry analysis | blogs/analysis/ |
| Tool introduction | blogs/ |
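The table above can be expressed as a lookup so that the destination is chosen consistently. This is a sketch under assumptions: the short type keys (`paper`, `blog`, `analysis`, `tool`) are hypothetical names introduced here, one per table row.

```shell
# Hypothetical helper: map an article type key to its destination directory.
# Keys correspond to the four rows of the type table, in order:
# paper, blog, analysis, tool.
destination_for() {
  case "$1" in
    paper)     echo "papers/" ;;
    blog|tool) echo "blogs/" ;;
    analysis)  echo "blogs/analysis/" ;;
    # Unknown types are reported on stderr and rejected.
    *) echo "unknown article type: $1" >&2; return 1 ;;
  esac
}
```

For instance, `mkdir -p "$(destination_for analysis)"` would create `blogs/analysis/` before the article is written there.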
6. Push to GitHub
cd /Volumes/waku/github-维护/awesome/awesome-skills-repos
git add blogs/
git add blogs/images/
git commit -m "Add: <article title> (YYYY-MM-DD)"
git push
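The commit step can be wrapped so that nothing is committed or pushed when no files changed. The following is a sketch, not the author's script; `commit_message` and `publish_article` are hypothetical names, and it assumes the current directory is already the repository root.

```shell
# Format the commit message in the "Add: <article title> (YYYY-MM-DD)" style.
# $1 = article title, $2 = date.
commit_message() {
  printf 'Add: %s (%s)\n' "$1" "$2"
}

# Hypothetical helper: stage blogs/, then commit and push only if
# something is actually staged.
publish_article() {
  git add blogs/ || return 1
  # git diff --cached --quiet exits 0 when the index has no changes.
  if git diff --cached --quiet; then
    echo "nothing to commit" >&2
    return 0
  fi
  git commit -m "$(commit_message "$1" "$2")" && git push
}
```

Staging `blogs/` also covers `blogs/images/`, so a single `git add` is enough in this sketch.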
Source Priority
- WeChat (mp.weixin.qq.com) - Most common for Chinese AI/Agent content
- Medium - High quality English articles
- Zhihu (zhihu.com) - Technical discussions
- 掘金 (juejin.cn) - Developer articles
- GitHub Blog - Official announcements
Image Filtering Rules
- ❌ Filter: Logo images (imgIndex=0 in WeChat URLs)
- ❌ Filter: Watermark/ads
- ✅ Keep: Technical diagrams, screenshots, charts
- ✅ Keep: Cover images
- Naming: <platform>-<topic>-<number>.jpg
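The logo rule and the naming rule above can be sketched as small shell helpers. Assumptions are labeled: `is_logo_url`, `filter_images`, and `image_name` are hypothetical names, and the check only recognizes `imgIndex=0` when it appears as a query parameter at the end of the URL or directly before `&`. Watermarks and ads still need manual review.

```shell
# Return 0 when the URL carries imgIndex=0 (treated as a logo image).
# imgIndex=10 and similar values must not match, so the parameter has to
# end the URL or be followed by "&".
is_logo_url() {
  case "$1" in
    *imgIndex=0|*imgIndex=0\&*) return 0 ;;
    *) return 1 ;;
  esac
}

# Read image URLs from stdin, one per line, and print those to keep.
filter_images() {
  while IFS= read -r url; do
    is_logo_url "$url" || printf '%s\n' "$url"
  done
}

# Build a file name as <platform>-<topic>-<number>.jpg.
image_name() {
  printf '%s-%s-%s.jpg\n' "$1" "$2" "$3"
}
```

A list of URLs collected in step 3 could be piped through `filter_images`, and each surviving URL downloaded to `blogs/images/YYYY-MM-DD-article-slug/` under the name produced by `image_name`.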
Known Blog Sources
| Source | URL Pattern | Quality |
|---|---|---|
| 沃垠AI | mp.weixin.qq.com (search) | ⭐⭐⭐⭐⭐ |
| 宝玉 | mp.weixin.qq.com (search) | ⭐⭐⭐⭐⭐ |
| 冷逸 | mp.weixin.qq.com (search) | ⭐⭐⭐⭐ |
| Agent 挖掘机 | mp.weixin.qq.com (search) | ⭐⭐⭐⭐ |
| Medium (unicodeveloper) | medium.com/@xxx | ⭐⭐⭐⭐⭐ |
Notes
- Always filter WeChat images (many are ads/logos)
- Use Jina Reader for clean content extraction
- Name files: YYYY-MM-DD-article-slug.md
- Include original URL for reference
- Target: 3-5 articles per month