Name: Medium Sub
Author: beatai-org

updated:May 28, 2026 at 07:52

package.json

"author": "beatai-org"

"repository": "beatai-org/beatai"

name	medium-sub
description	Use this skill when the user wants to fetch Medium's "Recommended" article list for a configured set of tags. Trigger on requests like "拉取 Medium 推荐文章", "fetch medium recommended for tags", "每天抓一次 medium tag 推荐", or any request that mentions Medium tag recommended pages. Pure fetcher — emits a single JSON document to stdout. Does NOT write files, apply per-day limits, or do cross-day dedup; those concerns belong to the caller (e.g. material-pipeline). Reuses the chrome-profile from the medium-fetch skill (no separate login required).
version	2.0.0

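Medium Tag Recommended Fetcher (pure fetcher)

Fetches the candidate article list from https://medium.com/tag/<tag>/recommended, deduplicates it across tags, groups the results by tag, and writes a single JSON object to stdout.

This skill is a pure fetcher: its only job is to pull the article list.

When to use

You have a set of Medium tags (e.g. ai, llm, chatgpt) and want an array of recommended candidates for each tag
An upstream orchestrator (material-pipeline, a custom pipeline, or an ad-hoc script) takes the stdout JSON and handles deduplication, limits, and persistence itself

When not to use

Fetching the body of a single article → use the medium-fetch skill
Fetching RSS or latest instead of recommended → this skill only fetches /recommended
Archiving by date / writing files → this skill no longer writes any files; to persist, use shell redirection (> file.json) or have the caller write the file
Applying a daily limit / cross-day deduplication → that is the caller's responsibility; this skill only applies the --max-per-tag candidate-pool cap to each tag (defensive)

Output contract

stdout: a single JSON object. Nothing else is ever written to stdout.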

{
  "fetchedAt": "2026-05-13",                       // hint only, not authoritative
  "generator": "medium-sub@2.0.0",
  "source":    "https://medium.com/tag/<tag>/recommended",
  "stats": {
    "totalArticles":  14,
    "uniqueArticles": 14,
    "tagsFetched":    3,
    "tagsFailed":     0
  },
  "groups": [
    {
      "tag": "ai",
      "articles": [
        {
          "title":       "The Agent Harness…",
          "url":         "https://medium.com/@MongoDB/the-agent-harness-…-bce68414ccfd",
          "publishedAt": "2026-05-12",
          "author":      "MongoDB"
        }
      ]
    }
  ]
}

stderr: all progress logs ("📥 ai → ...", "✓ ai: N 篇", "✗ cloudflare-blocked", Chrome startup messages, etc.).
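A caller may want to check the document against this contract before using it. The following is a minimal sketch under assumptions: parseFetcherOutput is a hypothetical helper name that is not part of medium-sub, and it assumes the real output is plain JSON in which the // annotation shown in the sample does not appear.

```javascript
// Caller-side sanity check for the stdout document (illustrative only).
// parseFetcherOutput is a hypothetical name, not something medium-sub ships.
function parseFetcherOutput(stdoutText) {
  // JSON.parse throws if stray log lines leaked into stdout.
  const doc = JSON.parse(stdoutText);
  if (!doc || !Array.isArray(doc.groups)) {
    throw new Error('medium-sub output: "groups" must be an array');
  }
  for (const group of doc.groups) {
    if (typeof group.tag !== 'string' || !Array.isArray(group.articles)) {
      throw new Error('medium-sub output: malformed group');
    }
    for (const article of group.articles) {
      // The four per-article fields named in the contract.
      for (const field of ['title', 'url', 'publishedAt', 'author']) {
        if (typeof article[field] !== 'string') {
          throw new Error(`medium-sub output: article.${field} must be a string`);
        }
      }
    }
  }
  return doc;
}
```

Failing fast here turns a polluted stdout or a schema drift into an explicit error instead of a silent empty selection further down the pipeline.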

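Field notes:

url: the bare URL (path only), with tracking parameters such as ?source=tag_recommended_stories_page… stripped
publishedAt: the parsed date as YYYY-MM-DD; "2d ago" → fetch date minus 2 days, "Mar 12, 2024" → parsed directly; falls back to fetchedAt if parsing fails
author: the author byline on the card; an empty string if parsing fails
Within each tag, articles are first deduplicated by url; across tags, the first tag in which an article appears keeps it (first-seen wins) and later tags skip it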
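The publishedAt rules above can be sketched roughly as follows. This is an illustration only: parsePublishedAt and toISODate are hypothetical names, only the two date shapes quoted above are handled, and the script's actual parseDate may recognize more formats or differ in detail.

```javascript
// Illustrative sketch of the publishedAt rules; not the script's real parseDate.
const MONTHS = ['jan', 'feb', 'mar', 'apr', 'may', 'jun',
  'jul', 'aug', 'sep', 'oct', 'nov', 'dec'];

function toISODate(date) {
  return date.toISOString().slice(0, 10); // YYYY-MM-DD
}

// text: date label from the article card; fetchedAt: YYYY-MM-DD fetch date.
function parsePublishedAt(text, fetchedAt) {
  const label = String(text ?? '').trim();

  // Relative form: "2d ago" → fetch date minus 2 days.
  const relative = /^(\d+)d ago$/i.exec(label);
  if (relative) {
    const base = new Date(`${fetchedAt}T00:00:00Z`);
    base.setUTCDate(base.getUTCDate() - Number(relative[1]));
    return toISODate(base);
  }

  // Absolute form: "Mar 12, 2024".
  const absolute = /^([A-Za-z]{3}) (\d{1,2}), (\d{4})$/.exec(label);
  if (absolute) {
    const month = MONTHS.indexOf(absolute[1].toLowerCase());
    if (month >= 0) {
      const utc = Date.UTC(Number(absolute[3]), month, Number(absolute[2]));
      return toISODate(new Date(utc));
    }
  }

  // Anything unrecognized falls back to the fetch date.
  return fetchedAt;
}
```

Working in UTC throughout keeps the result independent of the machine's time zone, which matters when the fetch runs near midnight.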

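Prerequisites

Google Chrome is installed on the system
Node.js ≥ 18
The sibling skill medium-fetch has logged in at least once (i.e. .claude/skills/medium-fetch/scripts/chrome-profile/ exists). The recommended list page itself does not require a membership, but a real Chrome is still needed to pass Cloudflare's checks
Run npm install before using this skill for the first time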

CLI

node fetch-list.js --tags ai,llm                    # required
node fetch-list.js --tags ai --max-per-tag 40       # candidate-pool cap, default 40
node fetch-list.js --tags ai > /tmp/sub-list.json   # persist via shell redirection

There are only two flags:

flag	Meaning	Default
`--tags ai,llm`	List of tag slugs to fetch (comma-separated)	required
`--max-per-tag N`	Candidate-pool cap per tag (keeps stdout from growing too large)	40

Environment variables:

CHROME_PROFILE_DIR: overrides the default chrome-profile path (default: ../../medium-fetch/scripts/chrome-profile)
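For readers who want to see how such a two-flag interface might be parsed, here is a minimal sketch using Node's built-in util.parseArgs (available since Node 18.3). It is not the code in fetch-list.js: parseCliArgs and resolveProfileDir are hypothetical names, and resolving the default profile path relative to the scripts directory is an assumption.

```javascript
const path = require('node:path');
const { parseArgs } = require('node:util');

// Hypothetical parser for the two documented flags. parseArgs is strict by
// default, so removed v1.x flags such as --limit are rejected with an error.
function parseCliArgs(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      tags: { type: 'string' },
      'max-per-tag': { type: 'string' },
    },
  });

  const tags = (values.tags ?? '')
    .split(',')
    .map((tag) => tag.trim())
    .filter(Boolean);
  if (tags.length === 0) {
    throw new Error('--tags is required, e.g. --tags ai,llm');
  }

  const maxPerTag = Number(values['max-per-tag'] ?? '40'); // documented default
  if (!Number.isInteger(maxPerTag) || maxPerTag <= 0) {
    throw new Error('--max-per-tag must be a positive integer');
  }
  return { tags, maxPerTag };
}

// Assumption: the default path is resolved relative to the scripts directory.
function resolveProfileDir(env, scriptDir) {
  return env.CHROME_PROFILE_DIR
    || path.resolve(scriptDir, '../../medium-fetch/scripts/chrome-profile');
}
```

A script would typically call these as parseCliArgs(process.argv.slice(2)) and resolveProfileDir(process.env, __dirname).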

Workflow

One-time setup

cd /Users/sunfei/development/beatai/.claude/skills/medium-sub/scripts
npm install

Confirm that medium-fetch has logged in before:

ls /Users/sunfei/development/beatai/.claude/skills/medium-fetch/scripts/chrome-profile/

If the directory does not exist, go to medium-fetch and run node login.js once first.

Fetching (standalone)

node fetch-list.js --tags ai,llm,chatgpt --max-per-tag 40 > sub-list.json

stderr prints Chrome startup and per-tag progress; stdout is clean JSON.

Fetching (orchestrated by material-pipeline)

material-pipeline/scripts/run.js spawns this script internally, captures the stdout JSON, performs cross-day deduplication and applies the limit in memory, and then writes sub-list/<date>.json itself. See material-pipeline/SKILL.md for details.
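The caller side of that handoff could look roughly like the sketch below. It is not the code in material-pipeline/scripts/run.js: runFetcher and selectNewArticles are hypothetical names, and seenUrls (a Set of URLs already used on earlier days) and limitPerTag are assumed inputs that the caller would supply.

```javascript
const { execFile } = require('node:child_process');
const { promisify } = require('node:util');

const execFileAsync = promisify(execFile);

// Spawns the fetcher with the current Node binary and parses its stdout.
// execFile buffers stderr separately, so progress logs never mix into the JSON.
async function runFetcher(scriptPath, tags, maxPerTag = 40) {
  const { stdout } = await execFileAsync(
    process.execPath,
    [scriptPath, '--tags', tags.join(','), '--max-per-tag', String(maxPerTag)],
    { maxBuffer: 16 * 1024 * 1024 }, // generous headroom for the JSON document
  );
  return JSON.parse(stdout);
}

// Caller-side policy: drop URLs seen on earlier days, then cap each tag.
// seenUrls and limitPerTag are assumptions, not fields of the fetcher output.
function selectNewArticles(doc, seenUrls, limitPerTag) {
  return doc.groups.map((group) => ({
    tag: group.tag,
    articles: group.articles
      .filter((article) => !seenUrls.has(article.url))
      .slice(0, limitPerTag),
  }));
}
```

Keeping the filtering in a pure function like selectNewArticles means the dedup and limit policy can be tested without launching Chrome.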

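Deduplication rules (cross-tag, done only inside the fetcher)

Dedup key = <hostname.toLowerCase()><pathname>, with the ?query, the #hash, and any trailing / removed.

Example: the following two URLs are treated as the same article:

https://medium.com/@MongoDB/the-agent-harness-bce68414ccfd?source=tag_recommended_stories_page------ai---0-…
https://medium.com/@MongoDB/the-agent-harness-bce68414ccfd?source=tag_recommended_stories_page------llm---2-…

The first tag in which the url appears keeps it and later tags skip it, which guarantees that each article belongs to exactly one tag. This is a matter of semantic correctness inside the fetcher and is orthogonal to cross-day dedup (which the caller performs).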
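The key and the first-seen-wins rule can be sketched as below. The name canonicalKey appears in the troubleshooting table, but this body is an assumed reconstruction from the rule stated above, and dedupeAcrossTags is a hypothetical helper name. The URLs in the usage note are made up for illustration.

```javascript
// Dedup key: lowercased hostname + pathname, without query, hash, or trailing "/".
// Assumed reconstruction of the rule above, not the script's actual code.
function canonicalKey(rawUrl) {
  const { hostname, pathname } = new URL(rawUrl);
  return hostname.toLowerCase() + pathname.replace(/\/+$/, '');
}

// First-seen wins: groups are processed in order, and an article is kept only
// by the first tag that contains its key. Later occurrences are skipped.
function dedupeAcrossTags(groups) {
  const seen = new Set();
  return groups.map((group) => ({
    tag: group.tag,
    articles: group.articles.filter((article) => {
      const key = canonicalKey(article.url);
      if (seen.has(key)) return false;
      seen.add(key);
      return true;
    }),
  }));
}
```

With this sketch, https://medium.com/@someone/post-abc123/?source=x and https://medium.com/@someone/post-abc123#top map to the same key, so whichever tag lists the article first keeps it.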

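Troubleshooting

Symptom	Cause	Action
`✗ 未找到 chrome-profile` (chrome-profile not found)	medium-fetch has never logged in	Run `node login.js` in medium-fetch
A tag reports `cloudflare-blocked`	The profile has no clearance cookie	Re-run `node login.js` in medium-fetch and pass the Cloudflare check manually
A tag returns 0 articles	The tag does not exist / the `<article>` selector timed out	Check the tag slug spelling; look at the stderr warning
Every `publishedAt` is the fetch date	The date text is not in a recognized format (non-English text, new format)	The current fallback is today; extend `parseDate` if the dates stay inaccurate
The same article appears under multiple tags	(Should not happen) a bug in the cross-tag dedup logic	Check whether the `canonicalKey` output is consistent
stdout is not valid JSON	A `console.log` wrote to stdout by mistake	All progress logs in this script go through `console.error`; any new logging must strictly go to stderr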

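What this skill does not do

Does not download article bodies: it only fetches the list (title / link / date / author). Use medium-fetch for downloads
Does not translate or write commentary: other skills do that when they consume this JSON
Does not fetch latest or apply custom sorting: /recommended only
Does not write files: since v2.0.0 the output goes to stdout; to persist, use shell redirection or have the caller write the file
Does not apply a daily limit: there is only the --max-per-tag candidate-pool cap; the caller slices the list itself
Does not do cross-day dedup: the caller scans its historical output directory and filters in memory
Does not read today's date or know the archive path: the fetchedAt field is a hint, and the caller is free to use its own date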

File list

medium-sub/
├── SKILL.md                # this document
└── scripts/
    ├── package.json        # depends only on playwright
    ├── fetch-list.js       # main script (pure fetcher, stdout JSON)
    └── .gitignore          # ignores node_modules / package-lock.json

Relationship to other skills

                  medium-fetch              medium-sub  (this skill)
                  ─────────────             ────────────────────
Login state       login.js (produces)  →    reuses chrome-profile (read-only)
Input             single article URL        --tags + --max-per-tag
Output            single md + assets        stdout JSON (candidate arrays)
Location          <RAW_DIR>/<slug>/         (no file output; the caller writes:
                                            material-pipeline writes to
                                            .claude/skills/material-pipeline/
                                            materials/sub-list/<date>.json)

Typical downstream flow: the caller spawns this script and captures stdout → parses the JSON → filters/aggregates in memory → selects a few articles → calls medium-fetch to download the bodies → calls translate, and so on.

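v2.0.0 breaking changes

Compared with v1.x:

Removed flags: --date / --out / --limit / --exclude-slugs
Removed: the logic that wrote <OUT_DIR>/<date>.json, and the compatibility read of scripts/config.json
Added: --max-per-tag (default 40)
Output changed from writing a file to stdout JSON
All former console.log(...) progress logs now go to stderr

The only known caller, material-pipeline/scripts/run.js, has been updated accordingly. Any external script that depends on the old contract can switch to node fetch-list.js --tags X > sub-list.json to get the old "write a file" behavior (although the file schema is not identical to the old sub-list: it lacks fields such as tagsConfig / limitPerTag that only matter to the caller).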
