technews-webscrape

Stars: 0

Forks: 0

Updated: May 20, 2026 at 22:58

Configure and use reputable web scraping for TK TechNews. Use when an agent needs Firecrawl MCP, Firecrawl-backed scraping, dynamic page extraction, web source troubleshooting, or guidance on when to use RSS, local scraping, Firecrawl, or YouTube transcripts for cited article generation.

Source

Tyler-R-Kendrick

Tyler-R-Kendrick/tk-technews

SKILL.md

name	technews-webscrape
description	Configure and use reputable web scraping for TK TechNews. Use when an agent needs Firecrawl MCP, Firecrawl-backed scraping, dynamic page extraction, web source troubleshooting, or guidance on when to use RSS, local scraping, Firecrawl, or YouTube transcripts for cited article generation.

TechNews Webscrape

Use this skill from the repository root when a web source needs higher-quality extraction than the default local scraper (fetch plus cheerio) provides.

Preferred Scraper

Use Firecrawl as the reputable external scraper. The repo has:

.mcp.json for MCP-aware agents, which starts the Firecrawl MCP server with npx -y firecrawl-mcp.
npm run scrape:firecrawl -- --url <url> for one-off extraction checks.
data/sources.json support for "scraper": "firecrawl" on web sources.
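
As a rough illustration of the last item, a web source entry that opts into Firecrawl might look like the sketch below. Only the "scraper": "firecrawl" pair comes from this skill; the `WebSource` type, its other fields (`name`, `type`, `url`), and the `usesFirecrawl` helper are assumptions made for the example, not the repo's actual schema.

```typescript
// Hypothetical shape of one entry in data/sources.json.
// Only the "scraper": "firecrawl" pair is documented by this skill;
// every other field name here is an assumption for illustration.
interface WebSource {
  name: string;            // assumed field
  type: "web";             // assumed field
  url: string;             // assumed field
  scraper?: "firecrawl";   // documented: routes this source through Firecrawl
}

const exampleSource: WebSource = {
  name: "Example Publisher",
  type: "web",
  url: "https://example.com/article",
  scraper: "firecrawl",
};

// Returns true when the entry opts into Firecrawl; entries without the
// key are assumed to fall back to the default local scraper.
function usesFirecrawl(source: WebSource): boolean {
  return source.scraper === "firecrawl";
}
```

Keeping the key optional mirrors the text above, where the local scraper is the default and Firecrawl is an opt-in per source.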

Workflow

Confirm FIRECRAWL_API_KEY is set before expecting MCP or Firecrawl scraping to work.
For one URL, run npm run scrape:firecrawl -- --url "https://example.com/article".
For configured sources, add "scraper": "firecrawl" to the source and run npm run ingest.
Inspect data/summaries/latest.json and data/raw/firecrawl/ for extraction quality.
Run npm run extract:knowledge, npm run draft -- --topic "...", and npm run validate:citations.
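
The workflow above can be sketched as a small pre-flight helper that checks for the API key and then lists the commands in the documented order. This is a minimal sketch, not part of the repo: `hasFirecrawlKey` and `planWorkflow` are hypothetical names, and stringing the steps into one sequence is an assumption. The command strings are copied from the steps above, including the elided "..." topic.

```typescript
// Step 1: true when FIRECRAWL_API_KEY is present and non-empty.
function hasFirecrawlKey(env: Record<string, string | undefined>): boolean {
  const key = env["FIRECRAWL_API_KEY"];
  return typeof key === "string" && key.trim().length > 0;
}

// Hypothetical helper: returns the workflow commands in the order listed
// above. The one-off scrape is included only when a URL is supplied.
function planWorkflow(
  env: Record<string, string | undefined>,
  oneOffUrl?: string,
): string[] {
  if (!hasFirecrawlKey(env)) {
    throw new Error("FIRECRAWL_API_KEY is not set; Firecrawl scraping will not work");
  }
  const commands: string[] = [];
  if (oneOffUrl !== undefined) {
    // Step 2: one-off extraction check for a single URL.
    commands.push(`npm run scrape:firecrawl -- --url "${oneOffUrl}"`);
  }
  // Step 3: ingest configured sources (after adding "scraper": "firecrawl").
  commands.push("npm run ingest");
  // Step 4 is a manual inspection of data/summaries/latest.json and
  // data/raw/firecrawl/, so it produces no command here.
  // Step 5: knowledge extraction, drafting, and citation validation.
  commands.push(
    "npm run extract:knowledge",
    'npm run draft -- --topic "..."',
    "npm run validate:citations",
  );
  return commands;
}
```

In a Node script this could be called as planWorkflow(process.env, "https://example.com/article") and the result printed for review before anything is run.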

Selection Rules

Prefer RSS when the publisher offers a stable feed.
Use local scraping for simple static pages.
Use Firecrawl for dynamic pages, pages with heavy chrome (navigation, ads, and other non-article markup), or pages where local extraction misses main content.
Use YouTube transcript ingestion for videos.
Do not use scraping to bypass paywalls, access controls, or publisher restrictions.
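
The rules above can be read as a simple decision function. The sketch below is one possible encoding, not the repo's implementation: the `SourceTraits` fields, the `Ingestion` labels, and the `chooseIngestion` name are all hypothetical, and the order in which the rules are checked is an assumption, since the rules themselves do not state a precedence.

```typescript
type Ingestion = "rss" | "local" | "firecrawl" | "youtube-transcript" | "skip";

// Hypothetical description of a source; none of these field names
// come from the repo.
interface SourceTraits {
  isVideo: boolean;
  hasStableFeed: boolean;
  isDynamic: boolean;
  heavyChrome: boolean;
  localMissesMainContent: boolean;
  restricted: boolean; // paywall, access control, or publisher restriction
}

// Applies the selection rules. Checking the restriction first is an
// assumption that reflects the final rule: scraping must not be used to
// bypass paywalls, access controls, or publisher restrictions.
function chooseIngestion(t: SourceTraits): Ingestion {
  if (t.restricted) return "skip";
  if (t.isVideo) return "youtube-transcript";
  if (t.hasStableFeed) return "rss";
  if (t.isDynamic || t.heavyChrome || t.localMissesMainContent) {
    return "firecrawl";
  }
  return "local"; // simple static page
}
```

For example, a static page with no feed and no extraction problems resolves to "local", while the same page resolves to "firecrawl" once local extraction is known to miss the main content.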
