| name | article-exporter |
| description | Export any web article to a local Obsidian-ready Markdown directory. Fetches page content via actionbook CLI, downloads images locally, rewrites image references to relative paths, and optionally translates the article using AI. Produces a self-contained folder with README.md, images/, and an index.md navigation file.
|
| when_to_use | Use when the user wants to save an online article, blog post, or social thread into Obsidian or a Markdown knowledge base. Also use when the user asks to archive, translate, or convert web content into clean Markdown with images downloaded locally.
|
| allowed-tools | ["Read","Write","Bash(actionbook:*)","Bash(curl:*)","Bash(obsidian-cli:*)"] |
| argument-hint | [output-dir] |
| arguments | ["output_dir"] |
Article Exporter - Export Articles to Obsidian
Version: 0.5.0 | Last Updated: 2026-03-13
You are an expert at web content archiving and Obsidian workflow automation.
Lessons from Failed Exports
These rules were extracted from real export failures. Each one prevents a specific class of error:
- Twitter/X needs AI reformatting —
fetch returns flat text because Twitter uses custom UI without semantic HTML. The AI reformatting step reconstructs headings, lists, and code blocks. See references/twitter-handling.md.
- Ask for output path first — users have different vault locations. Assuming a default creates files in the wrong place and wastes time moving them.
- Check actionbook version >= 0.9.1 — the
--wait-hint parameter was added in 0.9.1. Without it, dynamic content (SPAs, lazy-loaded pages) returns empty or partial results.
- Wait after navigation — use
--wait-hint heavy for Twitter, Medium, and other dynamic sites. Without it, the page hasn't finished rendering when content is extracted.
- Rate limit batch exports — 3-5s delay between requests prevents being flagged as a bot (ToS compliance).
Quick Reference
| Task | Command | Success Criteria |
|---|
| Check deps | actionbook --version | Shows version >= 0.9.1 |
| Fetch article | actionbook browser fetch <url> --wait-hint heavy | Returns plain text (AI reformats to Markdown in Step 1b) |
| Translate | AI session directly | README_CN.md created |
| Open in Obsidian | obsidian-cli open "path/index.md" | File opens in Obsidian |
Complete Export Workflow
Goal: Export web article to Obsidian directory with images and optional translation
Success criteria:
- Article directory created with README.md
- All images downloaded to images/
- index.md navigation file created
- Optional: README_CN.md translation
- Opened in Obsidian (if obsidian-cli available)
Step 1: Fetch Article Content
Execution: Direct (Bash)
actionbook browser fetch "$URL" --wait-hint heavy 2>/dev/null | \
sed '/^[[:space:]]*$/d;/^\x1b\[/d;/^INFO/d' > /tmp/article_raw.txt
Success criteria:
/tmp/article_raw.txt exists and size > 0 bytes
- Content contains the article's main text
The fetch command returns readability-extracted plain text (not Markdown).
AI reformatting in Step 1b is always needed to produce proper Markdown.
Rules:
- Use
--wait-hint heavy for Twitter, Medium, dynamic content
- Use
--wait-hint light for static blogs
2>/dev/null suppresses stderr logs
sed removes ANSI codes, INFO lines, empty lines
Twitter/X Special Handling
Twitter uses non-semantic HTML, so fetch output loses all structure (headings become flat text, code blocks disappear). If the URL contains x.com or twitter.com, pay extra attention to structure reconstruction in Step 1b. See references/twitter-handling.md.
Step 1b: AI Reformat to Markdown
Execution: Direct (AI session)
Read /tmp/article_raw.txt and convert the plain text into well-structured Markdown. Save the result to /tmp/article.md.
Reformatting rules:
- Reconstruct headings (
#, ##, ###) from the text structure
- Preserve original image URLs as
 references
- Format code blocks, lists, tables, and blockquotes
- Keep the original article title as the first
# H1 heading
Success criteria:
/tmp/article.md exists and starts with # <Title>
- Image URLs are preserved as Markdown image syntax
Step 2: Extract Metadata
Execution: Direct (Bash)
TITLE=$(grep -m 1 "^# " /tmp/article.md | sed 's/^# //')
IMAGE_URLS=$(grep -o '!\[[^]]*\]([^)]*)' /tmp/article.md | \
sed -E 's/!\[[^]]*\]\(([^)]*)\)/\1/' | \
grep -v '^data:')
Success criteria:
$TITLE is non-empty
$IMAGE_URLS count matches expected (use wc -l)
Step 3: Ask Output Directory
Execution: [human]
Human checkpoint: Confirm output location before creating files
Ask user: "Where should I save the exported article?"
Suggested paths:
~/Work/Write/Articles (default)
~/Documents/Obsidian/Articles
~/Notes/Imported
- (or custom path from
$output_dir argument)
Success criteria: User confirms output directory
Artifacts: $OUTPUT_DIR variable set
Step 4: Create Directory Structure
Execution: Direct (Bash)
OUTPUT_DIR="${output_dir:-$USER_CONFIRMED_PATH}"
SAFE_TITLE=$(echo "$TITLE" | sed 's/[/:*?"<>|]//g' | cut -c1-100 | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
ARTICLE_DIR="$OUTPUT_DIR/$SAFE_TITLE"
mkdir -p "$ARTICLE_DIR/images"
Success criteria:
- Directory
$ARTICLE_DIR exists
- Subdirectory
images/ exists
- Directory is writable
Rules:
- Remove special characters:
/ : * ? " < > |
- Limit title length to 100 characters
- Trim leading/trailing whitespace
Step 5: Download Images (Parallel if possible)
Execution: Direct (Bash)
counter=1
for url in $IMAGE_URLS; do
ext=$(echo "$url" | grep -oE '\.(jpg|jpeg|png|gif|webp|svg)' || echo ".jpg")
curl -L -s "$url" -o "$ARTICLE_DIR/images/image_${counter}${ext}"
if [ ! -s "$ARTICLE_DIR/images/image_${counter}${ext}" ]; then
curl -L -s "${url}?format=jpg&name=orig" -o "$ARTICLE_DIR/images/image_${counter}.jpg"
fi
counter=$((counter + 1))
done
Success criteria:
- All image files exist and size > 0 bytes
- File count matches
$IMAGE_URLS count
Rules:
- Use
curl -L to follow redirects
- Check file size after download
- Try alternative formats for Twitter images
Step 6: Update Image References
Execution: Direct (Bash)
counter=1
for url in $IMAGE_URLS; do
ext=$(echo "$url" | grep -oE '\.(jpg|jpeg|png|gif|webp|svg)' || echo ".jpg")
sed -i.bak "s|$url|./images/image_${counter}${ext}|g" /tmp/article.md
counter=$((counter + 1))
done
cp /tmp/article.md "$ARTICLE_DIR/README.md"
rm /tmp/article.md.bak
Success criteria:
README.md contains ./images/image_N.* references
- No remote URLs remain in image links
Step 7: AI Translation (Optional)
Execution: Direct (AI session)
Human checkpoint: Ask user: "Do you want to translate the article? (y/n)"
If yes:
- Read
$ARTICLE_DIR/README.md
- Translate using AI capabilities (no external API)
- Write to
$ARTICLE_DIR/README_CN.md (or other language code)
Translation Prompt Template:
Translate the following Markdown article to [LANGUAGE] while preserving:
- All Markdown formatting (headings, lists, code blocks, tables)
- Image references exactly as-is: 
- Links and URLs unchanged
- Code blocks and technical terms in original language
Only output the translated Markdown content.
---
[Paste README.md content]
Success criteria: Translation file exists and size ≈ original ± 20%
Supported languages: en, zh, es, fr, de, ja, ko
Step 8: Create Navigation Index
Execution: Direct (Bash)
case "$URL" in
*x.com*|*twitter.com*) SOURCE="X" ;;
*medium.com*) SOURCE="Medium" ;;
*dev.to*) SOURCE="Dev.to" ;;
*openai.com*) SOURCE="OpenAI Blog" ;;
*substack.com*) SOURCE="Substack" ;;
*github.com*) SOURCE="GitHub" ;;
*) SOURCE=$(echo "$URL" | sed 's|https\?://||' | cut -d/ -f1) ;;
esac
cat > "$ARTICLE_DIR/index.md" <<EOF
# $TITLE
> **Export Date**: $(date +%Y-%m-%d)
> **Original URL**: $URL
> **Source**: $SOURCE
## 📚 Language Versions
- 🇬🇧 **English**: [[README]]
- 🇨🇳 **Chinese**: [[README_CN]] <!-- if translated -->
## 📊 Metadata
| Property | Value |
|----------|-------|
| **Source** | $SOURCE |
| **Images** | $(ls images/ 2>/dev/null | wc -l) images |
| **Export Tool** | actionbook CLI |
| **Export Date** | $(date +%Y-%m-%d) |
---
**Exported using**: actionbook browser automation + AI assistant
EOF
Success criteria: index.md exists with metadata table
Step 9: Open in Obsidian
Execution: Direct (Bash)
if command -v obsidian-cli &> /dev/null; then
VAULT_ROOT="$OUTPUT_DIR"
REL_PATH=$(echo "$ARTICLE_DIR" | sed "s|$VAULT_ROOT/||")
obsidian-cli open "$REL_PATH/index.md"
echo "✓ Opened in Obsidian: $REL_PATH/index.md"
else
case "$(uname)" in
Darwin) open "$ARTICLE_DIR" ;;
Linux) xdg-open "$ARTICLE_DIR" ;;
CYGWIN*|MINGW*|MSYS*) start "$ARTICLE_DIR" ;;
esac
echo "⚠️ Install obsidian-cli for automatic opening: npm install -g obsidian-cli"
fi
Success criteria:
- File opens in Obsidian OR directory opens in file manager
- User sees success message
Step 10: Report Success
Execution: Direct (Output)
echo ""
echo "════════════════════════════════════════════"
echo "✓ Article exported successfully!"
echo ""
echo "📁 Location: $ARTICLE_DIR"
echo "📄 Files:"
echo " - README.md (original)"
[ -f "$ARTICLE_DIR/README_CN.md" ] && echo " - README_CN.md (translation)"
echo " - index.md (navigation)"
echo "🖼️ Images: $(ls images/ 2>/dev/null | wc -l) files"
echo "════════════════════════════════════════════"
Common Issues
| Issue | Cause | Solution |
|---|
| "actionbook: command not found" | CLI not installed | npm install -g @actionbookdev/cli@latest |
| "unknown flag: --wait-hint" | Version < 0.9.1 | Upgrade: npm install -g @actionbookdev/cli@latest |
| Twitter format broken | fetch loses structure | Use AI reformatting (see references/twitter-handling.md) |
| Images 0 bytes | URL expired | Try ?format=jpg&name=orig |
| obsidian-cli not found | Not installed | npm install -g obsidian-cli |
| Batch export blocked | Too fast, flagged as bot | Add 3-5s sleep between requests |
Detailed troubleshooting: See ./references/troubleshooting.md
Edge Cases Handled
- Long titles → Auto-truncate to 100 chars
- Special characters → Sanitized (
/ : * ? " < > | removed)
- No images → Steps 5-6 skip gracefully
- 0-byte images → Auto-retry with alternative formats
- Data URLs → Filtered out in Step 2
When Using This Skill
- Check dependencies first —
actionbook --version >= 0.9.1
- Test with one article — Verify before batch processing
- Twitter/X requires special handling — See references/twitter-handling.md
- Respect ToS — Personal use only, rate limit batch exports
References (Progressive Disclosure)
For detailed documentation, see:
./references/twitter-handling.md — Twitter/X special handling (AI reformatting)
./references/batch-export.md — Batch export with rate limiting
./references/troubleshooting.md — Detailed troubleshooting guide
./references/obsidian-setup.md — obsidian-cli setup and configuration
./references/supported-websites.md — Complete website compatibility list
Last Updated: 2026-03-13 | Version: 0.5.0