| name | archive-card-writer |
| description | Turn fragmented knowledge, rough notes, research snippets, interview takeaways, project learnings, paper excerpts, or partially formed ideas into structured archive cards for the Personal Blog archive system. Use this whenever the user asks to remember, 整理, 沉淀, 归档, 记到 archive, turn notes into cards, merge new knowledge into an existing archive note, or when they provide scattered points that should become a reusable archive card rather than a polished blog post. Also use this when new input should be matched against existing archive/blog content so similar notes are merged instead of duplicated. |
Archive Card Writer
This skill converts messy input into maintainable archive cards for the Personal Blog archive system.
The goal is not to over-polish everything into a blog article. The goal is to create or update a durable archive card that can keep growing.
Primary target
Write into:
/mnt/hermes-data/personal/projects/blog/src/content/archive/
This skill is used by the agent, but its content target is the Personal Blog archive.
Workflow
When triggered, do this sequence:
- Read the user's input carefully.
- If the input includes a website URL, shared conversation link, or a pointer to externally hosted content, fetch the actual content before summarizing or archiving it.
- For web pages and LLM conversation shares, prefer browser-use / browser-based retrieval first so you capture the rendered page rather than guessing from the title, snippet, or raw URL.
- If browser-based retrieval is blocked, fall back to simpler fetch methods, but explicitly note when the captured content is partial, login-gated, or inferred from metadata only.
- Treat URLs as references, not as sufficient content. Do not create an archive card from just a bare link unless the page content could not be retrieved and that limitation is made explicit.
- Decide whether the input should:
  - create a new archive card, or
  - merge into an existing archive card.
- Before writing, inspect existing archive cards for overlap in:
  - topic
  - tags
  - title similarity
  - repeated concepts
  - same project / same learning thread
- If an existing archive card is clearly the same thread, append or integrate the new content there instead of creating a duplicate.
- If several archive cards together are obviously feeding a future blog post, capture that relationship in metadata or in a short body note.
- Write concise, readable Chinese-first content.
- Keep technical terms and data in English when that is more natural or precise.
- Create a dedicated git branch for this card workflow. Use the card filename slug (without .md) as the branch name whenever practical.
- After writing into the blog project, run validation when practical:
  - bun run check, or
  - bun run build
  - optionally, bun run lint if the change touched app/page code rather than just content
  - Note: bun is installed at /home/ubuntu/.bun/bin/bun. If it is not in PATH in the current shell session, use the full path.
- Check Vercel deployment status using the Vercel CLI (already logged in):
  - Run vercel list to see the latest deployments for this project.
  - If any deployment shows Error status, investigate and fix the root cause (e.g., content schema violations, build failures).
  - Common fix: for Astro content collection schema errors like "title too long" or "description too long", shorten the frontmatter fields and re-validate with bun run check.
  - Do not proceed to merge until the deployment status is Ready or Production.
- Commit and push the branch.
- Open a PR to main.
- Merge the PR after verification, so each archive card can ship independently and multiple archive cards can proceed in parallel.
- Do not stop at “file created locally” unless the user explicitly asks you not to continue. The default finish line for this skill is: content written → validated → deployed → committed → pushed → PR opened → merged.
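The branch-naming and validation steps above can be sketched in a few lines. This is only an illustration: the helper names `branch_name` and `validation_commands` are hypothetical, and it picks `bun run check` where the workflow allows either `check` or `build`.

```python
import shutil
from pathlib import PurePosixPath

# Full path quoted in the workflow note; used only when bun is not on PATH.
BUN_FALLBACK = "/home/ubuntu/.bun/bin/bun"


def branch_name(card_filename: str) -> str:
    """Use the card filename without its .md extension as the git branch name."""
    return PurePosixPath(card_filename).stem


def validation_commands(touched_app_code: bool = False) -> list[list[str]]:
    """Build the validation commands to run after writing a card."""
    bun = shutil.which("bun") or BUN_FALLBACK
    commands = [[bun, "run", "check"]]
    if touched_app_code:
        # Lint is only worth running when app/page code changed, not just content.
        commands.append([bun, "run", "lint"])
    return commands
```

Each returned command is an argument list, so it can be passed to `subprocess.run(cmd, check=True)` from inside the blog project directory.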
Language and tone
- Default language: Chinese-first.
- Preserve technical terms, APIs, library names, model names, file paths, and data labels in English when useful.
- Match the style of the existing blog/archive content: clear, practical, not AI-corporate.
description is optional. If the input supports a concise summary, write one. If not, omit it.
File naming rules
Use:
- English slug
- a month-level date prefix in MMYY format
- kebab-case
Examples:
0326-agent-tool-calling-guardrails.md
0326-rope-precision-notes.md
0326-react-cache-observations.md
Rules:
- keep the slug short
- prefer the main topic, not every subtopic
- do not add the day
- do not add Chinese characters to the filename
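As a rough illustration of these naming rules, the sketch below builds a filename from an English topic phrase and a date. The function name `archive_filename` is hypothetical, and it assumes that MMYY means a two-digit month followed by a two-digit year, as in the `0326-...` examples.

```python
import re
from datetime import date


def archive_filename(topic: str, created: date) -> str:
    """Build an MMYY-slug.md filename from an English topic phrase."""
    # Lowercase, then collapse every run of non [a-z0-9] characters into one hyphen.
    # Non-ASCII characters (including Chinese) are dropped by this step.
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    if not slug:
        raise ValueError("topic must contain at least one ASCII letter or digit")
    # Month-level prefix only: two-digit month, two-digit year, no day.
    prefix = f"{created.month:02d}{created.year % 100:02d}"
    return f"{prefix}-{slug}.md"


print(archive_filename("RoPE precision notes", date(2026, 3, 14)))
```

Keeping the slug short and focused on the main topic is still a judgment call; the helper only enforces the mechanical parts (prefix, kebab-case, ASCII).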
Archive schema expectations
Respect the archive frontmatter already used by the blog project.
Use these fields when relevant:
---
title:
description:
date:
updatedDate:
tags:
type:
status:
relatedBlog:
relatedArchive:
source:
draft: false
---
Field guidance
title
- clear and compact
- should sound like a knowledge card, not a clickbait article title
description
- optional
- write only if a short summary is obvious and useful
- keep it brief
date
- use the current date for newly created cards
updatedDate
- set when modifying an existing card
tags
- reuse existing tag vocabulary whenever possible
- normalize case and wording
- prefer a stable compact set over inventing near-duplicates
type
Choose one:
note
snippet
draft
idea
research
reference
status
Choose one:
in-progress
incomplete
ready
archived
Interpretation:
in-progress: still actively growing
incomplete: useful fragment, not yet coherent
ready: already structured and reusable
archived: mostly stable / reference-only
relatedBlog
Use when this card clearly supports one or more blog posts.
relatedArchive
Use when this card belongs to an existing archive thread or concept cluster.
source
Use only when the source is concrete and worth preserving:
- URL
- paper
- repo
- doc
- interview-note source
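The field guidance above can be checked mechanically before running the project's own validation. The sketch below is a hypothetical pre-check, not the blog's actual Astro schema: it only encodes the allowed `type` and `status` values and the `date` / `updatedDate` rules stated in this section.

```python
ALLOWED_TYPES = {"note", "snippet", "draft", "idea", "research", "reference"}
ALLOWED_STATUSES = {"in-progress", "incomplete", "ready", "archived"}


def check_frontmatter(fm: dict, is_update: bool = False) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    # type and status are only validated when present ("use these fields when relevant").
    if "type" in fm and fm["type"] not in ALLOWED_TYPES:
        problems.append(f"type must be one of {sorted(ALLOWED_TYPES)}")
    if "status" in fm and fm["status"] not in ALLOWED_STATUSES:
        problems.append(f"status must be one of {sorted(ALLOWED_STATUSES)}")
    if is_update:
        if not fm.get("updatedDate"):
            problems.append("updatedDate should be set when modifying an existing card")
    elif not fm.get("date"):
        problems.append("date should be set for newly created cards")
    return problems
```

A passing pre-check does not replace `bun run check`, which remains the authority on length limits and any other schema constraints.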
Canonical tag policy
This skill must enforce tag hygiene.
Existing tag reuse first
Always inspect existing archive/blog tags before inventing new ones.
Preferred canonical tag set
Start from the tags already seen in the project. Reuse these when relevant:
ai
agent
llm
prompt
rag
embedding
retrieval
security
multi-agent
orchestration
transformer
rope
normalization
react
frontend
typescript
performance
reference
software engineering
workflow
codex
images
astro
waline
deep learning
You may add a new tag only when it adds real retrieval value and no good existing tag fits.
Tag normalization rules
- prefer lowercase English tags for technical topics unless the project already uses a better Chinese tag
- do not create variants like:
  - LLM vs llm
  - front-end vs frontend
  - agent systems vs agent
- do not create tags that are too narrow to ever reuse
- when editing related cards, consolidate messy variants if you find them
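One way to apply these normalization rules is a small helper that lowercases tags, maps known variants onto the canonical tag, and reports anything outside the canonical set so it can be reviewed. The alias table here is a hypothetical starting point built from the variants listed above, and `normalize_tags` is an illustrative name.

```python
CANONICAL_TAGS = {
    "ai", "agent", "llm", "prompt", "rag", "embedding", "retrieval", "security",
    "multi-agent", "orchestration", "transformer", "rope", "normalization",
    "react", "frontend", "typescript", "performance", "reference",
    "software engineering", "workflow", "codex", "images", "astro", "waline",
    "deep learning",
}

# Hypothetical alias table; extend it as messy variants show up in real cards.
ALIASES = {
    "front-end": "frontend",
    "agent systems": "agent",
}


def normalize_tags(raw_tags: list[str]) -> tuple[list[str], list[str]]:
    """Return (normalized tags, tags that are not in the canonical set)."""
    normalized: list[str] = []
    unknown: list[str] = []
    for raw in raw_tags:
        # Lowercase and collapse internal whitespace before alias lookup.
        tag = " ".join(raw.strip().lower().split())
        tag = ALIASES.get(tag, tag)
        if tag and tag not in normalized:
            normalized.append(tag)
            if tag not in CANONICAL_TAGS:
                unknown.append(tag)
    return normalized, unknown
```

Tags returned in the second list are not errors; they are candidates to check against the rule that a new tag must add real retrieval value.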
Merge-vs-create decision rule
Before creating a new file, decide whether the input should merge.
Merge when
- the topic is the same ongoing learning thread
- the new fragment adds evidence, examples, nuance, or correction to an existing card
- the new fragment is a subpoint of an existing concept card
- the semantic overlap is high enough that two cards would feel redundant
Create when
- the topic is meaningfully distinct
- merging would make the existing card bloated or unfocused
- the user explicitly asks for a separate card
- the fragment starts a new line of inquiry that can grow independently
When uncertain, prefer merging if semantic overlap is high.
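A cheap surface-level pre-filter can shortlist merge candidates before the semantic judgment described above. The sketch below is an assumption-heavy heuristic, not the decision rule itself: the equal weighting, the `0.6` threshold, and the function names are all hypothetical.

```python
from difflib import SequenceMatcher


def overlap_score(new_title, new_tags, card_title, card_tags):
    """Blend title similarity and tag overlap into a score between 0 and 1."""
    title_sim = SequenceMatcher(None, new_title.lower(), card_title.lower()).ratio()
    a, b = set(new_tags), set(card_tags)
    # Jaccard overlap of the two tag sets; 0.0 when both are empty.
    tag_sim = len(a & b) / len(a | b) if (a | b) else 0.0
    return 0.5 * title_sim + 0.5 * tag_sim


def pick_merge_target(new_title, new_tags, cards, threshold=0.6):
    """Return the path of the best-matching card, or None to create a new card.

    `cards` maps a file path to a (title, tags) pair.
    """
    best_path, best_score = None, 0.0
    for path, (title, tags) in cards.items():
        score = overlap_score(new_title, new_tags, title, tags)
        if score > best_score:
            best_path, best_score = path, score
    return best_path if best_score >= threshold else None
```

A `None` result only means no card looked similar on the surface; the card bodies still need to be read before concluding that the topic is meaningfully distinct.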
Merge procedure
If merging into an existing card:
- preserve the useful structure already present
- integrate the new information into the most relevant section
- create a subsection if needed
- update updatedDate
- update tags if the new fragment adds an important retrievable theme
- add relatedArchive links if this merge reveals nearby cards
Do not just dump raw text at the end unless the content is truly an appendix.
Auto-assimilation rule
If the user provides a new fragment and an older archive card already contains a clearly similar idea, this skill should prefer assimilation over duplication.
That means:
- find the most relevant existing archive card
- merge the new fragment into the existing structure
- only create a new card if the old card would become muddled or too broad
Archive-to-blog relationship rule
This skill should actively consider whether archive content may later become a blog post.
When to mark blog relationships
Use relatedBlog when there is already a concrete relevant blog post.
If no matching blog post exists yet, but several archive fragments clearly form a future article cluster:
- mention this in the card body briefly, or
- link neighboring archive cards via relatedArchive
Example
If several archive cards all discuss:
- context engineering
- memory layers
- token budget
- retrieval strategy
then the skill should notice that these may later combine into a blog article.
Recommended body structure
Use a flexible archive-card structure. Do not force every section if it would feel fake.
Preferred template:
## 核心内容
## 要点整理
## 当前理解 / 结论
## 待补充
## 相关链接 / 来源
By type
snippet
Usually keep it compact:
research
Emphasize:
idea
Emphasize:
reference
Emphasize:
Handling very fragmented input
If the user's input is highly fragmented:
- reorganize it under the nearest useful headings
- condense repetition
- preserve the original insight
- do not over-invent content the user never implied
It is OK to add structure headings when the material is too fragmented.
URL and shared-conversation intake rule
For inputs like:
- website URLs
- tweet / thread / post links
- YouTube / podcast / article links
- shared LLM conversations (Grok / ChatGPT / Claude / similar)
follow this retrieval order:
- Use browser-based retrieval first when the content is likely rendered dynamically, hidden behind client-side hydration, or better represented visually than in raw HTML.
- Use simpler fetch/extraction only when browser retrieval is unnecessary or fails.
- If the page requires login and the content cannot be accessed, say so plainly and avoid pretending the title or preview text is the full source.
- When archiving a conversation link, capture the actual exchanged points, decisions, examples, and conclusions — not just the headline.
- If the retrieved content is incomplete, archive only what was actually observed and mark the limitation in the card body or source context.
ChatGPT / Grok / Gemini share links — specific method
Share links from LLM platforms (ChatGPT, Grok, Gemini, Claude) render conversation content client-side via JavaScript. A simple HTTP fetch returns only the shell HTML, with no conversation content.
Proven working method: Playwright + System Chromium
The server has system Chromium installed at /usr/bin/chromium-browser. Use Playwright directly to extract rendered page content:
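import asyncio

from playwright.async_api import async_playwright


async def extract_share_link(url: str) -> str:
    """Return the rendered body text of a JS-rendered share page."""
    async with async_playwright() as p:
        # Use the system Chromium instead of a Playwright-managed browser.
        browser = await p.chromium.launch(
            executable_path='/usr/bin/chromium-browser',
            headless=True,
            args=['--no-sandbox', '--disable-dev-shm-usage'],
        )
        try:
            page = await browser.new_page()
            await page.goto(url, wait_until="domcontentloaded", timeout=60000)
            # Give client-side JS time to render the conversation.
            await page.wait_for_timeout(8000)
            return await page.locator('body').inner_text()
        finally:
            # Close the browser even if navigation or extraction fails.
            await browser.close()


# Usage: text = asyncio.run(extract_share_link(share_url))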
Key points:
- wait_until="domcontentloaded" instead of networkidle: faster and sufficient for share links
- wait_for_timeout(8000): gives JS time to render the conversation content
- Extract body inner text: gets all rendered conversation text
- Works for ChatGPT, Grok, Gemini share links (tested and verified)
What was tried and did NOT work (avoid repeating):
- web_extract tool → returns empty content (JS-rendered pages)
- curl with browser UA → gets shell HTML without conversation data
- ChatPeek (Python tool) → works but only for ChatGPT; Playwright is more general
- browser-use agent → requires LLM API key for decision-making; overkill for simple extraction
Always use Playwright + system Chromium first for any share link.
Expected behavior in conversation
When doing archive capture work:
- briefly say whether you are creating a new card or merging into an existing one
- mention the target file path
- if useful, mention chosen tags / type / status
- after writing, summarize what was captured
Examples
Example 1: merge
Input:
- “补充一下我昨天那个 tool calling 笔记:真正的问题不是死循环本身,而是 schema 写得不清楚导致模型误判。”
Action:
- find the existing tool-calling / agent reliability card
- merge the new point into the existing reasoning section
- update updatedDate
Example 2: create new
Input:
- “我最近发现 RoPE 多频率机制里浮点精度这个点,值得单独记一张卡。”
Action:
- create a new archive card if no existing RoPE precision card exists
- likely use type research
- add tags like llm, transformer, rope
Example 3: merge instead of duplicate
Input:
- “我又补充一点上下文工程的碎片:working memory / short-term / long-term 不应该只按时间分,还要按重要性和可恢复性分。”
Action:
- search for an existing context-engineering or memory-layer card
- if found, merge into it instead of creating a duplicate file
Markdown gotchas — critical
Astro/remark has specific rendering quirks with Chinese characters. These patterns leak raw markdown syntax into rendered HTML and must be avoided:
1. Bold + Chinese colon bug
BAD: a closing ** immediately after a fullwidth colon : is NOT rendered as <strong>:
**进一步优化:**最好再 fsync
→ Renders as literal **进一步优化:** on the page.
GOOD — move the colon outside the bold markers:
**进一步优化**:最好再 fsync
→ Renders as <strong>进一步优化</strong>:最好再 fsync
Rule: **Chinese text**: ← colon AFTER closing **, never **Chinese text:** ← colon INSIDE markers.
This applies to all bold patterns where Chinese colon belongs to the bold text — just break it out.
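This pattern is easy to catch before committing. The behavior is consistent with CommonMark's flanking rules, where a closing `**` preceded by punctuation and followed directly by a letter does not count as a closing delimiter. The sketch below is a hypothetical lint helper (the names are illustrative) that finds bold spans ending in a fullwidth colon and moves the colon outside the markers. It does not skip code blocks, so review its output on cards that contain code.

```python
import re

# A bold span whose last character is a fullwidth colon: **text:**
BOLD_WITH_INNER_COLON = re.compile(r"\*\*([^*\n]+?):\*\*")


def find_bold_colon_bugs(markdown: str) -> list[tuple[int, str]]:
    """Return (1-indexed line number, offending span) pairs."""
    hits = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        for match in BOLD_WITH_INNER_COLON.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


def fix_bold_colon(markdown: str) -> str:
    """Move the fullwidth colon outside the closing bold markers."""
    return BOLD_WITH_INNER_COLON.sub(r"**\1**:", markdown)
```

Running `find_bold_colon_bugs` on a card body before `bun run check` gives a quick list of lines to inspect; `fix_bold_colon` applies the same rewrite as the GOOD example above.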
2. Underscores in file/code references
memory_tool.py, created_at, updated_at — these work fine as plain text or inside backticks. Do NOT escape underscores with backslashes (memory\_tool.py).
3. Confirm rendering locally
When practical, check the rendered HTML locally or via the Vercel preview URL to verify bold markers and code references rendered correctly. The **Chinese text:** bug is invisible in the raw .md file — you need to see the HTML or the deployed page to catch it.
Constraints
- do not turn everything into a polished blog post
- do not create duplicate archive files when a merge is clearly better
- do not explode the tag set with one-off variants
- do not fabricate strong conclusions from weak fragments
- keep cards useful for future expansion