一键在 Manus 中运行任何 Skill

$pwd:

llm-dna-knowledge

Name: Llm Dna Knowledge
Author: serithemage

// Use when working on LLM model lineage, provenance, phylogenetic analysis, from-scratch verification, the project's verification targets (Solar-Open-100B, K-EXAONE, A.X-K1, HyperCLOVAX), reference families (Llama 3, Qwen 2.5, GLM, Mixtral, DeepSeek), or anything under experiments/llm-dna/. Routes through the project wiki/ before external search.

在 Manus 中运行

$ git log --oneline --stat

stars:47

forks:5

updated:2026年4月30日 21:48

SKILL.md

readonly

name	llm-dna-knowledge
description	Use when working on LLM model lineage, provenance, phylogenetic analysis, from-scratch verification, the project's verification targets (Solar-Open-100B, K-EXAONE, A.X-K1, HyperCLOVAX), reference families (Llama 3, Qwen 2.5, GLM, Mixtral, DeepSeek), or anything under experiments/llm-dna/. Routes through the project wiki/ before external search.

llm-dna-knowledge

Project-local convention: always consult the curated wiki/ knowledge base (28 cross-linked pages) before answering or acting on LLM lineage and verification topics. The wiki is compiled from primary sources; rebuilding context from scratch each session wastes tokens and produces contradictions.

When to Use

Questions about LLM-DNA, neighbor-joining, UPGMA, distance metrics, functional fingerprint
Verification methodology: tokenizer / weight / architecture / behavior analysis
The 3 dokpamo targets: Solar-Open-100B, K-EXAONE-236B, A.X-K1
Reference clusters: Llama, Qwen, GLM, Mixtral, DeepSeek (in lineage context)
Edits or scripts under experiments/llm-dna/
"from-scratch" controversy, Phase 1/2 evaluations, Stanford AI Index entries

Skip when the project has no wiki/ directory — fall back to external sources only.

Workflow

Read wiki/index.md first. It groups all 28 pages by type (concept / entity / methodology / infra / policy). Match the user's keywords to 3–5 candidate pages.
2-hop expansion via [[wiki-link]]. Open each candidate, follow inline links one or two levels deep until you have full context. Example: "Solar from-scratch?" → solar-open-100b → [[from-scratch-debate]] + [[layernorm-fingerprint-fallacy]] + [[glm-family]].
Cite using [[page-title]]. Reserve plain markdown links for external sources only.
Wiki not enough? Say "위키에 없음 — 외부 자료 보강" explicitly. If the new content has lasting value, drop it under wiki/raw/YYYY-MM-DD_<source>.md and offer to ingest via documentation:llm-wiki.

Quick Reference

Need	Page(s)
LLM-DNA core	`llm-dna-overview`, `dna-extraction-pipeline`, `random-projection`, `inheritance-and-determinism`
Tree construction	`phylogenetic-tree`, `neighbor-joining`, `upgma`, `distance-metrics`
Target models	`solar-open-100b`, `k-exaone-236b`, `ax-k1`
Reference families	`llama-3-family`, `qwen-25-family`, `glm-family`, `mixtral-8x7b`, `deepseek-v25`
Methodology	`tokenizer-analysis`, `weight-analysis`, `architecture-analysis`, `behavior-analysis`, `model-provenance-testing`, `layernorm-fingerprint-fallacy`
Infra	`sagemaker-spot-training`, `aws-cdk-typescript`, `huggingface-hub-usage`
Policy / context	`dokpamo-project`, `from-scratch-debate`

Q&A vs Wiki

The project runs both docs/tutorial/ (chronological Q&A learning log) and wiki/ (compiled best understanding). Flow on a new question: wiki check → answer → tutorial Q&A append → wiki update if non-trivial. Drives both forward without duplication.

Content vs Operational — sister skill `llm-dna-extraction-playbook`

This skill is for content questions (what is X, where is Y in the wiki, how does NJ tree work). For operational work — submitting/debugging SageMaker training jobs, modifying dna_train.py/submit_dna.py/analyze.py, planning Phase 5/6 retries, cost analysis — defer to llm-dna-extraction-playbook. That sister skill encodes 9 architectural lessons from Phase 4–6 ($34 learning curve cost), CLAUDE.md "인프라 운영 규칙" 9 rules, and entrypoint patches required for correct llm-dna 0.2.x behavior. Do not answer "should I run job X with config Y" from this skill alone — route through the playbook.

Wiki Maintenance Triggers

Offer the user a wiki update when:

A verification result is finalized (e.g., LLM-DNA tree analysis produces a conclusion)
A new reference model joins the lineup
A policy/definition shifts (e.g., "from-scratch" criteria)
An external source is genuinely worth keeping (drop into wiki/raw/, then llm-wiki ingest)

Common Mistakes

Searching externally before reading wiki/index.md — ignores 28 pages of compiled knowledge
Reading one page only, skipping [[link]] expansion — answers miss critical caveats (e.g., LayerNorm fingerprint fallacy)
Editing wiki/index.md by hand — overwritten by the next llm-wiki sync
Modifying files in wiki/raw/ — raw zone is immutable; new info goes in a new file
Concluding a verification but not updating the wiki — next session repeats the external lookup
Citing a page whose frontmatter says confidence: low without flagging that uncertainty in the answer

related-skills.json

同仓库

llm-dna-extraction-playbook.md

from "serithemage/korea-ai-foundation-model-verification"

Use when planning, submitting, or debugging LLM-DNA fingerprint extraction jobs on SageMaker spot training. Trigger on keywords like "LLM-DNA jobs", "calc-dna", "submit_dna.py", "SageMaker spot", "Solar / K-EXAONE / A.X-K1 / EXAONE 추출", "Phase 5/6 재시도", "model fingerprint extraction", or follow-up verification rounds. Codifies 9 cost/operational lessons from Phase 4–6 of korea-ai-foundation-model-verification (2026-04-30, $64 / 77 jobs, 52.8% waste rate) so future rounds avoid the same learning curve.

2026-04-3047

karpathy-guidelines.md

from "serithemage/korea-ai-foundation-model-verification"

Use when writing, reviewing, or refactoring code in this repo to reduce common LLM coding mistakes — overcomplication, drive-by edits to unrelated lines, hidden assumptions, vague success criteria. Four principles: Think Before Coding, Simplicity First, Surgical Changes, Goal-Driven Execution.

2026-04-3047

update-changelog.md

from "serithemage/korea-ai-foundation-model-verification"

Use when the user invokes /update-changelog, or when a session has produced concrete repository changes (file edits, verification results, methodology updates, structural changes) that should be recorded in CHANGELOG.md by date and category.

2026-04-3047

update-tutorial.md

from "serithemage/korea-ai-foundation-model-verification"

Use when the user invokes /update-tutorial, or when a verification-related Q&A exchange has just completed in the session and should be persisted into docs/tutorial/ as narrative-style content. Classifies by topic, finds the right file, inserts above the SECTION_MARKER, and updates the README index.

2026-04-3047

package.json

"author": "serithemage"

"repository": "serithemage/korea-ai-foundation-model-verification"

打开 GitHub 仓库查看创作者相关仓库

$ install --global

$ download --local

在 Manus 中运行

$ useful --forSOC

数据科学家计算机与数学类职业15-2051L4

name	llm-dna-knowledge
description	Use when working on LLM model lineage, provenance, phylogenetic analysis, from-scratch verification, the project's verification targets (Solar-Open-100B, K-EXAONE, A.X-K1, HyperCLOVAX), reference families (Llama 3, Qwen 2.5, GLM, Mixtral, DeepSeek), or anything under experiments/llm-dna/. Routes through the project wiki/ before external search.

llm-dna-knowledge

When to Use

Questions about LLM-DNA, neighbor-joining, UPGMA, distance metrics, functional fingerprint
Verification methodology: tokenizer / weight / architecture / behavior analysis
The 3 dokpamo targets: Solar-Open-100B, K-EXAONE-236B, A.X-K1
Reference clusters: Llama, Qwen, GLM, Mixtral, DeepSeek (in lineage context)
Edits or scripts under experiments/llm-dna/
"from-scratch" controversy, Phase 1/2 evaluations, Stanford AI Index entries

Skip when the project has no wiki/ directory — fall back to external sources only.

Workflow

Read wiki/index.md first. It groups all 28 pages by type (concept / entity / methodology / infra / policy). Match the user's keywords to 3–5 candidate pages.
2-hop expansion via [[wiki-link]]. Open each candidate, follow inline links one or two levels deep until you have full context. Example: "Solar from-scratch?" → solar-open-100b → [[from-scratch-debate]] + [[layernorm-fingerprint-fallacy]] + [[glm-family]].
Cite using [[page-title]]. Reserve plain markdown links for external sources only.
Wiki not enough? Say "위키에 없음 — 외부 자료 보강" explicitly. If the new content has lasting value, drop it under wiki/raw/YYYY-MM-DD_<source>.md and offer to ingest via documentation:llm-wiki.

Quick Reference

Need	Page(s)
LLM-DNA core	`llm-dna-overview`, `dna-extraction-pipeline`, `random-projection`, `inheritance-and-determinism`
Tree construction	`phylogenetic-tree`, `neighbor-joining`, `upgma`, `distance-metrics`
Target models	`solar-open-100b`, `k-exaone-236b`, `ax-k1`
Reference families	`llama-3-family`, `qwen-25-family`, `glm-family`, `mixtral-8x7b`, `deepseek-v25`
Methodology	`tokenizer-analysis`, `weight-analysis`, `architecture-analysis`, `behavior-analysis`, `model-provenance-testing`, `layernorm-fingerprint-fallacy`
Infra	`sagemaker-spot-training`, `aws-cdk-typescript`, `huggingface-hub-usage`
Policy / context	`dokpamo-project`, `from-scratch-debate`

Q&A vs Wiki

Content vs Operational — sister skill `llm-dna-extraction-playbook`

Wiki Maintenance Triggers

Offer the user a wiki update when:

A verification result is finalized (e.g., LLM-DNA tree analysis produces a conclusion)
A new reference model joins the lineup
A policy/definition shifts (e.g., "from-scratch" criteria)
An external source is genuinely worth keeping (drop into wiki/raw/, then llm-wiki ingest)

Common Mistakes

Searching externally before reading wiki/index.md — ignores 28 pages of compiled knowledge
Reading one page only, skipping [[link]] expansion — answers miss critical caveats (e.g., LayerNorm fingerprint fallacy)
Editing wiki/index.md by hand — overwritten by the next llm-wiki sync
Modifying files in wiki/raw/ — raw zone is immutable; new info goes in a new file
Concluding a verification but not updating the wiki — next session repeats the external lookup
Citing a page whose frontmatter says confidence: low without flagging that uncertainty in the answer

llm-dna-knowledge

llm-dna-knowledge

When to Use

Workflow

Quick Reference

Q&A vs Wiki

Content vs Operational — sister skill llm-dna-extraction-playbook

Wiki Maintenance Triggers

Common Mistakes

同仓库更多 Skills

同仓库更多 Skills

llm-dna-knowledge

When to Use

Workflow

Quick Reference

Q&A vs Wiki

Content vs Operational — sister skill llm-dna-extraction-playbook

Wiki Maintenance Triggers

Common Mistakes

Content vs Operational — sister skill `llm-dna-extraction-playbook`

Content vs Operational — sister skill `llm-dna-extraction-playbook`