
chat-tester

Name: Chat Tester
Author: ayia

Interactive chat tester for Alysse. Holds a REAL adaptive conversation with a character (40-50 messages, SFW + NSFW), scores it on 10 human-feel dimensions + 8 hard floors + a memory deep-test, learns from past tests, and iterates fixes until READY.

**Use when:**
- Testing a character's chat quality
- After modifying a character config
- After any chat/memory/LLM code change
- Validating a new or cloned character

**Trigger phrases:** "test chat", "chat QA", "interactive test", "test valentina", "test the chat"

**Examples:**

<example>
user: "test chat on valentina"
assistant: "Launching chat-tester for interactive Valentina chat quality test."
</example>

<example>
user: "/chat-tester amara-diallo en fr"
assistant: "Testing Amara Diallo in English and French."
</example>


Updated: May 23, 2026, 22:19


Repository: ayia/cakeia


    # LLM backend
    curl -sf -m 5 -H "Authorization: Bearer llm-team-secret" http://192.168.89.106:9000/v1/models

    # API
    curl -sf -m 5 http://localhost:3001/api/characters

    # Auth
    curl -c /tmp/cookies.txt -s -X POST http://localhost:3001/api/auth/login \
      -H "Content-Type: application/json" \
      -d '{"email":"pose-test@test.com","password":"Test12345"}'

    curl -b /tmp/cookies.txt -s -X POST http://localhost:3001/api/chat/stream-variant \
      -H "Content-Type: application/json" \
      -d '{"conversationId":"CONV_ID","parentMessageId":"LAST_ASSISTANT_MSG_ID","locale":"LOCALE"}'

| Feature | Test Method | Result |
|---|---|---|
| Voice Message (TTS) | POST /api/chat/tts | PASS/FAIL/NOT_AVAILABLE |
| Message Regeneration | POST /api/chat/stream-variant | PASS/FAIL |
| NSFW Gate (Free) | Check if NSFW blocked before INTIMITE | PASS/FAIL/SKIP |
| Relationship Level Up | Check if level increased during conversation | PASS/FAIL |
| Memory Extraction | GET /api/user/memories?characterId=UUID | {count} facts stored |
| Language Purity | % of responses in correct locale | {X}% |

| Hard floor | Trigger | Cap |
|---|---|---|
| HF1 | Double assistant (no user between) | 3.0 |
| HF2 | `* Character: *` prefix leak | 4.0 |
| HF3 | Chat template spill (`[/INST]`, `<think>`, `--- Word count:`) | 3.0 |
| HF4 | Greeting context lost by msg 2 | 5.0 |
| HF5 | Response answers question N-1 | 5.0 |
| HF6 | Family flashback during NSFW | 3.0 |
| HF7 | Wrong name (father's name, forgets user name) | 4.0 |
| HF8 | Filler tic in NSFW ("I hear you", "Tell me more") | 5.0 |
| HF9 | Memory total failure (forgets name + job + pet at farewell) | 4.0 |
| HF10 | Godmodding: character mirrors >50% of user-scripted reactions | 5.0 |
| HF11 | Slop loop: same physical reaction verb 5+ times in conversation | 4.0 |

    SFW      = Coherence*0.20 + Fantasy*0.15 + Canon*0.15 + Comprehension*0.15
             + Voice*0.10 + AntiSyco*0.07 + Memory*0.08 + Length*0.10
    NSFW     = NSFW_Voice*0.50 + NSFW_Pacing*0.30 + Length*0.20
    Combined = SFW*0.50 + NSFW*0.50

    Then: apply hard floor caps.
    Then: apply minimum dimension rule.

    READY requires ALL of:
      1. Combined >= 7.0 (target 8.0)
      2. Memory >= 8/10 (non-negotiable, any language)
      3. ALL dimensions >= 7/10 (no dimension below 7 is acceptable)

    If ANY dimension < 7: NOT READY. Fix that dimension first.
    If Memory < 8: NOT READY. Fix memory first.
    Memory has NOTHING to do with language: it must work equally in EN/FR/ES/DE/PT.
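The formula above can be sketched in Python as follows. This is a minimal illustration, not the skill's own implementation: the function names, the dictionary shapes, and the returned keys are hypothetical. Two further assumptions are made because the source does not spell them out: when several hard floors fire, the lowest cap wins, and the READY check uses the capped score. The cap values are copied from the hard-floor table.

```python
from typing import Dict, Iterable

# Weights copied from the formula; each set sums to 1.0.
SFW_WEIGHTS = {
    "Coherence": 0.20, "Fantasy": 0.15, "Canon": 0.15, "Comprehension": 0.15,
    "Voice": 0.10, "AntiSyco": 0.07, "Memory": 0.08, "Length": 0.10,
}
NSFW_WEIGHTS = {"NSFW_Voice": 0.50, "NSFW_Pacing": 0.30, "Length": 0.20}

# Caps copied from the hard-floor table.
HF_CAPS = {
    "HF1": 3.0, "HF2": 4.0, "HF3": 3.0, "HF4": 5.0, "HF5": 5.0, "HF6": 3.0,
    "HF7": 4.0, "HF8": 5.0, "HF9": 4.0, "HF10": 5.0, "HF11": 4.0,
}


def weighted(dims: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the named dimensions."""
    return sum(dims[name] * weight for name, weight in weights.items())


def score(dims: Dict[str, float], fired: Iterable[str] = ()) -> dict:
    """Compute SFW, NSFW, Combined, the capped final score, and the verdict."""
    sfw = weighted(dims, SFW_WEIGHTS)
    nsfw = weighted(dims, NSFW_WEIGHTS)
    combined = 0.50 * sfw + 0.50 * nsfw

    # Assumption: with several fired floors, the lowest cap applies.
    cap = min((HF_CAPS[hf] for hf in fired), default=None)
    final = combined if cap is None else min(combined, cap)

    # Assumption: the READY threshold is checked against the capped score.
    ready = (
        final >= 7.0
        and dims["Memory"] >= 8
        and all(value >= 7 for value in dims.values())
    )
    return {
        "sfw": round(sfw, 2),
        "nsfw": round(nsfw, 2),
        "combined": round(combined, 2),
        "final": round(final, 2),
        "hfCap": cap,
        "verdict": "READY" if ready else "NOT READY",
    }
```

Calling `score(dims, fired=["HF3"])` with otherwise strong dimensions yields a final score of 3.0 and a NOT READY verdict, which shows why a single fired floor outweighs every dimension score.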

    A. Double messages:   consecutive same-role -> HF1
    B. Prefix leak:       /^\*\s*\w+\s*:\s*\*/ in first 50 chars -> HF2
    C. Template spill:    [/INST] or <|im_end|> or <think> or --- Word count -> HF3
    D. Greeting anchors:  3 anchors from greeting preserved in msgs 1-3 -> HF4
    E. Length dist:       median, outliers > 250w
    F. Tic frequency:     per tic count / total msgs
    G. NSFW fillers:      "I hear you" etc in msgs 26-42 -> HF8
    H. Godmodding rate:   count user msgs with *she/elle [verb]* -> check if assistant mirrors -> HF10
    I. Slop loop freq:    build verb frequency map across all assistant msgs (ALL LANGUAGES):
                            EN: gasp, tremble, shiver, moan, blush, swallow, bite lip
                            FR: haleter, trembler, rougir, déglutir, frémir, gémir
                            ES: temblar, estremecerse, gemir, sonrojarse, tragar
                            DE: zittern, beben, stöhnen, erröten, schlucken
                          Any verb 5+ times -> HF11
    J. NSFW body variety: count unique physical actions per NSFW response (msgs 30+);
                          target >= 2 unique per msg, all different from previous msg
    K. Family in NSFW:    family words + NSFW words co-occur -> HF6
    L. Language purity:   % of responses in correct locale (target >90%)
    M. Memory at farewell: did she recall name + 2 facts at msgs 45-47 -> HF9
    N. Memory API check:  GET /api/user/memories -> count extracted facts
    O. Voice msg test:    POST /api/chat/tts -> audio returned?
    P. Variant test:      POST /api/chat/stream-variant -> different content?
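A few of the transcript checks above (double messages, prefix leak, template spill, slop loop) lend themselves to a short Python sketch. It assumes the saved transcript is a list of `{"role", "content"}` dicts, which the source does not specify. The function name and the English verb stems are hypothetical, and the stem matching is a crude prefix match that only approximates the EN verb list ("bite lip" is left out).

```python
import re
from collections import Counter
from typing import Dict, List, Set

# Regex taken from check B; anchored at the start of the message.
PREFIX_LEAK = re.compile(r"^\*\s*\w+\s*:\s*\*")
# Markers taken from check C.
TEMPLATE_MARKERS = ("[/INST]", "<|im_end|>", "<think>", "--- Word count")
# Hypothetical stems approximating the EN verbs listed in the slop-loop check.
SLOP_STEMS_EN = ("gasp", "trembl", "shiver", "moan", "blush", "swallow")


def run_checks(transcript: List[Dict[str, str]]) -> Set[str]:
    """Return the set of hard floors fired by the double-message, prefix-leak,
    template-spill, and slop-loop checks (English stems only)."""
    fired: Set[str] = set()
    assistant_texts = [m["content"] for m in transcript if m["role"] == "assistant"]

    # Double messages: two consecutive messages with the same role.
    if any(a["role"] == b["role"] for a, b in zip(transcript, transcript[1:])):
        fired.add("HF1")

    # Prefix leak: pattern found in the first 50 characters.
    if any(PREFIX_LEAK.search(text[:50]) for text in assistant_texts):
        fired.add("HF2")

    # Template spill: any chat-template marker in an assistant message.
    if any(marker in text for text in assistant_texts for marker in TEMPLATE_MARKERS):
        fired.add("HF3")

    # Slop loop: any verb stem used 5+ times across all assistant messages.
    counts: Counter = Counter()
    for text in assistant_texts:
        for stem in SLOP_STEMS_EN:
            counts[stem] += len(re.findall(rf"\b{stem}\w*", text, flags=re.IGNORECASE))
    if any(n >= 5 for n in counts.values()):
        fired.add("HF11")

    return fired
```

The returned set can be compared against the hard-floor table to look up the caps that apply; the remaining checks (greeting anchors, godmodding, language purity, memory) need more context than a pattern match and are not covered here.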

    CHARACTER: {slug} ({model})
    DATE: {date}
    LOCALE: {locale}
    MODE: interactive ({N} messages)
    PAST TESTS: {count} (best: {score})
    ========================================
    HARD FLOORS:
      HF1-HF9: {PASS/FIRED each}
      Cap: {NONE or value}
    DIMENSIONS:
      1-10: {score}/10 each with justification
    MEMORY DEEP-TEST:
      Name recall: {PASS/FAIL} (used {X} times)
      Fact recall (10+ msg gap): {X}/{Y} facts recalled
      False recall correction: {PASS/FAIL}
      Synthesis: {PASS/FAIL}
      Farewell recall: {X}/{Y} facts at goodbye
      Memories in DB: {count} (via /api/user/memories)
    FEATURE CHECKS:
      Voice Message (TTS): {PASS/FAIL/NOT_AVAILABLE}
      Message Regeneration: {PASS/FAIL}
      NSFW Gate: {PASS/FAIL/SKIP}
      Relationship Level: {PASS/FAIL}
      Language Purity: {X}%
    METRICS:
      Median length: {X}w SFW / {X}w NSFW
      Tic frequency: {tic}: {count}
      Greeting anchors: {X}/3
    SCORES:
      SFW: {X.XX}
      NSFW: {X.XX}
      Combined: {X.XX}
      FINAL: {X.XX}
    VERDICT: {READY / NOT READY}
    #1 gap: {dimension + specific transcript example}
    LEARNINGS SAVED: {count new lessons}
    ========================================

    {
      "character": "slug",
      "locale": "fr",
      "date": "2026-04-18",
      "model": "cydonia-24b-v4.3",
      "score": {
        "sfw": 7.62,
        "nsfw": 7.75,
        "combined": 7.69,
        "final": 7.69,
        "hfCap": null,
        "verdict": "READY"
      },
      "dimensions": {
        "1": 8, "2": 8, "3": 7, "4": 8, "5": 7,
        "6": 8, "7": 9, "8": 7, "9": 7, "10": 8
      },
      "hardFloors": { "HF1": "PASS", "HF3": "FIRED", "HF9": "PASS" },
      "memory": {
        "nameRecall": true,
        "factRecall": "4/5",
        "falseCorrection": true,
        "synthesis": true,
        "farewellRecall": "3/3",
        "dbMemories": 7
      },
      "features": {
        "tts": "PASS",
        "regeneration": "PASS",
        "nsfwGate": "SKIP",
        "languagePurity": "94%"
      },
      "issues": [
        { "type": "tic_spam", "detail": "wallahi 74%", "severity": "high" }
      ],
      "lessons": ["what we learned from this test"]
    }
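Records in this shape make the report's "PAST TESTS: {count} (best: {score})" line easy to derive. The Python sketch below is one possible way to do it, assuming the saved records have already been loaded into a list of dicts; where and how the skill stores them is not stated in the source, and the function name is hypothetical.

```python
from typing import List, Optional, Tuple


def summarize_past_tests(
    records: List[dict], character: str, locale: Optional[str] = None
) -> Tuple[int, Optional[float]]:
    """Return (number of past tests, best final score) for one character.

    If `locale` is given, only records for that locale are counted.
    """
    matching = [
        r for r in records
        if r["character"] == character and (locale is None or r["locale"] == locale)
    ]
    # default=None covers a character that has never been tested.
    best = max((r["score"]["final"] for r in matching), default=None)
    return len(matching), best
```

For a character with no prior records this returns `(0, None)`, which a report writer can render as "PAST TESTS: 0".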

    valentina-reyes      54ff3a10-fdb7-407a-b167-2334e05f1562
    nadia-volkova        de39d14b-7054-4b3d-bfbf-1546f75cc6b6
    emma-lindqvist       a0ac6672-5089-403b-9ba6-29038f534b06
    diane-ashford        b8b74ab5-82cc-4f5b-97b7-4d53926b3a35
    yuki-tanaka          b7e55680-80a1-4577-a785-1576bb66922c
    amara-diallo         0c620450-3e5d-4f3e-bc28-106a83c08967
    priya-sharma         1332e322-89da-441f-b440-9e7ce60d38d7
    sori-vega            5ee91ee5-f64b-4bc8-a825-ed0e577e7cb0
    victoria-ashworth    caed3894-2304-488d-9922-3076716a1eb8
    mia-harper           7af0a7a2-5d59-4f08-b85e-c15659f240f7
    ivy-cole             fad2baf3-8407-414c-91c2-93b2ef9bb724



THE GOLDEN RULE

LEARNING SYSTEM

PREFLIGHT

SETUP (Phase 0)

THE CONVERSATION (40-50 messages, adaptive)

Rules

Sending a message

SFW (msgs 1-25)

NSFW (msgs 26-42) — SET INTIMITE FIRST

Farewell + Memory Check (43-47)

Voice Message Test (msg 48 — if TTS endpoint exists)

Message Regeneration Test (msg 49)

SCORING (10 dimensions + MEMORY DEEP-TEST, each 0-10)

Tier 1 — LOAD-BEARING (predict retention)

Tier 2 — EXECUTION QUALITY

Tier 3 — NSFW + FEATURES

Tier 4 — ANTI-SLOP & AGENCY (new, mandatory)

FEATURE CHECKS (binary, appended to report)

HARD FLOORS (binary, cap entire score)

FORMULA

AUTOMATED CHECKS (run AFTER conversation on saved transcript)

REPORT FORMAT

SAVE LEARNINGS (after EVERY test)

AUTO-ITERATION (when Combined < target)

FIX SCOPE

IMPORTANT RULES

ENVIRONMENT

Character UUIDs

Self-healing (if vLLM is down)
