llm-cost-optimizer

Stars: 744

Forks: 114

Updated: June 15, 2026, 21:08

Analyze and reduce LLM spend by mapping call-site overrides to managed profiles (Balanced / Quality / Speed). Covers spend analysis, profile assignment, and config correctness.

Source: vellum-ai/vellum-assistant

Related occupation (SOC classification): Software Developers, Computer and Mathematical Occupations, SOC 15-1252

    # Weekly totals
    assistant usage totals --range week

    # Break down by call site (most useful; shows what's expensive)
    assistant usage breakdown --group-by call_site --range week

    # Break down by model
    assistant usage breakdown --group-by model --range week

    # Break down by profile
    assistant usage breakdown --group-by inference_profile --range week
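Once the per-call-site breakdown is in hand, ranking call sites by their share of total spend shows where a profile change would matter most. The sketch below is a minimal illustration in Python. This page does not show the CLI's output format, so the `(call_site, cost)` pairs and the `rank_by_spend` helper are hypothetical, and the numbers are placeholders rather than real usage data.

```python
def rank_by_spend(rows):
    """Sort (call_site, cost) pairs by cost, highest first, and add each share of the total.

    The input shape is an assumption; adapt it to whatever the usage
    breakdown command actually prints.
    """
    rows = list(rows)
    total = sum(cost for _, cost in rows)
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    return [(name, cost, (cost / total if total else 0.0)) for name, cost in ranked]


if __name__ == "__main__":
    # Placeholder figures for illustration only.
    sample = [("conversationTitle", 1.0), ("mainAgent", 3.0)]
    for name, cost, share in rank_by_spend(sample):
        print(f"{name:24s} {cost:8.2f} {share:6.1%}")
```

Call sites near the top of the ranking are the first candidates to check against the profile table.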

| Profile | Call Sites |
| --- | --- |
| balanced (Sonnet) | mainAgent, subagentSpawn, compactionAgent, analyzeConversation, patternScan, narrativeRefinement, memoryConsolidation, recall, callAgent, emptyStateGreeting, conversationStarters, identityIntro, proactiveArtifactBuild |
| cost-optimized (Haiku) | Everything else: memoryRouter (with 1M context override), memory extraction/retrieval, UI copy, classifiers, summarization, background tasks |
| quality-optimized (Opus) | Do not pin. Reserved for on-demand user escalation via /model |
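The assignment rule in the table can be expressed as a small lookup: a fixed set of call sites gets `balanced`, everything else gets `cost-optimized`, and nothing is pinned to `quality-optimized`. The following Python sketch is one way to encode that rule. The helper names `assign_profile` and `build_overrides` are hypothetical, and the `contextWindow` shape for `memoryRouter` is taken from the config blob on this page.

```python
import json

# Call sites listed under "balanced (Sonnet)" in the table.
BALANCED_CALL_SITES = frozenset({
    "mainAgent", "subagentSpawn", "compactionAgent", "analyzeConversation",
    "patternScan", "narrativeRefinement", "memoryConsolidation", "recall",
    "callAgent", "emptyStateGreeting", "conversationStarters",
    "identityIntro", "proactiveArtifactBuild",
})


def assign_profile(call_site: str) -> dict:
    """Return the override for one call site, following the table's rule."""
    if call_site in BALANCED_CALL_SITES:
        return {"profile": "balanced"}
    override = {"profile": "cost-optimized"}
    if call_site == "memoryRouter":
        # The table notes a 1M-token context override for memoryRouter.
        override["contextWindow"] = {"maxInputTokens": 1_000_000}
    return override


def build_overrides(call_sites):
    """Map each call site name to its override; nothing is pinned to Opus."""
    return {name: assign_profile(name) for name in call_sites}


if __name__ == "__main__":
    sample = ["mainAgent", "memoryRouter", "conversationTitle"]
    print(json.dumps(build_overrides(sample), indent=2))
```

This covers only the profile choice. Per-call-site tuning fields such as `maxTokens` or `effort` would still need to be added by hand.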

    assistant config set llm.callSites '{
      "mainAgent": {"profile":"balanced"},
      "subagentSpawn": {"profile":"balanced"},
      "compactionAgent": {"profile":"balanced"},
      "analyzeConversation": {"profile":"balanced"},
      "patternScan": {"profile":"balanced"},
      "narrativeRefinement": {"profile":"balanced"},
      "memoryRouter": {"profile":"cost-optimized","contextWindow":{"maxInputTokens":1000000}},
      "heartbeatAgent": {"profile":"cost-optimized","maxTokens":2048,"effort":"low","temperature":0,"thinking":{"enabled":false,"streamThinking":false},"contextWindow":{"maxInputTokens":16000}},
      "filingAgent": {"profile":"cost-optimized"},
      "callAgent": {"profile":"balanced"},
      "proactiveArtifactDecision": {"profile":"cost-optimized"},
      "proactiveArtifactBuild": {"profile":"balanced"},
      "memoryExtraction": {"profile":"cost-optimized"},
      "memoryConsolidation": {"profile":"balanced"},
      "memoryRetrieval": {"profile":"cost-optimized"},
      "memoryRetrospective": {"profile":"cost-optimized"},
      "recall": {"profile":"balanced","maxTokens":4096,"effort":"low","thinking":{"enabled":false,"streamThinking":false},"temperature":0},
      "memoryV2Migration": {"profile":"cost-optimized"},
      "memoryV2Sweep": {"profile":"cost-optimized"},
      "memoryV2Consolidation": {"profile":"balanced"},
      "conversationSummarization": {"profile":"cost-optimized"},
      "commitMessage": {"profile":"cost-optimized","maxTokens":120,"temperature":0.2,"effort":"low","thinking":{"enabled":false}},
      "conversationStarters": {"profile":"balanced","effort":"low","thinking":{"enabled":false}},
      "replySuggestion": {"profile":"cost-optimized","effort":"low","thinking":{"enabled":false}},
      "conversationTitle": {"profile":"cost-optimized"},
      "identityIntro": {"profile":"balanced"},
      "emptyStateGreeting": {"profile":"balanced"},
      "guardianQuestionCopy": {"profile":"cost-optimized","effort":"low","thinking":{"enabled":false}},
      "approvalCopy": {"profile":"cost-optimized"},
      "approvalConversation": {"profile":"cost-optimized"},
      "trustRuleSuggestion": {"profile":"cost-optimized"},
      "notificationDecision": {"profile":"cost-optimized","effort":"low","thinking":{"enabled":false}},
      "preferenceExtraction": {"profile":"cost-optimized","effort":"low","thinking":{"enabled":false}},
      "interactionClassifier": {"profile":"cost-optimized","effort":"low","thinking":{"enabled":false}},
      "styleAnalyzer": {"profile":"cost-optimized"},
      "inviteInstructionGenerator": {"profile":"cost-optimized","effort":"low","thinking":{"enabled":false}},
      "skillCategoryInference": {"profile":"cost-optimized","effort":"low","thinking":{"enabled":false}},
      "meetConsentMonitor": {"profile":"cost-optimized"},
      "meetChatOpportunity": {"profile":"cost-optimized"},
      "inference": {"profile":"cost-optimized"}
    }'

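Because a JSON object value replaces the entire `llm.callSites` block, any change to a single call site has to be written back together with every entry that should survive. The Python sketch below merges a change into the existing block, checks it against the gotchas named in this skill (profile references only, no direct `model`, nothing pinned to `quality-optimized`), and prints the resulting command for review. The helper names are hypothetical, and reading the existing block out of the CLI is left to the reader, since this page does not show that command's output.

```python
import json
import shlex


def merge_overrides(existing: dict, changes: dict) -> dict:
    """Return the full block to write: existing entries, with changed call sites replaced."""
    merged = dict(existing)
    merged.update(changes)
    return merged


def lint_overrides(overrides: dict) -> list:
    """Collect human-readable problems; an empty list means the block looks consistent."""
    problems = []
    for name, entry in sorted(overrides.items()):
        if "model" in entry:
            problems.append(f"{name}: uses a direct model; reference a profile instead")
        profile = entry.get("profile")
        if profile is None:
            problems.append(f"{name}: missing profile")
        elif profile == "quality-optimized":
            problems.append(f"{name}: pinned to quality-optimized")
    return problems


def render_command(overrides: dict) -> str:
    """Build the shell command text; shlex.quote keeps the JSON a single argument."""
    blob = json.dumps(overrides, separators=(",", ":"))
    return "assistant config set llm.callSites " + shlex.quote(blob)


if __name__ == "__main__":
    existing = {"mainAgent": {"profile": "balanced"}, "recall": {"profile": "balanced"}}
    changes = {"recall": {"profile": "balanced", "maxTokens": 4096, "effort": "low"}}
    merged = merge_overrides(existing, changes)
    for problem in lint_overrides(merged):
        print("WARNING:", problem)
    print(render_command(merged))
```

The merge is shallow by design: a changed call site replaces its whole entry, which mirrors how the config value itself is replaced rather than patched.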
    # Collect the key securely; never paste it in chat
    assistant credentials prompt --service anthropic --field api_key \
      --label "Anthropic API Key" --placeholder "sk-ant-..."

    assistant inference providers connections create my-anthropic-key \
      --provider anthropic \
      --auth api_key \
      --credential credential/anthropic/api_key

    assistant config set llm.profiles.opus-personal '{"provider":"anthropic","model":"claude-opus-4-8","label":"Opus (Personal)","provider_connection":"my-anthropic-key"}'

    assistant inference providers connections list
    assistant inference providers connections get <name>
    assistant inference providers connections create <name> --provider <p> --auth api_key --credential <vault-key>
    assistant inference providers connections update <name> --auth platform
    assistant inference providers connections delete <name>

llm-cost-optimizer

Overview

🚨 Critical: unoverridden call sites fall back to `llm.default`

Step 1 — Understand current spend

Step 2 — Read current overrides

Step 3 — Recommended profile assignment

Step 4 — Config gotchas

⚠️ JSON object value replaces the entire block

⚠️ Always use profile references — never direct model

Profile + tuning fields can coexist

Step 5 — Apply the complete turnkey blob

Step 6 — Escalation path (on-demand Opus)

Step 7 — Verify and monitor

Reference: provider connections

Reference: usage breakdown group-by values

Reference: usage time ranges

name	llm-cost-optimizer
description	Analyze and reduce LLM spend by mapping call-site overrides to managed profiles (Balanced / Quality / Speed). Covers spend analysis, profile assignment, and config correctness.
metadata	{"emoji":"💸","vellum":{"category":"development","display-name":"LLM Cost Optimizer"}}
