confidence-scoring

Updated: March 28, 2026, 22:12

Confidence-based scoring system for review findings — calibrates issue severity with evidence strength to filter false positives. Use when reviewing code, analyzing findings, or scoring issues.

Source: kennedym-ds/claudecode-orchestrator

Confidence Scoring

Scoring Scale

Rate every finding 0-100 based on evidence strength:

Score	Level	Meaning	Action
0-24	Not confident	Likely false positive, insufficient evidence	Filter out
25-49	Somewhat confident	Might be real, needs investigation	Note only
50-74	Moderately confident	Real but minor, or real but context-dependent	Include if reviewer
75-89	Highly confident	Real and important, strong evidence	Always include
90-100	Certain	Verified issue with definitive evidence	Flag as critical
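
As a minimal sketch, the scale above can be expressed as a lookup from score to band. The names `Band`, `BANDS`, and `classify` are hypothetical and not part of the skill; the level and action strings are copied verbatim from the table, including the "Include if reviewer" action exactly as it appears there.

```python
from typing import NamedTuple


class Band(NamedTuple):
    """One row of the scoring scale (hypothetical helper type)."""
    low: int
    high: int
    level: str
    action: str


# Rows copied from the table above; bounds are inclusive.
BANDS = [
    Band(0, 24, "Not confident", "Filter out"),
    Band(25, 49, "Somewhat confident", "Note only"),
    Band(50, 74, "Moderately confident", "Include if reviewer"),
    Band(75, 89, "Highly confident", "Always include"),
    Band(90, 100, "Certain", "Flag as critical"),
]


def classify(score: int) -> Band:
    """Return the band that contains an integer score in 0-100."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be in 0-100, got {score}")
    for band in BANDS:
        if band.low <= score <= band.high:
            return band
    raise AssertionError("bands cover 0-100, so this is unreachable")
```

For example, `classify(82).level` returns `"Highly confident"`.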

Default Threshold

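80: only findings scored ≥80 are reported to the user by default. Configurable in sdlc-config.md. Note that this threshold falls inside the 75-89 band, so findings scored 75-79 are rated "Highly confident" yet stay below the default cutoff.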
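
A short sketch of the default filter follows. It assumes each finding is a dict with a `confidence` key, which is a hypothetical shape chosen for illustration. How the threshold is read from sdlc-config.md is not described here, so the value is simply passed in as a parameter.

```python
DEFAULT_THRESHOLD = 80  # default stated above


def reportable(findings, threshold=DEFAULT_THRESHOLD):
    """Keep only findings whose confidence meets the threshold (inclusive)."""
    return [f for f in findings if f["confidence"] >= threshold]


# Illustrative, made-up findings.
findings = [
    {"id": "A", "confidence": 79},
    {"id": "B", "confidence": 80},
    {"id": "C", "confidence": 95},
]
reported_ids = [f["id"] for f in reportable(findings)]
```

With the default threshold, `reported_ids` contains `"B"` and `"C"`; the finding scored 79 is dropped because the comparison is ≥80.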

Scoring Criteria

For each finding, evaluate:

Evidence strength — Can you point to exact code that proves the issue?
Context relevance — Is this actually a problem in this codebase's context?
Not pre-existing — Was this introduced by the change, or was it already there?
Not linter territory — Will a linter catch this? If so, don't duplicate.
Specificity — Can you describe exactly what breaks and how?

Applying Confidence Scores

### Finding: {description}
- **Confidence:** {score}/100
- **Evidence:** {exact code reference}
- **Category:** Bug | Security | Convention | Performance | Maintainability
- **Recommendation:** {specific fix}
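
One way to fill in the template above programmatically is sketched below. The `Finding` dataclass and `render` function are hypothetical names introduced for illustration; only the field labels and the five categories come from the template.

```python
from dataclasses import dataclass

# Categories listed in the template above.
CATEGORIES = ("Bug", "Security", "Convention", "Performance", "Maintainability")


@dataclass
class Finding:
    """A single review finding (hypothetical container)."""
    description: str
    confidence: int
    evidence: str
    category: str
    recommendation: str

    def __post_init__(self):
        # Reject values the template does not allow.
        if not 0 <= self.confidence <= 100:
            raise ValueError("confidence must be in 0-100")
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")


def render(finding: Finding) -> str:
    """Format a finding using the template's headings and labels."""
    return "\n".join([
        f"### Finding: {finding.description}",
        f"- **Confidence:** {finding.confidence}/100",
        f"- **Evidence:** {finding.evidence}",
        f"- **Category:** {finding.category}",
        f"- **Recommendation:** {finding.recommendation}",
    ])
```

Validating the category and score at construction time keeps malformed findings from reaching the report.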

False Positive Indicators (lower score)

Issue exists in unchanged code (pre-existing)
Linter or type checker would catch it
Pattern is intentional based on CLAUDE.md or codebase conventions
Theoretical risk with no practical exploitation path
Pedantic style preference
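
The indicators above say only that the score should be lowered, not by how much. The sketch below therefore makes a loud assumption: a flat, hypothetical penalty of 25 points per distinct indicator, floored at 0. The `Indicator` enum, `PENALTY_PER_INDICATOR`, and `adjust` are illustrative names, not part of the skill.

```python
from enum import Enum


class Indicator(Enum):
    """False positive indicators, worded as in the list above."""
    PRE_EXISTING = "Issue exists in unchanged code (pre-existing)"
    LINTER_CATCHABLE = "Linter or type checker would catch it"
    INTENTIONAL_PATTERN = "Pattern is intentional based on CLAUDE.md or codebase conventions"
    THEORETICAL_ONLY = "Theoretical risk with no practical exploitation path"
    PEDANTIC_STYLE = "Pedantic style preference"


# ASSUMPTION: the penalty size is invented for this sketch.
PENALTY_PER_INDICATOR = 25


def adjust(score: int, indicators) -> int:
    """Lower a score once per distinct indicator, never below 0."""
    distinct = set(indicators)
    return max(0, score - PENALTY_PER_INDICATOR * len(distinct))
```

Under this assumed penalty, a finding initially scored 85 that turns out to be pre-existing drops to 60, which places it below the default threshold.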

name	confidence-scoring
user-invocable	false
