evaluation
Reference templates for Codex evaluation. Used by build/improve orchestrators — not executed directly.
| name | evaluation |
| description | Reference templates for Codex evaluation. Used by build/improve orchestrators — not executed directly. |
Templates for the Phase 8 evaluation loop. The orchestrator in /build and /improve reads these templates and runs scoring via Bash.
This file is NOT executed directly. The orchestrator owns the score-fix loop. Scoring runs via codex exec in Bash — never delegated to an agent (agents fabricate scores).
- .claude/rubric/AUTO-DETECT.md for the detection table
- .claude/rubric/base.md and .claude/rubric/product-quality.md
- {RUBRIC_CRITERIA}

If a rubric file doesn't exist, skip it and continue.
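The skip-if-missing rule can be sketched as follows. This is a minimal illustration, assuming the rubric files are plain Markdown and that `{RUBRIC_CRITERIA}` is simply their concatenated text. The function name `load_rubric_criteria` and its default file list are hypothetical, and whatever selection AUTO-DETECT.md performs is not modeled here.

```python
from pathlib import Path


def load_rubric_criteria(root, names=("base.md", "product-quality.md")):
    """Concatenate the rubric files that exist under <root>/.claude/rubric.

    Hypothetical helper: files that are missing are skipped silently,
    mirroring the "skip it and continue" rule.
    """
    rubric_dir = Path(root) / ".claude" / "rubric"
    parts = []
    for name in names:
        path = rubric_dir / name
        if path.is_file():
            parts.append(path.read_text(encoding="utf-8").strip())
    # An empty string means no rubric was found; the prompt still works.
    return "\n\n".join(parts)
```

The returned string would be substituted for `{RUBRIC_CRITERIA}` before the prompt is handed to the scoring command.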
The orchestrator runs this directly via Bash:
cd {TARGET} && codex exec -s read-only -o /tmp/lens-eval-scores.md "CODE QUALITY REVIEW
Rate this codebase on a scale of 1-100. Evaluate everything: code quality, security, error handling, naming, structure, test coverage, CI/CD, documentation, and project hygiene.
Also check against these criteria:
{RUBRIC_CRITERIA}
Every issue you report will be sent to an agent for fixing. Be specific — cite the exact file and line, and say exactly what needs to change.
OUTPUT FORMAT (strict — no prose, no strengths, no explanation):
ISSUE: {file:line} — {description}
ISSUE: {file:line} — {description}
...
SCORE: NN/100" 2>&1
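Because the output format is strict, the orchestrator can parse it mechanically rather than asking an agent to interpret it. The sketch below is one possible parser, assuming the text has already been read from /tmp/lens-eval-scores.md. The name `parse_scores` is hypothetical, and the pattern assumes file paths contain no spaces.

```python
import re

# \u2014 is the dash used in the OUTPUT FORMAT; a plain hyphen is also accepted.
ISSUE_RE = re.compile(r"^ISSUE:\s*(?P<location>\S+)\s+[\u2014-]\s+(?P<description>.+)$")
SCORE_RE = re.compile(r"^SCORE:\s*(?P<score>\d{1,3})/100\s*$")


def parse_scores(text):
    """Return (issues, score) from evaluator output.

    issues is a list of (file:line, description) tuples.
    score is None when no SCORE line is present, so the caller can
    treat the run as failed instead of inventing a number.
    """
    issues, score = [], None
    for raw in text.splitlines():
        line = raw.strip()
        match = ISSUE_RE.match(line)
        if match:
            issues.append((match["location"], match["description"]))
            continue
        match = SCORE_RE.match(line)
        if match:
            score = int(match["score"])
    return issues, score
```

Lines that match neither pattern are ignored, which tolerates stray prose or the stderr noise captured by `2>&1`.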
After fixes are applied, the orchestrator runs this to get the final score:
cd {TARGET} && codex exec -s read-only -o /tmp/lens-eval-scores.md "CODE QUALITY RE-SCORE
Previous score: {PREVIOUS_SCORE}/100
Fixes applied since last scoring:
{FIX_APPLIED_LINES}
Re-read the codebase and re-score 1-100. Every issue you report will be sent to an agent for fixing. Be specific.
OUTPUT FORMAT (strict — no prose, no strengths, no explanation):
ISSUE: {file:line} — {description}
...
SCORE: NN/100" 2>&1
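Filling the two orchestrator-owned placeholders might look like the sketch below. The helper name `fill_rescore_header` and the `- ` bullet prefix on each fix line are assumptions; the templates do not specify how `{FIX_APPLIED_LINES}` is formatted.

```python
RESCORE_HEADER = (
    "CODE QUALITY RE-SCORE\n"
    "Previous score: {PREVIOUS_SCORE}/100\n"
    "Fixes applied since last scoring:\n"
    "{FIX_APPLIED_LINES}\n"
)


def fill_rescore_header(previous_score, fixes):
    """Substitute {PREVIOUS_SCORE} and {FIX_APPLIED_LINES} in the header.

    str.replace is used rather than str.format because the full template
    also contains literal braces such as {file:line} that must survive.
    """
    fix_lines = "\n".join(f"- {fix}" for fix in fixes)
    return (
        RESCORE_HEADER
        .replace("{PREVIOUS_SCORE}", str(previous_score))
        .replace("{FIX_APPLIED_LINES}", fix_lines)
    )
```

Only the header is shown; the rest of the re-score prompt would be appended unchanged.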
For each fix applied, the LESSON agent classifies:
Code pattern that should be avoided in future code?
  YES -> General rule?
    YES -> LESSON -> both lessons files (deduped)
    NO  -> LESSON -> .claude/lessons.md only
  NO -> Suggests pipeline/tool/config change?
    YES -> PROPOSAL -> .claude/eval-proposals.md
    NO  -> eval-report.md only
Each LESSON gets a category: LOGIC, DESIGN, CODE_QUALITY, DUPLICATION, or AI_SMELL.
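The classification tree and the category set translate directly into a small routing function. This is a sketch only: `route_fix`, its three boolean inputs, and the `REPORT_ONLY` label are hypothetical names, and "both lessons files" is read here as .claude/lessons.md plus .claude/universal-lessons.md.

```python
from enum import Enum


class Category(Enum):
    LOGIC = "LOGIC"
    DESIGN = "DESIGN"
    CODE_QUALITY = "CODE_QUALITY"
    DUPLICATION = "DUPLICATION"
    AI_SMELL = "AI_SMELL"


def route_fix(avoidable_pattern, general_rule, suggests_tooling_change):
    """Return (kind, destination files) for one applied fix."""
    if avoidable_pattern:
        if general_rule:
            # General lessons go to both files; deduplication happens on write.
            return "LESSON", [".claude/lessons.md", ".claude/universal-lessons.md"]
        return "LESSON", [".claude/lessons.md"]
    if suggests_tooling_change:
        return "PROPOSAL", [".claude/eval-proposals.md"]
    return "REPORT_ONLY", [".claude/eval-report.md"]
```

The yes/no judgments themselves still come from the LESSON agent; the function only encodes where each outcome is written.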
The LESSON agent replaces .claude/eval-report.md with:
# Eval Report — {TARGET}
**Date:** {ISO date}
**Evaluator:** Codex
**Score:** {initial}/100 → {final}/100
## Issues Found ({count})
| # | File | Issue |
|---|------|-------|
| 1 | {file:line} | {description} |
## Fixes Applied ({count})
| # | File | Fix |
|---|------|-----|
| 1 | {file:line} | {what was fixed} |
## Remaining Issues ({count})
| # | File | Issue |
|---|------|-------|
| 1 | {file:line} | {description} |
## Lessons ({count})
| # | Category | Description |
|---|----------|-------------|
| 1 | {cat} | {desc} |
## Proposals ({count})
| # | Type | Description | Action |
|---|------|-------------|--------|
| 1 | {type} | {desc} | {action} |
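Each section of the report has the same shape: a heading with a count, then a numbered table. A hedged sketch of a renderer for one section follows; `render_section` is a hypothetical helper, and it emits a uniform `---` separator rather than the variable-width dashes in the template, which renders identically in Markdown.

```python
def render_section(title, headers, rows):
    """Render one report section as a Markdown heading plus a numbered table."""
    lines = [
        f"## {title} ({len(rows)})",
        "| # | " + " | ".join(headers) + " |",
        "|" + "---|" * (len(headers) + 1),
    ]
    for index, row in enumerate(rows, start=1):
        # Escape pipes so a description cannot break the table layout.
        cells = [str(cell).replace("|", "\\|") for cell in row]
        lines.append(f"| {index} | " + " | ".join(cells) + " |")
    return "\n".join(lines)
```

For example, `render_section("Issues Found", ["File", "Issue"], issues)` would produce the first table of the report from a list of `(file:line, description)` pairs.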
- .claude/lessons.md: append new lessons under appropriate category sections
- .claude/universal-lessons.md: append only general patterns (not project-specific), deduplicate against existing
- .claude/eval-proposals.md: append new proposals with PENDING status
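The append-and-deduplicate step could be sketched as below. This assumes each lesson is a single line and that "duplicate" means an exact match after trimming whitespace; the real LESSON agent may compare more loosely. `append_unique` is a hypothetical name, and placement under category sections is not modeled.

```python
from pathlib import Path


def append_unique(path, entry):
    """Append entry as a new line unless an identical line already exists.

    Returns True if the file was modified, False if the entry was a duplicate.
    """
    path = Path(path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    normalized = entry.strip()
    if any(line.strip() == normalized for line in text.splitlines()):
        return False
    with path.open("a", encoding="utf-8") as handle:
        # Keep entries on separate lines even if the file lacks a final newline.
        if text and not text.endswith("\n"):
            handle.write("\n")
        handle.write(normalized + "\n")
    return True
```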