quality

Use when the goal is to understand and improve the repo's current quality bar. Detect existing gates, run the available ones, inspect concept integrity, test confidence, and security posture, then propose concrete next gates instead of only complaining about what is missing.

Install command

npx skills add https://github.com/corca-ai/charness --skill quality

Copy and paste this command into Claude Code to install the skill.

Source

corca-ai/charness

Stars: 4

Forks: 2

Last updated: June 3, 2026 at 13:00

name: quality
description: Use when the goal is to understand and improve the repo's current quality bar. Detect existing gates, run the available ones, inspect concept integrity, test confidence, and security posture, then propose concrete next gates instead of only complaining about what is missing.

Quality

Use this when the task is overall quality posture, not one narrow bug or isolated test. quality covers concept integrity, behavior proof, security and supply-chain posture, docs drift, skill drift, agent production runtime risk, and operator sustainability.

Default to inspecting the system that produces quality, running existing gates, and making concrete next gates or cleanups. Prefer deterministic enforcement over repeated prose when a linter, validator, test, hook, script, or command can own the concern. Use concept review when unresolved boundary, ownership, or architecture is the real issue. Length, duplicate, and pressure heuristics are structural smell sensors; the win is delete, merge, split ownership, extract a helper, or narrow an interface.

Bootstrap

Resolve $SKILL_DIR per ../../shared/references/bootstrap-resolution.md, then resolve the adapter and re-derive the source, spec, artifact, and gate surface before trusting a prior review.

Key references for the frequent path are references/bootstrap-escalations.md, references/inventory-dispatch.md, references/skill-ergonomics.md, and references/skill-quality.md.

# Required Tools: rg
# Missing-binary protocol: ../../shared/references/binary-preflight.md
python3 "$SKILL_DIR/scripts/resolve_adapter.py" --repo-root .
python3 "$SKILL_DIR/scripts/bootstrap_adapter.py" --repo-root .
python3 "$SKILL_DIR/scripts/resolve_quality_artifact.py" --repo-root . --intent current
rg --files .
sed -n '1,220p' docs/handoff.md 2>/dev/null || true
git status --short

If the adapter is missing, continue with inferred defaults. Scaffold or refresh one when the repo already has stable gate commands, installed command groups, concept paths, preset lineage, or deferred setup worth recording.

Use references/bootstrap-escalations.md for missing validation tools, reader-facing Markdown preview, evaluator-backed behavior proof, artifact write-path handling, and other non-default bootstrap paths. When writing the quality artifact, edit the resolved write_artifact_path, not latest.md by habit.

Workflow

1. Restate the quality question: scope, likely wrong boundary or ownership seam, and whether read-only constraints downgrade conclusions.
2. Detect the current gate and source surface.
   - enumerate source, specs, docs, adapters, skill surfaces, and commands before letting the previous quality artifact define scope
   - use references/inventory-dispatch.md to choose focused inventories for CLI/operator surfaces, docs/readability, skills, runtime/economics, source hygiene, security/supply chain, coverage/eval depth, and adapter policy
   - when skills are in scope, run skill ergonomics inventory; when public-skill or durable artifact behavior is in scope, scaffold one consumer-side dogfood case with python3 "$SKILL_DIR/scripts/suggest_public_skill_dogfood.py" --repo-root . --skill-id <skill-id>
   - when validators pin prose or support-skill routing is implicit, inspect whether the gate proves a behavior contract or only freezes wording and whether find-skills can surface the support seam from task language
   - treat migration-time exact-prose guards as temporary bridges; when inventory_skill_ergonomics.py is cited, closeout uses prose review result: to record judgment separately from script fields
3. Run the meaningful gates that already exist.
   - prefer repo-native commands over hypothetical recommendations
   - resolve and run the Charness package-root validator validate_usage_episodes.py and report report_usage_episodes.py; no_adapter, disabled, and no_records are skipped warnings, not product-success proof or failures
   - keep exit-zero attention states visible with validate_attention_state_visibility.py; new skipped/advisory states need warning output, artifact-visible status, or an explicit local-noop declaration
   - run executable-spec overlap or cost guards before proposing more spec coverage
   - for standing-test economics, testability, or affected-test-selection questions, inspect duplicated proof, runner isolation, startup cost, hot spots, and structural seams before pruning tests, widening budgets, or adding observation tools; use references/testability-and-selection.md
   - before reporting runtime trends, prefer $SKILL_DIR/scripts/render_runtime_summary.py; if structured samples are missing, report that as the next gate instead of inventing numbers
4. Inspect four quality lenses.
   - concept: does the repo still match its claimed architecture and ownership model
   - behavior: do tests, evals, checks, probes, and command surfaces prove falsifiable behavior without making the test surface unmaintainable
   - security: are code, secret, dependency, and supply-chain risks covered by repo-local proof or honestly deferred
   - operability: are setup, CI, hooks, docs, install/update flows, runtime budgets, and maintenance surfaces honest enough to sustain the bar
5. Classify each finding by enforcement tier first: AUTO_EXISTING, AUTO_CANDIDATE, or NON_AUTOMATABLE.
6. Classify posture as healthy, weak, missing, or defer.
7. Propose concrete next quality moves.
   - tag each recommended next gate as active or passive
   - name the exact seam, command family, setup, or deletion/merge/split move
   - implement clear repo-owned automation unless the user asked for review only
   - do not leave dual implementations, duplicated proof, or stale command surfaces in an unpriced "keep both for safety" state
8. Run a bounded fresh-eye reviewer after initial inventory and before broad recommendations as high-leverage, applying host-exposed reviewer_tiers.high-leverage fields. Report exactly Delegated Review: executed|blocked|not_applicable. Blocked requires a concrete host or tool signal; use ../../shared/references/fresh-eye-subagent-review.md.
9. End with a quality posture summary that does not hide Weak, Missing, Advisory, delegated-review status, or active recommended next gates just because the implemented slice or final gate passed.
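
The two classification steps and the closing summary can be pictured as a small data model. The following is only an illustrative sketch, not code from the charness repo: the tier and posture values come from the workflow text, while the `Finding` class, its `seam` field, and the `summarize` helper are hypothetical names introduced here.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    # Enforcement tiers named in the workflow.
    AUTO_EXISTING = "AUTO_EXISTING"
    AUTO_CANDIDATE = "AUTO_CANDIDATE"
    NON_AUTOMATABLE = "NON_AUTOMATABLE"


class Posture(Enum):
    # Posture values named in the workflow.
    HEALTHY = "healthy"
    WEAK = "weak"
    MISSING = "missing"
    DEFER = "defer"


@dataclass(frozen=True)
class Finding:
    seam: str        # hypothetical field: the exact seam or command family
    tier: Tier       # classified first
    posture: Posture


def summarize(findings):
    """Group findings by posture; every posture key stays present, even when empty."""
    summary = {p.value: [] for p in Posture}
    for f in findings:
        summary[f.posture.value].append(f"{f.seam} [{f.tier.value}]")
    return summary


# Hypothetical example findings.
findings = [
    Finding("lint path", Tier.AUTO_EXISTING, Posture.HEALTHY),
    Finding("pre-push hook", Tier.AUTO_CANDIDATE, Posture.MISSING),
]
summary = summarize(findings)
```

Keeping every posture key in the summary, including empty ones, is one way to make it hard for Weak or Missing findings to drop out silently when the final gate passes.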

Load-Bearing Anchors

Validators and consumer prompts use these core routing anchors; references carry detail.

The prior quality artifact is history; a fresh 5-minute reader can misclassify as absent an invariant that is merely scattered, so do not dismiss that as reader noise.
For evaluator-backed behavior closeout, prompt regression, baseline compare, or operator reading test, use quality before downgrading to HITL. Generic review, closeout, or "run quality" wording is not enough to run an evaluator.
For recommend-only behavior robustness findings, use $SKILL_DIR/scripts/recommend_behavior_test.py so artifacts carry the Cautilus robustness request/report vocabulary instead of prose-only advice.
Before invoking any cautilus evaluate ... subcommand, consult the planner-consult contract at references/cautilus-on-demand.md; for recommend-only behavior tests, use references/behavior-testing.md. Supported modes are fixture, observation, and skill-experiment. Refuse when next_action: "none" or when must_ask_before_running: true without a named failing-log path, and route the call through the repo-owned wrapper instead of bare cautilus evaluate.
When the next quality move is repo-local, deterministic, and low-risk, prefer implementing that gate in the same turn; when the automatable move is already clear and repo-owned, implement it in the same turn unless review-only was requested. If you stop short of an obvious repo-owned deterministic gate, name the unresolved enforcement gap.
Do not stop at producer-side validators alone when the risk is public-skill routing or durable artifact behavior; scaffold one consumer-side dogfood case with python3 "$SKILL_DIR/scripts/suggest_public_skill_dogfood.py" --repo-root . --skill-id <skill-id>. Inventories named in ## Commands Run must engage with at least two distinct declared non-headline fields (one is enough when only one is declared); the validate-inventory-consumption phase fails closeout when the artifact summarizes a cited inventory by headline only, and validate-inventory-consumption-declaration plus check-inventory-declaration-coverage keep the declaration drift-free and complete (declaration: references/inventory-consumer-fields.json).
Skill review uses $SKILL_DIR/scripts/inventory_skill_ergonomics.py, skill ergonomics, mode/option pressure, trigger overlap, undertrigger risk, taste policing, and repeated prose ritual checks.
CLI/operator review uses $SKILL_DIR/scripts/inventory_cli_ergonomics.py, flat help-list, multiple archetype schema namespaces, $SKILL_DIR/scripts/inventory_cli_side_effect_probes.py, option-looking positional rejection, mutating command probes, and command-docs drift gate checks.
Docs/spec review uses $SKILL_DIR/scripts/inventory_entrypoint_docs_ergonomics.py, entrypoint-doc ergonomics, smart agent/operator can infer safely, doc-set dogma, ordinary Markdown uses the markdown preview seam, and executable specs use the rendered Specdown report.
Public-spec review uses $SKILL_DIR/scripts/inventory_public_spec_quality.py; ask what proof is duplicated at the wrong layer before adding more specs, and surface total source-guard rows, top specs, brittle count, and next action category together.
Runtime review uses $SKILL_DIR/scripts/inventory_standing_gate_verbosity.py, $SKILL_DIR/scripts/inventory_standing_test_economics.py, standing-gate-verbosity.md, file/process/startup cost, runner isolation/process mode, verbose-on-demand escape hatch, quiet failure output must still name the failing unit, top-N runtime hot spots, serial fallback, runtime_budget_profiles, Pytest Economics, and bounded test-ratio posture.
Testability review uses references/testability-and-selection.md: prefer structure that makes fast subsets predictable before relying on observation tools, keep real-boundary smokes thin, classify hidden broad-test compensation as a design smell rather than only a runtime smell, distinguish structural test cleanup (extract builders/wrappers/runners/shared assertions, keep intent in .test.*) from hiding test bodies in support files, and on duplicate/pressure-gate failure name the smallest next structural cleanup instead of only reporting the percentage.
Boundary-bypass ratchets use references/boundary-bypass-ratchet.md: quality owns the portable payload, no-increase, and exemption contract; consumer repos own stack-specific probes and DSLs.
Source hygiene review uses $SKILL_DIR/scripts/inventory_dual_implementation.py, free safety oracle checks, $SKILL_DIR/scripts/inventory_lint_ignores.py, lint suppressions start to accumulate, lint suppression pressure, growing lint suppressions, retained policy-level ignores, and concrete revisit conditions.
Language baselines stay explicit: For Python, default to ruff check as the standing lint path, include C90, and choose exactly one type checker (mypy or pyright). For JavaScript/TypeScript, default to eslint, use tsc --noEmit when TypeScript is present, and turn on a complexity rule. This is a routing default, not a veto against good deterministic enforcement; do not over-apply it to standing threshold gates such as coverage floors, runtime budgets, or other already-honest enforced limits.
Prefer the smaller production surface first when the same confidence gap can be closed either by shrinking production branches/interfaces or by adding more tests.
Watch stale gate wiring and hidden network/external-repo work in maintainer-local enforcement.
When repo evidence shows a production LLM or agent runtime, use references/agent-production-runtime.md: review cache/cost economics, overload fallback, retry idempotency, streaming stalls, model routing economics, and telemetry as evidence questions, not provider-specific architecture mandates.
For external/runtime capability slices, treat readiness-only proof (surface, worker_queued, healthcheck-style host_decision) as Weak until at least one provider_roundtrip is observed; see ../../shared/references/external-capability-proof-ladder.md.
When prompt-sensitive output matters or prompt_asset_policy.source_globs is configured, inspect prompt/content bulk. prompt_asset_roots: [] only means no canonical asset root is declared, not that inline prompt/content bulk inventory should be skipped. The final user-facing answer must not silently omit Weak, Missing, Advisory, delegated-review status, or active Recommended Next Gates findings.
Do not treat a passing length, duplicate, or pressure heuristic as the goal; delete, merge, split ownership, extract a helper, or narrow the interface.
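
The refusal rule for cautilus evaluate stated in the anchors above can be sketched as a small guard. This is a sketch under assumptions, not the planner-consult contract itself: the field names `next_action` and `must_ask_before_running` come from the text, but the dict shape of the plan, the `should_refuse` function, the `"run"` action value, and the example log path are hypothetical. The actual contract lives in references/cautilus-on-demand.md.

```python
def should_refuse(plan, failing_log_path=None):
    """Return True when the evaluate call should be refused.

    `plan` is assumed to be a dict-like planner response (hypothetical shape).
    """
    # Refuse when the planner says there is nothing to do.
    if plan.get("next_action") == "none":
        return True
    # Refuse when the planner requires asking first and no failing-log path is named.
    if plan.get("must_ask_before_running") is True and not failing_log_path:
        return True
    return False


# Hypothetical planner responses.
nothing_to_do = {"next_action": "none"}
needs_ask = {"next_action": "run", "must_ask_before_running": True}
free_to_run = {"next_action": "run", "must_ask_before_running": False}

results = (
    should_refuse(nothing_to_do),
    should_refuse(needs_ask),
    should_refuse(needs_ask, failing_log_path="logs/failing.log"),
    should_refuse(free_to_run),
)
```

Even when the guard allows the call, the text still routes it through the repo-owned wrapper instead of bare cautilus evaluate.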

Output Shape

Scope, Concept Risks, Current Gates, Runtime Signals, Standing Test Economics, Testability and Selection, Coverage and Eval Depth, Maintainer-Local Enforcement, CI/Local Gate Parity, Enforcement Triage, Healthy, Weak, Missing, Deferred, Advisory, Delegated Review, Commands Run, Recommended Next Gates

Guardrails

Do not split bounded repo-local quality setup into another public concept.
Do not recommend gates the repo cannot realistically run without saying why.
Do not treat a passing metric or green gate as the goal; name the structural simplification or ownership clarification.
Do not leave automatable rules as prose-only guidance.
Do not treat a passing local gate as sufficient when clones lack a repo-owned pre-push path (no no-hook waiver), or when CI appends required run: steps or CI-only gates after it; required proof must be reachable locally. See references/maintainer-local-enforcement.md.
Do not give generic "add tests" or "improve security" advice without the seam and next setup.
Do not dismiss fresh-eye misreads when scattered evidence or undeclared enforcement is the real gap.
Do not stop at producer-side validators when public-skill routing or durable artifact behavior is the risk; run or name a consumer prompt and artifact.
Do not write size, runtime, or cost numbers that did not come from a command run this turn; label estimates explicitly. See references/proposal-flow.md.
Do not propose a new enforcement gate for an advisory cost before checking git log -S, grep -rn, and pyproject.toml markers for an existing convention; if one is ignored, the recommendation is the routing fix. See references/proposal-flow.md.

References

references/adapter-contract.md
references/adapter-gate-review.md
references/agent-production-runtime.md
references/attention-state-visibility.json
references/automation-promotion.md
references/bootstrap-escalations.md
references/bootstrap-posture.md
references/brittle-source-guards.md
references/boundary-bypass-ratchet.md
references/boundary-bypass-payload.example.json
references/cautilus-on-demand.md
references/behavior-testing.md
references/cli-ergonomics-smells.md
references/coverage-floor-exemptions.txt
references/coverage_floor_inventory.py
references/coverage-floor-policy.md
references/dual-implementation-parity.md
references/entrypoint-docs-ergonomics.md
references/executable-spec-economics.md
references/find_inline_prompt_bulk.py
references/gate-classification.md
references/installable-cli-probes.md
references/inventory-consumer-fields.json
references/inventory-dispatch.md
references/lint-ignore-discipline.md
references/maintainer-local-enforcement.md
references/mutation-testing.md
references/operability-signals.md
references/prompt-asset-policy.md
references/proposal-flow.md
references/public-spec-layering.md
references/quality-lenses.md
references/sample-presets.md
references/security-overview.md
references/security-npm.md
references/security-pnpm.md
references/security-uv.md
references/skill-quality.md
references/skill-ergonomics.md
references/standing-gate-verbosity.md
references/startup-probes.md
references/testability-and-selection.md
references/validate_spec_pytest_references.py
../../shared/references/agent-assessment-invariant.md
../../shared/references/fresh-eye-subagent-review.md
../../shared/references/external-capability-proof-ladder.md