---
name: analyze-ci-failures
description: Analyze CircleCI/CI shell test failures for CUBRID PRs. Reads the failed TC list, fetches CI results, reads test scripts and answer files, categorizes failures by root cause, and generates a structured report. Use when CI tests fail and the user wants to understand why.
---
# CUBRID CI Test Failure Analyzer
Analyze failed shell test cases from CircleCI (or other CI) for CUBRID PRs. Produces a categorized report with root cause analysis and fix proposals.
## When to Use
- User says "analyze ci failures", "CI 실패 분석" (analyze CI failures), "왜 TC 실패했어" (why did the TC fail), or "failed tc 분석" (analyze failed TCs)
- User shares a CircleCI URL with test failures
- User has a `failed_tc.txt` or similar list of failed test cases
- User wants to understand why shell tests failed on a PR
## Arguments
- `/analyze-ci-failures` — Interactive: look for `failed_tc.txt` in the cwd
- `/analyze-ci-failures <circleci-url>` — Fetch failures from CircleCI
- `/analyze-ci-failures <file>` — Read the failure list from the specified file
## Inputs
The skill needs:
- Failed TC list: A file listing failed test case paths (e.g., `failed_tc.txt`), or a CircleCI URL to fetch from
- Test case directory: A directory containing the actual test scripts and answer files (e.g., `~/cubrid-testcases-private-ex`)
- Feature context: The branch/PR being tested (to understand what changes might cause failures)
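Before the steps below, it helps to know the expected shape of the failed TC list: plain text, one test case path per line, relative to the test case base directory. A minimal sketch, with invented paths:

```shell
# Hypothetical failed_tc.txt -- these TC paths are invented for
# illustration; real entries come from the CI run.
cat > failed_tc.txt <<'EOF'
shell/_01_utility/unloaddb/tc_unload_large_varchar
shell/_03_object/diagdb/tc_diag_heap_header
EOF
wc -l < failed_tc.txt   # 2 entries
```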
## Execution Steps
### Step 1: Gather Inputs
- Locate the failed TC list:
  - Check arguments for a file path or CircleCI URL
  - Check the cwd for `failed_tc.txt`
  - Ask the user if not found
- Identify the test case base directory:
  - Check if `~/cubrid-testcases-private-ex` exists
  - Check additional working directories
  - Ask the user if not found
- Identify the feature branch context:
  - `git branch --show-current`
  - `git log --oneline HEAD --not develop | head -20` to understand the feature changes
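The lookup order above can be sketched as a small helper; the candidate names and fallback behavior are this skill's defaults, not a fixed API:

```shell
# Sketch: resolve the failed TC list from an explicit argument first,
# then the cwd default, before falling back to asking the user.
locate_failed_list() {
  for candidate in "$@"; do
    if [ -n "$candidate" ] && [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1   # nothing found: ask the user
}

# Usage: locate_failed_list "$1" failed_tc.txt || echo "ask user for the list"
```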
### Step 2: Fetch CI Failure Details
If a CircleCI URL is provided, use an agent to fetch the actual test failure messages (diffs, error outputs). This provides the actual vs expected output, which is critical for root cause analysis.
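A minimal sketch of the fetch, assuming the v1.1 endpoint noted in the Tips section; the build number is hypothetical, and the `jq` filter reflects the v1.1 response shape, so verify it against real output once:

```shell
# Sketch: CircleCI API v1.1 works without credentials (v2 needs a token).
# BUILD_NUM is hypothetical; take it from the URL the user shared.
BUILD_NUM=12345
URL="https://circleci.com/api/v1.1/project/github/CUBRID/cubrid/${BUILD_NUM}/tests"
echo "$URL"
# Then fetch and list failures, e.g.:
#   curl -s "$URL" | jq -r '.tests[] | select(.result == "failure") | .name'
```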
### Step 3: Read All Failed Test Cases (Parallel)
For each failed TC:
- Read the test script (the `.sh` file in the `cases/` directory)
- Read the answer file (the `.answer` file — expected output)
- Read supporting SQL files (the `.sql` files used by the test)
- Note what the test does: data types involved, operations tested (CRUD, unload/load, copydb, diagdb, etc.)
Use parallel reads — launch all file reads at once since they're independent.
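One way to sketch the parallel read, assuming the layout described above (each TC directory holding `cases/*.sh` and `cases/*.answer`); the helper name is invented:

```shell
# Sketch: list every script and answer file for the failed TCs, then
# read them concurrently -- the files are independent, so order is free.
collect_tc_files() {
  base="$1"; list="$2"
  while IFS= read -r tc; do
    ls "$base/$tc"/cases/*.sh "$base/$tc"/cases/*.answer 2>/dev/null
  done < "$list"
}

# Usage:
#   collect_tc_files "$HOME/cubrid-testcases-private-ex" failed_tc.txt \
#     | xargs -P 8 -n 1 cat
```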
### Step 4: Analyze Feature Changes
Understand what the feature branch changes that could cause failures:
- Read key modified source files (use `git diff develop...HEAD --stat`)
- Identify behavioral changes:
- New file types or storage mechanisms
- Changed output formats (diagdb, show heap header, etc.)
- Disabled or stubbed functions
- New error codes or changed error messages
- Use explore agents in parallel for deeper code analysis if needed
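The source-file pass can be narrowed with a small filter; the extension list is an assumption about which files matter most here:

```shell
# Sketch: keep only C/C++ sources from the diff's file list, capped at 20.
key_sources() {
  grep -E '\.(c|cpp|h|hpp)$' | head -20
}

# Typical use against the feature branch (base branch develop, as above):
#   git diff develop...HEAD --name-only | key_sources
printf 'src/storage/heap_file.c\ndocs/notes.md\n' | key_sources
# prints: src/storage/heap_file.c
```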
### Step 5: Categorize and Analyze
For each failed TC, determine:
- Is it related to the feature? — Match test operations against feature changes
- Root cause hypothesis — Why the test fails given the feature changes
- Category — Group TCs by shared root cause
Common categories:
- Output format mismatch (answer file needs update)
- Disabled/stubbed functionality
- New storage path not handled by existing tool (unloaddb, copydb, etc.)
- Error code changes
- Timeout / CI flakiness
- Unrelated regression
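A first-pass categorizer can be sketched from the command patterns above; the category names and grep patterns are illustrative heuristics, not a fixed taxonomy, and real categorization still needs the CI diff and answer files:

```shell
# Sketch: guess a category from the commands a test script invokes.
categorize_tc() {
  script="$1"
  if grep -qE 'diagdb|spacedb|heap header' "$script"; then
    echo "output-format-sensitive"
  elif grep -qE 'unloaddb|loaddb|copydb' "$script"; then
    echo "full-record-resolution"
  else
    echo "needs-manual-review"
  fi
}

# Demo on a synthetic script:
demo=$(mktemp)
echo 'cubrid unloaddb -u dba testdb' > "$demo"
categorize_tc "$demo"   # prints: full-record-resolution
```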
### Step 6: Generate Report
Write a structured markdown report with:
# Failed TC Analysis Report: <branch> (<PR link>)
## Background
(Feature description, key behavioral changes)
## Category N: <Root Cause> (X TCs) — <OOS-related? / Feature-related?>
| # | TC | What it tests | Failure | Related? |
|---|-----|--------------|---------|----------|
| ... | ... | ... | ... | ... |
**Root cause analysis**: ...
**Proposed fix**: ...
## Summary
| Category | Count | Related? | Root Cause |
|----------|-------|----------|------------|
| ... | ... | ... | ... |
## Priority Actions
1. P0: ...
2. P1: ...
### Step 7: Save and Present
- Save the report to `failed_tc_report.md` in the project root (or a user-specified location)
- Print a concise summary with counts: X related, Y unrelated, Z total
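The closing summary can be as simple as one formatted line; the counts here are placeholders:

```shell
# Sketch: concise summary line printed after saving the report.
related=4; unrelated=2
printf '%d related, %d unrelated, %d total -- see failed_tc_report.md\n' \
  "$related" "$unrelated" "$((related + unrelated))"
```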
## Output Conventions
### Language
- Section headers (`##`): English
- Table content and analysis: English (technical report for a broad audience)
- Code, paths, function names: Keep as-is
### Style
- Tables for TC listings within each category
- Code blocks for call flow diagrams showing broken vs expected paths
- Bold for key findings and root causes
- Backticks for all code references
- Horizontal rules between major sections
## Tips
- When in doubt about source code behavior, use LSP (clangd) to analyze CUBRID C/C++ code. Use `lsp_hover` to check types, `lsp_goto_definition` to trace function implementations, `lsp_find_references` to understand call sites, and `lsp_diagnostics` to catch issues. This is especially useful when tracing how a changed function affects downstream callers.
- When fetching CircleCI results, use API v1.1 (not v2), since v2 requires authentication. For example, `https://circleci.com/api/v1.1/project/github/CUBRID/cubrid/<build_num>/tests` works without credentials.
- Always read the actual test script AND the answer file — the script tells you which operations are tested; the answer tells you what output is expected
- Look for data types that exceed storage thresholds (e.g., `varchar(20000)`, large JSON, CLOB/BLOB)
- Check for `diagdb`, `show heap header`, `cubrid spacedb` in test scripts — these are sensitive to storage format changes
- Check for `unloaddb`/`loaddb`/`copydb` — these require full record resolution
- TCs with no diff details from CI may need local reproduction to diagnose
- Group by root cause, not by symptom — multiple TCs often share a single underlying issue
## Mandatory: Iterate with Grill-and-Revise
Every CI failure report must go through `/grill-and-revise` before it is shared. Do not deliver single-pass triage: it drifts toward weak root-cause hypotheses, mis-categorized TCs, and unsupported "Related?" calls, and CI reports often drive merge or release decisions where mis-attribution is expensive.
This step is required, not optional. It applies to every report. No agent-side judgment — including size, scope, perceived triviality, or perceived risk — is a valid skip criterion. The only legitimate skip is when the user, in the message that triggered this skill, explicitly says "skip grill" or "don't grill this" (or unambiguous equivalent: "no grill", "skip the grill loop", "just push it"). If in doubt, do the grill loop.
How to hand off:
After saving the initial report to `failed_tc_report.md`, invoke `/grill-and-revise` with:
- Topic & purpose: CI failure analysis for `<branch>` / `<PR link>`; the audience is the PR author, QA, and CUBRID maintainers
- Output path: the same report file (the loop revises in place)
- Source material: the failed TC list, CircleCI output, the actual test scripts and answer files, the `git diff develop...HEAD` summary, and key modified source files
- Review angle: every "Related?" call is justified by a concrete behavioral change; root-cause hypotheses are testable (not hand-wavy); proposed fixes are concrete (file/function-level); categorization groups by underlying cause, not symptom; Priority Actions are actionable
- Round cap: default 5