| name | cxas-cuj-report-generator |
| description | Automates the ingestion of customer requirement documents (such as diagrams, BRDs, and code), synthesizes high-fidelity natural transcripts, and compiles them into highly interactive, responsive Critical User Journey (CUJ) reports. |
Critical User Journey (CUJ) Transcript & Report Generator Skill
Use this skill when asked to extract dialogue transcripts or compile interactive
Critical User Journey (CUJ) reports from a directory of customer requirement
documents (such as diagrams, BRDs, code etc.).
Core Protocols
To ensure 100% coverage and zero data loss, you MUST follow these core rules:
- Robust Extraction: Follow the protocol defined in the
cxas-protocol-robust-extraction skill.
- Two-Phase Ingestion: Follow the
cxas-protocol-two-phase-ingestion sub-protocol defined in
protocols/cxas-protocol-two-phase-ingestion/.
- Checklist Mandate: The orchestrator and all subagents MUST follow the
agent-protocol-checklist protocol to maintain a local
task_checklist.json file, ensuring they track their progress and do
not lose coverage during execution.
- Auditing: The orchestrator MUST periodically check the subagent's
scratch directory to ensure the
task_checklist.json file is being created
and maintained. If the file is missing or not updated, the orchestrator MUST
terminate the subagent and respawn it with stronger enforcement
instructions.
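The auditing rule above can be sketched as a small health check that the orchestrator runs against each subagent's scratch directory. This is only an illustration: the function name `checklist_is_healthy` and the `max_age_seconds` staleness threshold are assumptions, since the protocol does not define how old a checklist may be before it counts as "not updated".

```python
import json
import time
from pathlib import Path


def checklist_is_healthy(scratch_dir, max_age_seconds=600.0, now=None):
    """Return True if task_checklist.json exists, parses as JSON, and is fresh.

    max_age_seconds is a hypothetical threshold; tune it to the watchdog interval.
    """
    path = Path(scratch_dir) / "task_checklist.json"
    if not path.is_file():
        return False  # missing file: the subagent should be respawned
    try:
        json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return False  # unreadable or malformed checklist
    current = time.time() if now is None else now
    # Stale if the file has not been modified within the allowed window.
    return (current - path.stat().st_mtime) <= max_age_seconds
```

A `False` result would be the orchestrator's cue to terminate the subagent and respawn it with stronger enforcement instructions.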
Core Workflow Steps
Follow this 5-step structured workflow to execute the task:
-
Scoping & Type Discovery: Prepare the environment and identify required
skills.
-
Access Files: Ensure you have access to the source artifacts in your
local workspace.
- Tip (Drive Links): If the source is a Google Drive link or
folder ID, you MUST use the
gdrive skill to access them.
-
Detect Inventory Types: To identify framework signatures and map
them to correct Ingestors, you MUST use the framework detector agent
defined in agents/framework_detector.md. Using this agent, scan the
input files to inventory all file extensions and detect potential
frameworks. Spawn parallel Framework Detector subagents to scan
partitions of the file tree.
-
Map Ingestors: Use the scoping report generated by the Framework
Detector to select or create the correct specialized skills in
ingestors/frameworks/ or ingestors/files/.
- Precedence Rule: Framework-specific ingestors take precedence
over generic file-extension ingestors (e.g., use
ingestors/frameworks/adk/ instead of ingestors/files/py/ if both
apply).
-
Discovery: Spawn specialized expert subagents based on the discovered
types to identify sub-intents (see the agents/ directory for role
definitions). Dynamically discover and use specialized ingestor skills in
ingestors/frameworks/ and ingestors/files/.
-
Mandatory Handoff: Subagents MUST report back:
- Frameworks detected,
- File types parsed, and
- Any files/patterns skipped as out-of-scope.
-
Exhaustive Use: Use all relevant ingestors by applying the most
specific one applicable to each file.
-
Fallback: If no specialized ingestor exists for an out-of-scope file
type, the orchestrator MUST delegate the analysis:
- Spawn Analyzer: Spawn a specialized Analysis Subagent to
inspect a sample of the unknown file.
- Research: Instruct the subagent to search online or in internal
documentation for format standards if the structure is not clear.
- Report & Codify: The subagent must report the best parsing
strategy back to the orchestrator and SHOULD attempt to create a new
specialized skill in
ingestors/frameworks/ or ingestors/files/
to capture this knowledge.
-
Exhaustion: Loop until no new intents are found.
-
Clustering: Group the discovered sub-intents into Parent CUJs. To ensure
consistent and accurate category discovery:
- Noise Reduction: Do NOT pass full objects with raw transcripts or
code to the clustering agent.
- Summary Format: Provide a clean YAML list with
id, name
(stripped of technical tags), and a 1-sentence synthesized intent.
- Guidance: Instruct the agent that a reasonable number of categories
is typically between 5 and 10.
-
Execution: Generate transcripts and reports using the tools in this
directory.
- Mandatory: Limit batch sizes to 5-10 items per subagent to prevent
LLM context exhaustion and truncation.
- Title Synthesis: For each transcript, the agent MUST synthesize a
short, human-readable scenario title based on the dialogue content and
the title of the CUJ and store it in the
subintent_name field, rather
than using raw technical IDs.
- Immediate ID Verification: Always assume that sensitive numbers like
Account Number or Order ID are checked in a backend system immediately
after being provided by the user, and insert a
webhook_call or
tool_call accordingly.
- User-First Transcripts: Transcripts MUST always start with a User
utterance. Trim any leading Agent greetings or prompts from the
beginning of the transcript.
- Dual Reports: The agent MUST generate both a CUJ report (limiting
examples to at most 3) AND a comprehensive full report (including all
examples).
- Usage: Run
construct_report.py with --cuj_report=True to
generate the CUJ report, and with --cuj_report=False to generate the
comprehensive full report.
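Three of the workflow rules above (ingestor precedence, batch sizing, and user-first transcripts) are mechanical enough to sketch in code. The helper names below are hypothetical and are not part of the tools in this directory; the `speaker` key on each turn is likewise an assumed field name.

```python
def select_ingestor(framework_ingestor, file_ingestor):
    """Precedence rule: prefer the framework-specific ingestor when one applies."""
    return framework_ingestor or file_ingestor


def make_batches(items, batch_size=5):
    """Split items into batches of 5-10 per subagent (the last batch may be smaller)."""
    if not 5 <= batch_size <= 10:
        raise ValueError("batch_size must be between 5 and 10")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def trim_leading_agent_turns(turns):
    """User-first rule: drop every turn before the first User utterance."""
    for index, turn in enumerate(turns):
        if turn.get("speaker") == "User":
            return turns[index:]
    return []  # no User turn at all, so nothing valid remains


# Example usage with illustrative data.
print(select_ingestor("ingestors/frameworks/adk/", "ingestors/files/py/"))
print([len(batch) for batch in make_batches(list(range(12)))])
```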
Autonomous Execution Guardrails
By default, this workflow is long-running and requires autonomous execution. You
MUST follow these guardrails:
- Automatic Watchdog: Upon starting the task, you MUST automatically
schedule a recurring timer (e.g., every 5 minutes using the
schedule tool)
to interrupt and check for stuck subagents or tasks.
- Initial Confirmation: In your very first response to the user, you MUST
explicitly state that you are applying the Robust Extraction Protocol and
that you have set a watchdog timer.
- Dynamic Bisecting: If a batch fails the Verification Gate twice due to
missing items, automatically bisect the batch and spawn two parallel
subagents to handle the smaller load.
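The Dynamic Bisecting guardrail can be sketched as follows. The function names and the `failures` counter are assumptions made for illustration; how the orchestrator tracks Verification Gate failures is not specified here.

```python
def bisect_batch(batch):
    """Split a batch into two halves of near-equal size."""
    if len(batch) < 2:
        raise ValueError("cannot bisect a batch with fewer than two items")
    mid = len(batch) // 2
    return batch[:mid], batch[mid:]


def next_batches(batch, failures):
    """Return the batches to run next, given this batch's gate failure count.

    After two failures the batch is bisected so that two parallel subagents
    each handle a smaller load; otherwise the batch is retried unchanged.
    """
    if failures >= 2 and len(batch) > 1:
        return list(bisect_batch(batch))
    return [batch]
```

A single-item batch cannot be split further, so in this sketch it is simply retried as-is.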
Core Schema
All generated transcripts MUST adhere to the
resources/schemas/transcript_schema.yml contract:
subintent_id: A unique slug.
subintent_name: Human-readable name.
parent_cuj: The high-level category.
turns: A list of dialogue objects.
Dialogue Turn Requirements
- Speaker: Must be either
Agent or User. Ensure that any function-call
turn comes immediately after a User turn.
- Text: The literal string spoken.
- Enrichment:
intent_detected: Specify the NLU intent if applicable.
tool_call: Use when the agent invokes a local function.
webhook_call: Use when the agent triggers an external API.
system_action: Use for state transitions or background logic.
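A minimal validation sketch for the requirements above is shown below. It is not a substitute for resources/schemas/transcript_schema.yml. The lowercase `speaker` key is an assumed field name, and treating any turn that carries `tool_call` or `webhook_call` as a "function-call turn" is one interpretation of the rule.

```python
REQUIRED_FIELDS = ("subintent_id", "subintent_name", "parent_cuj", "turns")
SPEAKERS = {"Agent", "User"}
CALL_KEYS = ("tool_call", "webhook_call")  # assumed markers of a function-call turn


def validate_transcript(transcript):
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in transcript]
    turns = transcript.get("turns") or []
    if turns and turns[0].get("speaker") != "User":
        errors.append("transcript must start with a User turn")
    previous_speaker = None
    for index, turn in enumerate(turns):
        speaker = turn.get("speaker")
        if speaker not in SPEAKERS:
            errors.append(f"turn {index}: speaker must be Agent or User")
        # A function-call turn must directly follow a User turn.
        if any(key in turn for key in CALL_KEYS) and previous_speaker != "User":
            errors.append(f"turn {index}: function call must follow a User turn")
        previous_speaker = speaker
    return errors
```

Running a check like this before a batch is marked verified would catch malformed transcripts early, before construct_report.py is run.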
Execution Phase Details
During the Execution phase, subagents MUST NOT write directly to the
transcript files. Instead, for each turn:
- Generate a small YAML file containing the data for a single turn.
- Pass it to the
append_turn.py script to build the transcript
incrementally.
- Once all batches are verified, run
construct_report.py to generate the
final interactive HTML report.
Mandatory Subagent Prompting: When spawning subagents for batch execution,
the orchestrator MUST include this instruction in their prompt:
"You must use append_turn.py for every turn. Do not summarize the dialogue.
Generate a full, natural conversation for every item in your batch."