name	architect
description	Expert guidance for GabeDA v2.1 architecture (34 modules) - implementing models, features, debugging 4-case logic, and maintaining the /src codebase.
version	2.1.0

GabeDA Architecture Expert

Purpose

This skill provides expert guidance for the GabeDA v2.1 refactored architecture. It focuses on implementing models, adding features, debugging execution logic, and maintaining architectural principles across the 34-module /src codebase.

Core Expertise:

/src architecture (34 modules in 6 packages)
4-case logic execution engine
Feature implementation (filters, attributes, aggregations)
Dependency resolution and data flow
External data integration patterns
Frontend development (React + TypeScript + Vite)
Testing strategies and validation

When to Use This Skill

Invoke this skill when:

Working with the /src refactored codebase (v2.1)
Implementing new aggregation models (daily, weekly, monthly, customer, product)
Adding filters, attributes, or computed features
Debugging 4-case logic execution issues
Configuring external data joins
Developing frontend features (React + TypeScript + Vite)
Troubleshooting blank pages or HMR issues
Understanding data flow and persistence strategies
Troubleshooting column naming or dependency resolution
Ensuring architectural principles are maintained
Creating tests in /test folder

NOT for: Business strategy, marketing content, data analysis notebooks (delegate to business, marketing, insights skills)

Quick Start

Essential Documents:

📚 Feature Implementation Guide - PRIMARY GUIDE for implementation
📖 Documentation Master Index - Central hub for all documentation
🧪 Test Manifest - Complete test catalog (197 tests)
📝 Documentation Guidelines - Before creating any docs

Key References:

references/module_reference.md - 34 modules structure
references/4_case_logic.md - Critical execution engine
references/external_data_integration.md - Column naming rules

Core Architecture Overview

Module Structure (v2.1)

34 modules in 6 packages following Single Responsibility Principle:

src/
├── utils/         # Utilities (7 modules) - 88 tests, 92% coverage
├── core/          # Core infrastructure (5 modules)
├── preprocessing/ # Data preparation (5 modules)
├── features/      # Feature management (4 modules)
├── execution/     # Feature computation (5 modules) - Includes 4-case logic
└── export/        # Output generation (2 modules)

For complete module details: See references/module_reference.md

Data Flow Pipeline

CSV → DataLoader → SchemaProcessor → SyntheticEnricher →
FeatureStore → DependencyResolver → ModelExecutor → ExcelExporter

For detailed flow stages: See references/data_flow_pipeline.md

Critical: 4-Case Logic

The GroupByProcessor (src/execution/groupby.py) implements single-loop execution with 4 cases:

Case 1: Standard filter (reads data_in only)
Case 2: Filter using attributes (reads data_in + agg_results); this is the key innovation
Case 3: Attribute with aggregation
Case 4: Attribute composition (uses only other attributes)

Case 2 Example:

def price_above_avg(price_total: float, prod_price_avg: float) -> bool:
    """Filter that uses an attribute as input"""
    return price_total > prod_price_avg

For deep dive: See references/4_case_logic.md
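
The four cases can be told apart from a feature's kind and the names of its arguments. The following is a minimal sketch under stated assumptions, not the actual GroupByProcessor code: `classify_case`, `price_is_positive`, `prod_price_range`, and the `ATTRIBUTES` set are hypothetical names, and it assumes each argument name refers either to an input column or to a registered attribute.

```python
import inspect
from typing import Callable, Set


def classify_case(func: Callable, kind: str, attribute_names: Set[str]) -> int:
    """Return which of the 4 cases a feature falls into (illustrative only).

    kind is "filter" or "attribute"; attribute_names holds the names of all
    registered attributes. Both are assumptions made for this sketch.
    """
    args = list(inspect.signature(func).parameters)
    attr_args = [a for a in args if a in attribute_names]
    if kind == "filter":
        # Case 2 when at least one argument is an attribute, else Case 1
        return 2 if attr_args else 1
    if kind == "attribute":
        # Case 4 only when every argument is itself an attribute
        return 4 if args and len(attr_args) == len(args) else 3
    raise ValueError(f"Unknown feature kind: {kind!r}")


def price_is_positive(price_total: float) -> bool:
    return price_total > 0


def price_above_avg(price_total: float, prod_price_avg: float) -> bool:
    return price_total > prod_price_avg


def prod_price_avg(price_total: list) -> float:
    # Assumes an attribute receives the group's values as a sequence
    return sum(price_total) / len(price_total)


def prod_price_range(prod_price_max: float, prod_price_min: float) -> float:
    return prod_price_max - prod_price_min


ATTRIBUTES = {"prod_price_avg", "prod_price_max", "prod_price_min"}

print(classify_case(price_is_positive, "filter", ATTRIBUTES))    # 1
print(classify_case(price_above_avg, "filter", ATTRIBUTES))      # 2
print(classify_case(prod_price_avg, "attribute", ATTRIBUTES))    # 3
print(classify_case(prod_price_range, "attribute", ATTRIBUTES))  # 4
```

The real engine may use different signals to distinguish the cases; references/4_case_logic.md remains the authoritative description.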

Core Workflows

Workflow 1: Implementing a New Model

When creating daily, weekly, monthly, customer, or product aggregation models:

Read primary guide - Feature Implementation Guide
Define features - Create filter and attribute functions with type hints
Create features dictionary - Register all features
Configure model - Set group_by, external_data, output_cols
Verify naming - Check external column prefixes (join keys NOT prefixed, others ARE)
Test execution - Verify output shapes and values
Create tests - Add repeatable tests in /test folder

Detailed guide: assets/examples/implementing_new_model.md

Working examples:

02_1_week.ipynb - Weekly model with external data
01_1_1_day.ipynb - Daily aggregation
03_consolidated_all_models.ipynb - 9-model pipeline

Workflow 2: Adding a New Feature

When adding filters (row-level) or attributes (aggregated):

Define function - Include type hints and docstring
Determine type - Filter (vectorized) or attribute (aggregated)?
Register in dictionary - Add to features dict
Check dependencies - Ensure resolvable via DFS
Verify external data - If using, check column naming
Update model config - Add to output_cols
Create tests - Add to /test folder with sample data

Detailed guide: assets/examples/adding_new_feature.md

For feature type details: See references/feature_types.md
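
The "check dependencies" step relies on features being orderable by depth-first search. The sketch below shows one way such a resolution could work, assuming dependencies are inferred from argument names; `resolve_order` is a hypothetical helper, not the project's DependencyResolver API.

```python
import inspect
from typing import Callable, Dict, List


def resolve_order(features: Dict[str, Callable]) -> List[str]:
    """Order features so each one comes after the features it depends on."""
    order: List[str] = []
    state: Dict[str, str] = {}  # name -> "visiting" or "done"

    def visit(name: str) -> None:
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise ValueError(f"Circular dependency involving '{name}'")
        state[name] = "visiting"
        for arg in inspect.signature(features[name]).parameters:
            # Arguments that are not registered features are assumed
            # to be plain input columns and need no ordering
            if arg in features:
                visit(arg)
        state[name] = "done"
        order.append(name)

    for name in features:
        visit(name)
    return order


def price_above_avg(price_total: float, prod_price_avg: float) -> bool:
    return price_total > prod_price_avg


def prod_price_avg(price_total: list) -> float:
    return sum(price_total) / len(price_total)


features = {"price_above_avg": price_above_avg, "prod_price_avg": prod_price_avg}
print(resolve_order(features))  # ['prod_price_avg', 'price_above_avg']
```

A feature whose dependency chain loops back on itself raises a `ValueError` here, which is the situation the "ensure resolvable via DFS" check is meant to catch.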

Workflow 3: Configuring External Data

When joining external datasets (daily → weekly, customer → product):

Verify dataset exists - Check ctx.list_datasets()
Configure in model - Add external_data section with source, join_on, columns
Remember naming rules:
- Join keys: NOT prefixed (e.g., dt_date stays dt_date)
- Regular columns: ARE prefixed (e.g., price_total_sum → daily_attrs_price_total_sum)
Write feature functions - Use correct prefixed names
Test join - Verify merged data has expected columns

Critical naming table:

Column Type	Original	After Merge	Prefixed?
Join key	`dt_date`	`dt_date`	❌ NO
Regular column	`price_total_sum`	`daily_attrs_price_total_sum`	✅ YES

Detailed guide: assets/examples/configuring_external_data.md

For complete naming rules: See references/external_data_integration.md
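
The naming rule in the table above can be expressed as a small function. This is a sketch only: `merged_column_names` is a hypothetical helper, and it assumes the prefix is the external dataset name followed by an underscore, as the table's example suggests.

```python
from typing import Dict, List


def merged_column_names(columns: List[str], join_on: List[str], source: str) -> Dict[str, str]:
    """Map each external column to the name it is expected to have after the merge."""
    return {
        # Join keys keep their name; every other column gets the source prefix
        col: col if col in join_on else f"{source}_{col}"
        for col in columns
    }


names = merged_column_names(
    columns=["dt_date", "price_total_sum"],
    join_on=["dt_date"],
    source="daily_attrs",
)
print(names)
# {'dt_date': 'dt_date', 'price_total_sum': 'daily_attrs_price_total_sum'}
```

Feature functions that read external data would then use the right-hand names as their argument names.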

Workflow 4: Debugging Execution Issues

When encountering errors during model execution:

Check error message - "Argument 'X' not found" is the most common
Verify column naming - Join keys are not prefixed; regular external columns are
Validate external data - Check dataset exists and join_on matches
Print available columns - Use ctx.get_dataset('name').columns.tolist()
Test incrementally - Add features one at a time
Check dependencies - Ensure DFS can resolve order

Common error: "Argument not found"

Causes:

External column referenced with the wrong prefix (join keys have none, regular columns do)
Missing external data config
Typo in column name

Solution:

# 1. Check input dataset
print(ctx.get_dataset('transactions_filters').columns.tolist())

# 2. Check external dataset
print(ctx.get_dataset('daily_attrs').columns.tolist())

# 3. Remember: Join keys NO prefix, others WITH prefix

For complete troubleshooting: See references/troubleshooting.md
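
When the column lists are long, comparing them against a feature's arguments by eye is error-prone. The sketch below automates that comparison using only the standard library; `diagnose_arguments` and `weekly_revenue` are hypothetical names, and the `available` list stands in for the output of `ctx.get_dataset(...).columns.tolist()`.

```python
import difflib
import inspect
from typing import Callable, Dict, List


def diagnose_arguments(func: Callable, available: List[str]) -> Dict[str, List[str]]:
    """Return each argument missing from `available`, with similar column names."""
    missing: Dict[str, List[str]] = {}
    for arg in inspect.signature(func).parameters:
        if arg not in available:
            # Close matches often reveal a missing prefix or a typo
            missing[arg] = difflib.get_close_matches(arg, available, n=3, cutoff=0.6)
    return missing


def weekly_revenue(price_total_sum: float) -> float:
    # Wrong on purpose: the external column should carry its prefix
    return price_total_sum


available = ["dt_date", "daily_attrs_price_total_sum"]
print(diagnose_arguments(weekly_revenue, available))
# {'price_total_sum': ['daily_attrs_price_total_sum']}
```

An empty result means every argument is present in the column list, so the cause is more likely a missing external data config than a naming problem.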

Workflow 5: Frontend Development (React/Vite)

CRITICAL: Always clean dev environment BEFORE starting new features

When working on frontend features (GabeDA Dashboard - React + TypeScript + Vite):

Step 0: Clean Dev Environment (MANDATORY)

# Kill all node processes to avoid port conflicts and HMR corruption
# (Git Bash syntax: the doubled slashes stop MSYS from rewriting /F and /IM as paths)
cd C:/Projects/play/gabeda_frontend
taskkill //F //IM node.exe

# Clear Vite cache
rm -rf node_modules/.vite

# Start fresh dev server
npm run dev

Why This Matters:

Problem: Multiple Vite dev servers (each with its own HMR state) can run simultaneously on different ports (5173, 5174, 5175...)
Symptom: Blank pages, "module does not provide export" errors, stuck or corrupted state
Root Cause: Old dev servers hold a corrupted module cache, and the new server starts on a different port
Solution: Kill ALL node processes before starting work

Quick Fix Script: Use restart-dev.bat in frontend folder:

@echo off
REM cmd.exe takes single-slash switches; the // form is only needed in Git Bash
taskkill /F /IM node.exe 2>nul
if exist node_modules\.vite rmdir /s /q node_modules\.vite
npm run dev

Development Workflow:

CLEAN - Run restart-dev.bat or kill node processes manually
BRANCH - Create feature branch (git checkout -b feature/feature-name)
IMPLEMENT - Make code changes
BUILD - Run npm run build to check for TypeScript errors
TEST - Test locally on http://localhost:5173 (verify correct port!)
E2E - Use Playwright skill for automated testing
COMMIT - Only after local verification passes
DEPLOY - Merge to main → auto-deploy to Render

Common Issues:

Blank pages on all routes → Multiple Vite instances running, kill all node processes
"Module does not provide export" errors → HMR cache corruption, clear .vite cache
Wrong port (5174, 5175 instead of 5173) → Old servers still running, kill and restart
Changes not appearing → Browser accessing old port, hard refresh (Ctrl+Shift+R) or use incognito

Port Detection:

# Check which port Vite started on (look for "Local: http://localhost:XXXX")
npm run dev

# If not 5173, there are stuck processes - kill and restart
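
Stuck servers can also be detected before starting a new one by probing the usual Vite ports. The following is a sketch using only the Python standard library; `port_is_open` is a hypothetical helper, and the 5173-5176 range is an assumption based on the ports listed above.

```python
import socket


def port_is_open(port: int, host: str = "localhost", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection tries every address the host name resolves to,
        # so both IPv4 and IPv6 listeners are found
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    busy = [port for port in range(5173, 5177) if port_is_open(port)]
    print("Ports with a listener:", busy)
    if busy:
        print("A dev server may still be running; kill node processes before restarting.")
```

Running this before `npm run dev` should report no listeners; any port listed points to a process that has to be killed first.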

Best Practices:

Always kill node processes before starting new feature work
Always verify you're accessing the correct port (check terminal output)
Always use incognito/private browsing for testing to avoid browser cache issues
Always build (npm run build) before committing to catch TypeScript errors
Never commit without local testing on the correct port

Core Principles (DO NOT BREAK)

✅ Single Responsibility - Each module does ONE thing
✅ Single Input - Each model gets exactly 1 dataframe
✅ DFS Resolution - Features auto-ordered by dependencies
✅ 4-Case Logic - Filters can use attributes as inputs
✅ Immutable Context - User config never changes during execution
✅ Save Checkpoints - Save after every major transformation
✅ Type Annotations - All functions have type hints
✅ Logging - Every module uses get_logger(name)
✅ Testing - All tests MUST be in /test folder and be repeatable

For detailed principles: See references/core_principles.md

Testing Requirements

Current Statistics:

Total Tests: 197 tests (6 integration, 108 unit, 69 validation, 14 notebook)
Code Coverage: 85% (target: ≥85%)
Test Manifest: ai/testing/TEST_MANIFEST.md ⭐ Living Document

Test Rules:

Location: All tests MUST be in /test folder
Repeatability: Tests MUST be idempotent (run multiple times, same result)
Cleanup: Tests MUST delete temp files/folders
Independence: No external state dependencies
Naming: Use test_{module_name}.py or test_{feature_name}.py
Documentation: ALWAYS append to Test Manifest

Running Tests:

pytest test/              # All tests
pytest test/unit/         # Unit tests only
pytest test/integration/  # Integration tests only
pytest test/ -v           # With verbose output

For complete testing guidelines: See references/testing_guidelines.md
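
A test that follows the rules above might look like the sketch below. It is illustrative only: `sum_price_total` is a hypothetical function under test, and the file would live at something like test/unit/test_sum_price_total.py to match the naming rule. It uses only the standard library, and pytest collects it by its `test_` prefix.

```python
import csv
import tempfile
from pathlib import Path


def sum_price_total(csv_path: Path) -> float:
    """Hypothetical function under test: sum the price_total column."""
    with csv_path.open(newline="") as handle:
        return sum(float(row["price_total"]) for row in csv.DictReader(handle))


def test_sum_price_total_is_repeatable() -> None:
    # TemporaryDirectory deletes the folder and its files on exit (cleanup rule)
    with tempfile.TemporaryDirectory() as tmp:
        csv_path = Path(tmp) / "sample.csv"
        # The test creates its own data, so it depends on no external state
        csv_path.write_text("price_total\n10.5\n4.5\n")
        # Running twice must give the same result (repeatability rule)
        assert sum_price_total(csv_path) == 15.0
        assert sum_price_total(csv_path) == 15.0
    assert not Path(tmp).exists()
```

Because the test builds and removes its own data, it can run any number of times and in any order with the same outcome.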

Configuration Patterns

Base Config:

base_cfg = {
    'input_file': 'path/to/data.csv',
    'client': 'project_name',
    'analysis_dt': 'YYYY-MM-DD',
    'data_schema': {
        'in_dt': {'source_column': 'Fecha venta', 'dtype': 'date'},
        'in_product_id': {'source_column': 'SKU', 'dtype': 'str'},
        'in_price_total': {'source_column': 'Total', 'dtype': 'float'}
    }
}
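
The data_schema section pairs each internal column name with the source column it comes from. As a sketch of how that mapping could be consumed, the hypothetical helper `source_to_internal` below derives a rename map from the config; how SchemaProcessor actually uses the schema is not shown in this document.

```python
from typing import Dict

base_cfg = {
    'input_file': 'path/to/data.csv',
    'client': 'project_name',
    'analysis_dt': 'YYYY-MM-DD',
    'data_schema': {
        'in_dt': {'source_column': 'Fecha venta', 'dtype': 'date'},
        'in_product_id': {'source_column': 'SKU', 'dtype': 'str'},
        'in_price_total': {'source_column': 'Total', 'dtype': 'float'}
    }
}


def source_to_internal(cfg: dict) -> Dict[str, str]:
    """Build a {source column: internal name} map from data_schema."""
    return {
        spec['source_column']: internal_name
        for internal_name, spec in cfg['data_schema'].items()
    }


print(source_to_internal(base_cfg))
# {'Fecha venta': 'in_dt', 'SKU': 'in_product_id', 'Total': 'in_price_total'}
```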

Model Config (With External Data):

cfg_model = {
    'model_name': 'weekly',
    'group_by': ['dt_year', 'dt_weekofyear'],
    'row_id': 'in_trans_id',
    'output_cols': list(features.keys()),
    'features': features,
    'external_data': {
        'daily_attrs': {
            'source': 'daily_attrs',
            'join_on': ['dt_date'],
            'columns': None  # None = ALL, or ['col1', 'col2']
        }
    }
}

For complete patterns: See references/configuration_patterns.md
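
A quick sanity check on a model config can catch mistakes before execution. The sketch below assumes that every output_cols entry must be a registered feature and that each external_data entry needs source, join_on, and columns; `validate_model_config` is a hypothetical helper and these rules are assumptions drawn from the config shown above, not the engine's own validation.

```python
from typing import List

REQUIRED_EXTERNAL_KEYS = {"source", "join_on", "columns"}


def validate_model_config(cfg: dict) -> List[str]:
    """Return a list of human-readable problems; empty means none were found."""
    problems: List[str] = []
    features = cfg.get("features", {})
    for col in cfg.get("output_cols", []):
        if col not in features:
            problems.append(f"output_cols entry '{col}' is not a registered feature")
    for name, ext in cfg.get("external_data", {}).items():
        missing = REQUIRED_EXTERNAL_KEYS - set(ext)
        if missing:
            problems.append(f"external_data '{name}' is missing keys: {sorted(missing)}")
        elif not ext["join_on"]:
            problems.append(f"external_data '{name}' has an empty join_on")
    return problems


example_cfg = {
    "model_name": "weekly",
    "group_by": ["dt_year", "dt_weekofyear"],
    "output_cols": ["revenue", "orders"],
    "features": {"revenue": lambda price_total: sum(price_total)},
    "external_data": {"daily_attrs": {"source": "daily_attrs", "join_on": ["dt_date"]}},
}

for problem in validate_model_config(example_cfg):
    print(problem)
# output_cols entry 'orders' is not a registered feature
# external_data 'daily_attrs' is missing keys: ['columns']
```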

Additional Resources

Reference Documentation

module_reference.md - 34 modules structure with coverage stats
data_flow_pipeline.md - 7-stage pipeline flow
4_case_logic.md - Critical execution engine ⭐ KEY INNOVATION
feature_types.md - Filters vs attributes
dependency_resolution.md - DFS traversal
configuration_patterns.md - Config templates
external_data_integration.md - Column naming rules
synthetic_enrichment.md - Auto-infer 17 columns
testing_guidelines.md - Test requirements (197 tests)
troubleshooting.md - Common error patterns
core_principles.md - 9 DO NOT BREAK rules

Implementation Examples

implementing_new_model.md - Step-by-step model creation
adding_new_feature.md - Filter and attribute addition
configuring_external_data.md - External joins
adding_aggregation_level.md - New aggregation levels

External Documentation

Feature Implementation Guide - PRIMARY REFERENCE
Documentation Master Index - All guides
Module Reference - Technical module docs
Model Specifications - Tech specs, aggregation architecture

Integration with Other Skills

From Business Skill

Receive: User stories, acceptance criteria, priority rankings, business requirements
Provide: Technical feasibility assessment, effort estimates, architecture proposals
Example: Business defines "VIP customer retention" → Architect implements RFM model

From Executive Skill

Receive: Feature requirements, quality standards, timeline constraints
Provide: Implementation plans, trade-off analysis, technical specs
Example: Executive prioritizes Chilean launch → Architect implements CLP currency support

To Insights Skill

Provide: Available features, data schema, execution capabilities
Receive: Notebook requirements, visualization needs, metric definitions
Example: Architect adds RFM model → Insights creates VIP retention notebook

To Marketing Skill

Provide: Technical capabilities, feature descriptions, performance metrics
Receive: Feature positioning requirements, technical content needs
Example: Architect implements 4-case logic → Marketing positions as "KEY INNOVATION"

Living Documents (Append Only)

When making changes, ALWAYS append to these 9 living documents:

Document	When to Use
CHANGELOG.md	After modifying any `.py` file
ISSUES.md	After fixing bugs or errors
PROJECT_STATUS.md	Weekly updates
FEATURE_IMPLEMENTATIONS.md	After implementing features
TESTING_RESULTS.md	After running tests
TEST_MANIFEST.md	When adding/modifying tests ⭐
ARCHITECTURE_DECISIONS.md	When making architectural choices
NOTEBOOK_IMPROVEMENTS.md	When improving notebooks
FUTURE_ENHANCEMENTS.md	When proposing enhancements

Documentation Workflow:

Check if change fits into one of these 9 living documents
If YES → APPEND to that document (do NOT create new file)
If NO → Check Documentation Guidelines
NEVER create documentation files without checking guidelines first

Working Directory

Architect Workspace: .claude/skills/architect/

Bundled Resources:

references/ - 11 technical reference documents (module structure, 4-case logic, external data, testing, troubleshooting, core principles)
assets/examples/ - 4 implementation guides (new model, new feature, external data, aggregation level)

Technical Documents (Create Here):

/ai/architect/ - Architecture proposals, spike results, design documents
Use descriptive names: integration_analysis.md, feature_implementation_guide.md

Context Folders (Reference as Needed):

/ai/backend/ - Django backend context
/ai/frontend/ - React frontend context
/ai/specs/ - Technical specifications (context, edge cases, feature store, model specs)

When Suggesting Changes

Always explain:

Why - Maintains architectural integrity
Which modules - Affected components
How - Fits into data flow
Where - Data persistence location
What testing - Required in /test folder
How repeatable - Test idempotency strategy

For every change:

Identify implementation files
Create corresponding test in /test folder
Ensure tests are repeatable and self-contained
Use sample data from data/tests/ when needed
Document test execution in code comments
Append to Test Manifest when adding tests

Think like an architect: Prioritize maintainability, testability, and adherence to established patterns.

Version History

v2.1.0 (2025-10-30)

Refactored to use progressive disclosure pattern
Extracted detailed content to references/ (11 files) and assets/examples/ (4 files)
Converted to imperative form (removed second-person voice)
Reduced from 576 lines to ~295 lines
Enhanced with v2.1 utils package details (7 utility modules)
Added clear workflow sections with examples

v2.0.0 (2025-10-28)

Updated for v2.1 architecture (34 modules, 6 packages)
Added comprehensive testing guidelines
Enhanced external data integration documentation

Last Updated: 2025-10-30
Architecture Version: v2.1 (34 modules in 6 packages)
Test Coverage: 197 tests, 85% coverage
Core Innovation: 4-case logic engine (filters can use attributes as inputs)
