| name | Comparative Estimation Study |
| description | Compare AI-generated estimations with actual company PDFs to identify price differences and new items. Use when users need to validate estimations against real company quotes. |
Compare AI-generated baseline estimations with actual company PDFs to identify pricing differences and detect new items.
Use this skill when the user asks about:
from typing import List


def process_comparative_study(file_urls: List[str], output_path: str) -> str:
    """
    Compare a baseline JSON estimation with company PDF quotes.

    Matching Rules:
    - Two-layer designation matching: string comparison → semantic similarity (embeddings)
    - Pre-filters: number + unit + quantity must match
    - is_new = False: for all items that exist in the baseline JSON
    - is_new = True: ONLY for PDF items whose 'number' is not in the baseline
    - Prices are updated if they differ or if the baseline has null values
    - Section totals are recalculated automatically

    Args:
        file_urls: List of URLs (the first must be the JSON baseline, the rest are PDFs)
        output_path: Path to save the output JSON

    Returns:
        str: Path to the output JSON file with is_new flags (includes a UUID in the filename)

    Raises:
        ValueError: If no JSON or no PDFs are found in the URLs
        Exception: For download or processing errors
    """
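The `is_new` and price-update rules from the docstring can be illustrated with a small sketch. This is not the skill's implementation: `merge_lines` is a hypothetical helper that keys items on `number` only and skips the designation-matching layers, and it assumes that when several PDFs price the same item, the last one wins (the source does not say how such conflicts are resolved).

```python
from typing import Any, Dict, List

Line = Dict[str, Any]


def merge_lines(baseline: List[Line], pdf_lines: List[Line]) -> List[Line]:
    """Apply the is_new and price-update rules, keyed on 'number' only (simplified)."""
    # Every baseline item is kept and flagged is_new = False.
    merged = [dict(line, is_new=False) for line in baseline]
    by_number = {line["number"]: line for line in merged}

    for pdf_line in pdf_lines:
        target = by_number.get(pdf_line["number"])
        if target is None:
            # 'number' is absent from the baseline: this is a new item.
            new_line = dict(pdf_line, is_new=True)
            merged.append(new_line)
            by_number[new_line["number"]] = new_line
            continue
        # Update prices that differ from the PDF or are null in the baseline.
        for field in ("unitPrice", "totalPrice"):
            pdf_value = pdf_line.get(field)
            if pdf_value is not None and target.get(field) != pdf_value:
                target[field] = pdf_value
    return merged


baseline = [{"number": "1.1", "unitPrice": None, "totalPrice": None}]
pdf_lines = [
    {"number": "1.1", "unitPrice": 24000, "totalPrice": 24000},
    {"number": "1.2", "unitPrice": 5000, "totalPrice": 5000},
]
merged = merge_lines(baseline, pdf_lines)
```

Here item `1.1` keeps `is_new = False` and receives the PDF prices because its baseline prices were null, while item `1.2` is appended with `is_new = True`.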
# Compare baseline JSON with company PDFs
python comparative_study.py \
  --file-urls https://storage.com/baseline.json https://storage.com/quote1.pdf https://storage.com/quote2.pdf \
  --output-path temp_files/temp_project_123/comparison_result.json
# Note: The script automatically appends a UUID to the output filename
# Example output: comparison_result_a1b2c3d4-5678-90ab-cdef-1234567890ab.json
⏱️ Important: When using this skill programmatically (not via the CLI), set the timeout to at least 3 minutes so that PDF extraction and semantic matching have enough time to finish.
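One way calling code might enforce that time budget is to run the script as a subprocess with an explicit timeout. The sketch below is an assumption about how a caller could do this, not part of the skill itself; `run_with_timeout` and `TIMEOUT_SECONDS` are hypothetical names.

```python
import subprocess
import sys
from typing import List

TIMEOUT_SECONDS = 180  # "at least 3 minutes", per the note above


def run_with_timeout(command: List[str], timeout: int = TIMEOUT_SECONDS) -> str:
    """Run a command and return its stdout.

    Raises subprocess.TimeoutExpired if the command runs longer than `timeout`
    seconds, and subprocess.CalledProcessError if it exits with a non-zero code.
    """
    completed = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout, check=True
    )
    return completed.stdout


# Example invocation (not executed here):
# run_with_timeout([
#     sys.executable, "comparative_study.py",
#     "--file-urls", "https://storage.com/baseline.json", "https://storage.com/quote1.pdf",
#     "--output-path", "temp_files/temp_project_123/comparison_result.json",
# ])
```

Setting the timeout lower than 180 seconds risks cutting off a run that is still extracting PDFs or computing embeddings.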
{
  "projectId": "123",
  "projectName": "Construction Project",
  "pricingLines": [
    {
      "type": "line",
      "number": "1.1",
      "designation": "Installation de chantier",
      "unit": "F",
      "quantity": "1",
      "unitPrice": 24000,
      "totalPrice": 24000
    }
  ]
}
Company estimate documents with pricing tables containing:
Pre-filters (must pass first):
Layer 1 - String Matching (Fast):
Layer 2 - Semantic Matching (Fallback):
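The pre-filters and the two matching layers listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the skill's code: `find_match`, `passes_prefilters`, `normalize`, and the `0.8` threshold are hypothetical, and `stand_in_similarity` uses `difflib` character overlap as a placeholder where the real skill uses embedding-based semantic similarity.

```python
from difflib import SequenceMatcher
from typing import Any, Callable, Dict, List, Optional

Line = Dict[str, Any]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace before comparing designations."""
    return " ".join(text.lower().split())


def stand_in_similarity(a: str, b: str) -> float:
    """Placeholder for embedding similarity: character overlap, not semantics."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def passes_prefilters(pdf_line: Line, base_line: Line) -> bool:
    """Pre-filters: number, unit, and quantity must all match."""
    return all(
        str(pdf_line.get(key)) == str(base_line.get(key))
        for key in ("number", "unit", "quantity")
    )


def find_match(
    pdf_line: Line,
    baseline: List[Line],
    similarity: Callable[[str, str], float] = stand_in_similarity,
    threshold: float = 0.8,  # assumed cut-off, not taken from the source
) -> Optional[Line]:
    candidates = [b for b in baseline if passes_prefilters(pdf_line, b)]

    # Layer 1: fast string comparison on normalized designations.
    for candidate in candidates:
        if normalize(candidate["designation"]) == normalize(pdf_line["designation"]):
            return candidate

    # Layer 2: similarity fallback, keeping the best candidate above the threshold.
    scored = [
        (similarity(c["designation"], pdf_line["designation"]), c) for c in candidates
    ]
    if scored:
        best_score, best = max(scored, key=lambda pair: pair[0])
        if best_score >= threshold:
            return best
    return None
```

Passing the similarity function as a parameter keeps the sketch independent of any embedding provider; a caller could swap in a function backed by real embeddings without changing the control flow.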
Updated JSON with is_new field added to each pricing line:
{
  "projectId": "123",
  "pricingLines": [
    {
      "type": "line",
      "number": "1.1",
      "designation": "Installation de chantier",
      "unit": "F",
      "quantity": "1",
      "unitPrice": 24000,
      "totalPrice": 24000,
      "is_new": false
    },
    {
      "type": "line",
      "number": "1.2",
      "designation": "Signalisation de chantier",
      "unit": "F",
      "quantity": "1",
      "unitPrice": 5000,
      "totalPrice": 5000,
      "is_new": true
    }
  ]
}
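A consumer of this output will often want to separate new items from items that were already in the baseline. The sketch below shows one way to do that; `split_by_is_new` is a hypothetical helper, and the sample data is a trimmed copy of the output shown above.

```python
import json
from typing import Any, Dict, List


def split_by_is_new(result: Dict[str, Any]) -> Dict[str, List[Dict[str, Any]]]:
    """Group pricing lines into 'new' and 'existing' using the is_new flag."""
    groups: Dict[str, List[Dict[str, Any]]] = {"new": [], "existing": []}
    for line in result.get("pricingLines", []):
        key = "new" if line.get("is_new") else "existing"
        groups[key].append(line)
    return groups


sample = json.loads("""
{
  "projectId": "123",
  "pricingLines": [
    {"type": "line", "number": "1.1", "designation": "Installation de chantier",
     "unitPrice": 24000, "totalPrice": 24000, "is_new": false},
    {"type": "line", "number": "1.2", "designation": "Signalisation de chantier",
     "unitPrice": 5000, "totalPrice": 5000, "is_new": true}
  ]
}
""")

groups = split_by_is_new(sample)
print([line["number"] for line in groups["new"]])       # ['1.2']
print([line["number"] for line in groups["existing"]])  # ['1.1']
```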
# Required for PDF extraction
LLAMA_CLOUD_API_KEY=llx-... # For LlamaExtract PDF parsing
# Required for semantic similarity matching
VOYAGE_API_KEY=pa-... # For Voyage AI embeddings
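Because both keys are required, it can help to check for them before starting a run that may take minutes. The following is a small sketch of such a pre-flight check; `missing_env_vars` and `REQUIRED_ENV_VARS` are hypothetical names, and the skill itself may validate its environment differently.

```python
import os
from typing import List, Mapping

REQUIRED_ENV_VARS = ("LLAMA_CLOUD_API_KEY", "VOYAGE_API_KEY")


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


# Typical use before calling the skill (not executed here):
# missing = missing_env_vars()
# if missing:
#     raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
```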
from comparative_study import process_comparative_study
file_urls = [
    "https://storage.com/baseline.json",
    "https://storage.com/company_quote1.pdf",
    "https://storage.com/company_quote2.pdf",
]
output_path = "temp_files/temp_project_456/comparison.json"
result = process_comparative_study(file_urls, output_path)
# Returns: "temp_files/temp_project_456/comparison_<uuid>.json"
# Example: "temp_files/temp_project_456/comparison_a1b2c3d4-5678-90ab-cdef-1234567890ab.json"
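Since the first URL must be the JSON baseline and the rest must be PDFs, a caller may want to validate the list before invoking the function. The sketch below is one possible check; `split_file_urls` is a hypothetical helper that assumes file types can be read from the URL path extension, which may not hold for every storage backend.

```python
from pathlib import PurePosixPath
from typing import List, Tuple
from urllib.parse import urlparse


def _extension(url: str) -> str:
    """Return the lowercase file extension of the URL path, ignoring query strings."""
    return PurePosixPath(urlparse(url).path).suffix.lower()


def split_file_urls(file_urls: List[str]) -> Tuple[str, List[str]]:
    """Return (baseline_url, pdf_urls), or raise ValueError if the order is wrong."""
    if not file_urls:
        raise ValueError("file_urls is empty")
    baseline, quotes = file_urls[0], file_urls[1:]
    if _extension(baseline) != ".json":
        raise ValueError("the first URL must point to the JSON baseline")
    if not quotes or any(_extension(url) != ".pdf" for url in quotes):
        raise ValueError("all URLs after the first must point to PDFs")
    return baseline, quotes


baseline_url, pdf_urls = split_file_urls([
    "https://storage.com/baseline.json",
    "https://storage.com/company_quote1.pdf",
    "https://storage.com/company_quote2.pdf",
])
```

Failing fast on a malformed list avoids spending time on downloads before the `ValueError` documented for `process_comparative_study` would surface.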