with one click
spreadsheet-validated-exec
Execute Python scripts for spreadsheets with prerequisite data validation and source accessibility checks
Menu
Execute Python scripts for spreadsheets with prerequisite data validation and source accessibility checks
Delegate tasks to OpenSpace — a full-stack autonomous worker for coding, DevOps, web research, and desktop automation, backed by an extensive MCP tool and skill library. Skills auto-improve through use, reducing token consumption over time. A cloud community lets agents share and collectively evolve reusable skills.
Incremental audio production with duration mismatch handling, adaptive stem extension, and pre-mix alignment verification
Audio production with diagnostic analysis, timecode parsing from documents, and verified export workflow
Incremental audio production with duration alignment handling, per-stem verification, and adaptive extension strategies
Step-by-step audio production with per-stem verification, timing alignment, and incremental quality gates
End-to-end audio production workflow with stems, effects, archiving, and verification
| name | spreadsheet-validated-exec |
| description | Execute Python scripts for spreadsheets with prerequisite data validation and source accessibility checks |
This skill extends direct Python execution for spreadsheet operations by adding a mandatory data validation phase before any processing begins. This prevents wasted iterations on inaccessible data sources and provides clear error documentation when data cannot be accessed.
Use validated direct run_shell with Python scripts for spreadsheet operations when:
openpyxl, pandas, or similar librariesBeyond standard shell_agent limitations, unvalidated data access causes:
Direct run_shell with Python validation is more reliable because it:
.py files first avoids shell_agent parsing issues with heredocsBefore writing any spreadsheet processing code, verify your data sources are accessible:
import os
import requests
from pathlib import Path
# For local files
def verify_local_file(file_path):
path = Path(file_path)
if not path.exists():
raise FileNotFoundError(f"Data file not found: {file_path}")
if not path.is_file():
raise ValueError(f"Path is not a file: {file_path}")
if path.stat().st_size == 0:
raise ValueError(f"Data file is empty: {file_path}")
print(f"✓ Local file verified: {file_path} ({path.stat().st_size} bytes)")
return True
# For remote URLs
def verify_remote_url(url, timeout=30):
try:
# Try HEAD request first (lighter than GET)
response = requests.head(url, timeout=timeout, allow_redirects=True)
if response.status_code == 405: # HEAD not allowed, try GET
response = requests.get(url, timeout=timeout, stream=True)
response.raise_for_status()
# Check content type if available
content_type = response.headers.get('Content-Type', '')
if 'text/html' in content_type and 'excel' not in url.lower():
print(f"⚠ Warning: URL may return HTML, not data file")
print(f"✓ Remote URL verified: {url} (Status: {response.status_code})")
return True
except requests.exceptions.SSLError as e:
print(f"✗ SSL Error: {str(e)}")
return False
except requests.exceptions.ConnectionError as e:
print(f"✗ Connection Error: {str(e)}")
return False
except requests.exceptions.Timeout as e:
print(f"✗ Timeout Error: {str(e)}")
return False
except Exception as e:
print(f"✗ Verification Failed: {str(e)}")
return False
If primary data source is unavailable:
Check alternative locations:
Document the failure:
def log_access_failure(source, error_type, timestamp=None):
from datetime import datetime
if not timestamp:
timestamp = datetime.now().isoformat()
error_log = {
'timestamp': timestamp,
'source': source,
'error_type': error_type,
'alternatives_attempted': [],
'resolution': 'pending'
}
# Save to error log file
import json
with open('data_access_errors.json', 'a') as f:
f.write(json.dumps(error_log) + '\n')
return error_log
Implement fallback strategy:
def get_data_with_fallback(primary_source, fallback_sources):
sources = [primary_source] + fallback_sources
for i, source in enumerate(sources):
print(f"Attempting source {i+1}/{len(sources)}: {source}")
if source.startswith('http'):
if verify_remote_url(source):
return download_data(source)
else:
if verify_local_file(source):
return load_local_data(source)
print(f"Source {source} unavailable, trying next...")
raise Exception(f"All {len(sources)} data sources unavailable")
Before proceeding to spreadsheet operations, confirm:
If any check fails: Do NOT proceed to spreadsheet processing. Report the blocking issue and either:
Recommended Pattern: Validate → Process → Report
# Step 1: Write validation + processing script to file
cat > validated_spreadsheet_process.py << 'EOF'
import sys
import os
from pathlib import Path
import pandas as pd
from openpyxl import load_workbook
# === PHASE 0: VALIDATION ===
def validate_sources():
sources_to_check = [
('input_data.xlsx', 'local'),
# ('https://backup-source.com/data.xlsx', 'remote')
]
validated_sources = []
for source, source_type in sources_to_check:
try:
if source_type == 'local':
if not Path(source).exists():
print(f"ERROR: Local file not found: {source}")
continue
if Path(source).stat().st_size == 0:
print(f"ERROR: Local file is empty: {source}")
continue
validated_sources.append(source)
print(f"✓ Validated: {source}")
except Exception as e:
print(f"ERROR validating {source}: {e}")
if not validated_sources:
print("FATAL: No valid data sources available. Aborting.")
sys.exit(1)
return validated_sources
# === PHASE 1: PROCESSING ===
def process_data(source_file):
# Your spreadsheet operations here
pass
# === PHASE 2: REPORTING ===
def generate_report(success, details):
report = {
'status': 'success' if success else 'failed',
'details': details
}
print(f"Report: {report}")
return report
if __name__ == '__main__':
try:
# Validate first
sources = validate_sources()
# Process validated sources
for source in sources:
process_data(source)
generate_report(True, f"Processed {len(sources)} sources")
except Exception as e:
generate_report(False, str(e))
sys.exit(1)
EOF
# Step 2: Execute the validated script
python3 validated_spreadsheet_process.py
For complex multi-line scripts with multiple data sources:
# Step 1: Write the Python script to a file
cat > process_spreadsheet.py << 'EOF'
import openpyxl
from openpyxl import Workbook
from pathlib import Path
import sys
# Validate first
input_file = Path('file.xlsx')
if not input_file.exists():
print(f"ERROR: Input file not found: {input_file}")
sys.exit(1)
if input_file.stat().st_size == 0:
print(f"ERROR: Input file is empty: {input_file}")
sys.exit(1)
# Your spreadsheet code here
wb = openpyxl.load_workbook('file.xlsx')
# ... operations ...
wb.save('output.xlsx')
print('Success')
EOF
# Step 2: Execute the script
python3 process_spreadsheet.py
For short scripts when NOT using shell_agent, include validation inline:
# Include quick validation before processing
python3 << 'EOF'
import os
import sys
# Quick validation
input_file = 'data.xlsx'
if not os.path.exists(input_file):
print(f"ERROR: {input_file} not found")
sys.exit(1)
if os.path.getsize(input_file) == 0:
print(f"ERROR: {input_file} is empty")
sys.exit(1)
# Continue with processing...
EOF
With prerequisite validation:
import pandas as pd
import sys
from pathlib import Path
# Validate input file exists and has content
input_path = Path('input.xlsx')
if not input_path.exists():
print(f"ERROR: Input file not found: {input_path}")
sys.exit(1)
if input_path.stat().st_size == 0:
print(f"ERROR: Input file is empty: {input_path}")
sys.exit(1)
# Now safe to process
df = pd.read_excel('input.xlsx', sheet_name='Revenue')
# Apply transformations
df['Net_Revenue'] = df['Gross_Revenue'] * (1 - df['Tax_Rate'])
# Save results
df.to_excel('output.xlsx', index=False, sheet_name='Processed')
from openpyxl import load_workbook
import sys
from pathlib import Path
# Validate workbook accessibility
wb_path = Path('tour_data.xlsx')
if not wb_path.exists():
print(f"ERROR: Workbook not found: {wb_path.absolute()}")
sys.exit(1)
try:
wb = load_workbook('tour_data.xlsx')
except Exception as e:
print(f"ERROR: Cannot open workbook: {e}")
print("Possible causes: file locked, corrupted, or wrong format")
sys.exit(1)
# Iterate through sheets
for sheet_name in wb.sheetnames:
ws = wb[sheet_name]
# Apply formatting or calculations
for row in ws.iter_rows(min_row=2, max_col=5):
# Process cells
pass
wb.save('tour_data_processed.xlsx')
from openpyxl import load_workbook
from openpyxl.styles import Font, PatternFill, Alignment
from pathlib import Path
import sys
# Pre-check file
report_path = Path('report.xlsx')
if not report_path.exists():
print(f"ERROR: Report file not found: {report_path}")
sys.exit(1)
wb = load_workbook('report.xlsx')
ws = wb.active
# Apply header styling
header_fill = PatternFill(start_color='4472C4', fill_type='solid')
header_font = Font(bold=True, color='FFFFFF')
for cell in ws[1]:
cell.fill = header_fill
cell.font = header_font
cell.alignment = Alignment(horizontal='center')
wb.save('report_formatted.xlsx')
import sys
import json
from datetime import datetime
from pathlib import Path
from openpyxl import load_workbook
def log_error(source, error_msg, error_type):
"""Log errors for later analysis and reporting"""
error_record = {
'timestamp': datetime.now().isoformat(),
'source': source,
'error_type': error_type,
'message': error_msg
}
with open('processing_errors.log', 'a') as f:
f.write(json.dumps(error_record) + '\n')
return error_record
try:
# Validate file before opening
if not Path('data.xlsx').exists():
log_error('data.xlsx', 'File not found', 'ValidationError')
raise FileNotFoundError('data.xlsx')
wb = load_workbook('data.xlsx')
ws = wb.active
# Your operations here
value = ws['A1'].value
wb.save('output.xlsx')
print(f"Success: Processed {ws.max_row} rows")
except FileNotFoundError as e:
log_error('data.xlsx', str(e), 'FileNotFoundError')
print(f"ERROR: {str(e)}", file=sys.stderr)
print("ACTION: Verify file path and permissions")
sys.exit(1)
except PermissionError as e:
log_error('data.xlsx', str(e), 'PermissionError')
print(f"ERROR: Permission denied - file may be open in another application")
print("ACTION: Close the file in other applications and retry")
sys.exit(1)
except Exception as e:
log_error('data.xlsx', str(e), type(e).__name__)
print(f"ERROR: {str(e)}", file=sys.stderr)
sys.exit(1)
.py file first, then execute via run_shellImmediate actions:
For temporary failures (timeouts, 5xx errors):
def retry_with_backoff(operation, max_retries=3, base_delay=1):
import time
for attempt in range(max_retries):
try:
return operation()
except (TimeoutError, ConnectionError) as e:
if attempt == max_retries - 1:
raise
delay = base_delay * (2 ** attempt)
print(f"Retry {attempt + 1}/{max_retries} after {delay}s delay")
time.sleep(delay)
For permanent failures (file not found, 404):
def report_data_access_issue(issue_type, source, details, alternatives_tried=None):
"""Standardize error reporting for data access failures"""
from datetime import datetime
report = f"""
=== DATA ACCESS FAILURE REPORT ===
Type: {issue_type}
Source: {source}
Time: {datetime.now().isoformat()}
Details: {details}
Alternatives Attempted: {alternatives_tried or 'None'}
Recommended Action: {get_recommended_action(issue_type)}
==================================
"""
print(report)
return report
def get_recommended_action(issue_type):
actions = {
'FileNotFound': 'Verify file path, check if file was moved or deleted',
'ConnectionError': 'Check network connectivity, try alternative source',
'Timeout': 'Increase timeout, check source server status',
'PermissionError': 'Check file permissions, ensure file not open elsewhere',
'SSLError': 'Verify SSL certificate, try HTTP if appropriate',
'default': 'Review error details, check source availability'
}
return actions.get(issue_type, actions['default'])
| Library | Best For | Also Use For |
|---|---|---|
requests | Web requests | Source URL validation, health checks |
openpyxl | Reading/writing .xlsx files, formatting, formulas | - |
pandas | Data manipulation, analysis, merging datasets | Chunked reading for large files |
xlrd | Reading older .xls files (read-only) | - |
xlsxwriter | Creating new .xlsx files with advanced formatting | - |
Issue: Data source unavailable / file not found
Issue: Source was accessible yesterday but not today
Issue: Validation passes but processing fails
Issue: Heredoc syntax fails with 'unknown error' when using shell_agent
.py file first with full validation logic, then execute with python3 script.py. This is significantly more reliable than inline heredoc when shell_agent is the executor.Issue: FileNotFoundError (after validation passed)
Issue: PermissionError
lsof | grep filenameIssue: ConnectionError / Timeout on remote sources
Issue: SSL Certificate errors
requests.get(url, verify=False) - but log this as a security concernIssue: All alternative sources fail
Issue: MemoryError on large files
chunksize parameter. Validate chunk size during initial validation phase.Issue: Formatting not applying
.copy() for style objects. Verify workbook is not in read-only mode.