| name | xlsx |
| description | Use this skill to author or edit Microsoft Excel .xlsx files. Covers building workbooks with formulas, editing existing files, recalculating formulas, and scanning for #REF!/#DIV/0!/#VALUE! errors. For READING existing .xlsx files, use the harness `read` tool — do not invoke this skill. Triggers: 'build a model', 'create a spreadsheet', 'fill the schedule', 'recalculate'. Does NOT apply to .pdf, .docx, .pptx, or .xls (legacy Excel). |
XLSX authoring and editing
Reading is not in scope. To read an existing .xlsx, use the harness read tool (pandas extracts every sheet as a markdown table). This skill is for writing, editing, and recalculating.
Quick reference
| Goal | Use |
|---|
| Build a workbook from scratch | openpyxl directly, or scripts/build_workbook.py for banker conventions |
| Edit cells in an existing file | openpyxl.load_workbook(...) → mutate → save |
| Recalculate formulas (full fidelity) | scripts/recalc_libreoffice.py |
| Recalculate formulas (no LibreOffice) | scripts/recalc_pure_python.py |
| Scan for formula errors | scripts/scan_errors.py |
| Validate before delivery | scripts/validate.py |
Banker conventions (mandatory for financial models)
Apply these to every workbook unless the task explicitly overrides:
- Inputs are blue, formulas are black, cross-sheet references are green, external links are red. Use
Font(color='0000FF') etc.
- Negatives in parentheses, not minus signs. Use number format
#,##0;(#,##0).
- Red negatives in P&L tables. Use
#,##0;[Red](#,##0).
- Accounting format for currency.
_-* #,##0_-;-* #,##0_-;_-* "-"_-;_-@_- (or the localized equivalent).
- Multiples shown as
0.0x, not 0.0 followed by an "x" character. Format: 0.0"x".
- Underline-only on totals, not bold-and-underline. Use
Border(bottom=Side(style='thin')).
- No merged cells in input ranges. Merged cells break formulas that reference them; reserve merging for headers and titles only.
- Units in adjacent cells, not in the cell with the value.
($M) next to the value, not "$1,234M" as a string.
scripts/build_workbook.py applies these conventions automatically given a JSON spec.
Formula authoring
- Always emit formulas, never calculated values. If the user wants
revenue × growth, write =B2*C2, not 1234.56. The recalc step materializes values.
- Use named ranges for cross-sheet inputs.
wb.defined_names["assumptions"] = DefinedName(...). Easier to audit.
- Document units in adjacent cells so the model is self-explanatory.
- No volatile functions in hot paths.
OFFSET, INDIRECT, NOW, TODAY recalculate on every change and slow large workbooks.
Recalculation — choose your engine
openpyxl writes formula strings; it does not evaluate them. You must recalculate before delivery, otherwise consumers will see =B2*C2 literal text where they expect numbers (in some readers) or stale cached values (in others).
LibreOffice path (recalc_libreoffice.py) — ground truth:
python scripts/recalc_libreoffice.py input.xlsx output.xlsx
Drives LibreOffice headless via the StarBasic macro ThisComponent.calculateAll(); ThisComponent.store(). Slow (~5–10s per workbook) but matches Excel for nearly every function. Use this when the workbook contains modern Excel features.
Pure-Python path (recalc_pure_python.py) — fast, partial:
python scripts/recalc_pure_python.py input.xlsx output.xlsx
Uses xlcalculator to evaluate every formula in pure Python. Fast (~0.5s per workbook). Covers ~80% of common functions: arithmetic, SUM, IF, VLOOKUP, INDEX/MATCH, basic string/date functions.
Does NOT support: XLOOKUP, LET, dynamic arrays (FILTER, SEQUENCE, UNIQUE), LAMBDA, BYROW, TEXTJOIN with refs, structured table references, most modern (post-2019) Excel features.
If you used any of those, run the LibreOffice path. The pure-Python path is for CI environments without LibreOffice.
Error scan
After every recalc, scan for formula errors:
python scripts/scan_errors.py output.xlsx > errors.json
Reports every cell whose computed value matches #REF!, #DIV/0!, #VALUE!, #NAME?, #NULL!, #NUM!, or #N/A. Output is JSONL with {sheet, address, value} per line.
If errors exist, fix and re-recalc. Don't ship a workbook with #REF!s — it's the most common reason a deliverable fails QA.
Validation gate
scripts/validate.py output.xlsx schema-validates against ECMA-376 SpreadsheetML XSDs and confirms ZIP integrity, content-type registration, sheet relationships.
Out of scope
- Reading: use the
read tool.
- PivotTables —
openpyxl round-trips existing pivots but cannot create or modify them. If a task requires pivot creation, escalate (Windows-only via COM, not portable).
- DAX measures and Power Pivot —
openpyxl can't write these. Same limitation.
- VBA macros (
.xlsm) — out of scope for this skill. Macros require Excel runtime.
- Conditional formatting beyond simple cell-value rules —
openpyxl supports basic CF; complex rules (top-N, data bars across sheets) often fail to round-trip.
- Charts beyond the python-pptx-equivalent set — line/bar/scatter work; combo charts and trendlines round-trip unreliably.