Requirements for Outputs
All Excel files
Zero Formula Errors
- - Every Excel model MUST be delivered with ZERO formula errors (#REF!, #DIV/0!, #VALUE!, #N/A, #NAME?)
Preserve Existing Templates (when updating templates)
- - Study and EXACTLY match existing format, style, and conventions when modifying files
- Never impose standardized formatting on files with established patterns
- Existing template conventions ALWAYS override these guidelines
Financial models
Color Coding Standards
Unless otherwise stated by the user or existing template
Industry-Standard Color Conventions
- - Blue text (RGB: 0,0,255): Hardcoded inputs, and numbers users will change for scenarios
- Black text (RGB: 0,0,0): ALL formulas and calculations
- Green text (RGB: 0,128,0): Links pulling from other worksheets within same workbook
- Red text (RGB: 255,0,0): External links to other files
- Yellow background (RGB: 255,255,0): Key assumptions needing attention or cells that need to be updated
Number Formatting Standards
Required Format Rules
- - Years: Format as text strings (e.g., "2024" not "2,024")
- Currency: Use $#,##0 format; ALWAYS specify units in headers ("Revenue ($mm)")
- Zeros: Use number formatting to make all zeros "-", including percentages (e.g., "$#,##0;($#,##0);-")
- Percentages: Default to 0.0% format (one decimal)
- Multiples: Format as 0.0x for valuation multiples (EV/EBITDA, P/E)
- Negative numbers: Use parentheses (123) not minus -123
Formula Construction Rules
Assumptions Placement
- - Place ALL assumptions (growth rates, margins, multiples, etc.) in separate assumption cells
- Use cell references instead of hardcoded values in formulas
- Example: Use =B5(1+$B$6) instead of =B51.05
Formula Error Prevention
- - Verify all cell references are correct
- Check for off-by-one errors in ranges
- Ensure consistent formulas across all projection periods
- Test with edge cases (zero values, negative numbers)
- Verify no unintended circular references
Documentation Requirements for Hardcodes
- - Comment or in cells beside (if end of table). Format: "Source: [System/Document], [Date], [Specific Reference], [URL if applicable]"
- Examples:
- "Source: Company 10-K, FY2024, Page 45, Revenue Note, [SEC EDGAR URL]"
- "Source: Company 10-Q, Q2 2025, Exhibit 99.1, [SEC EDGAR URL]"
- "Source: Bloomberg Terminal, 8/15/2025, AAPL US Equity"
- "Source: FactSet, 8/20/2025, Consensus Estimates Screen"
XLSX creation, editing, and analysis
Overview
A user may ask you to create, edit, or analyze the contents of an .xlsx file. You have different tools and workflows available for different tasks.
Important Requirements
LibreOffice Required for Formula Recalculation: You can assume LibreOffice is installed for recalculating formula values using the recalc.py script. The script automatically configures LibreOffice on first run
Reading and analyzing data
Data analysis with pandas
For data analysis, visualization, and basic operations, use
pandas which provides powerful data manipulation capabilities:
CODEBLOCK0
Excel File Workflows
CRITICAL: Use Formulas, Not Hardcoded Values
Always use Excel formulas instead of calculating values in Python and hardcoding them. This ensures the spreadsheet remains dynamic and updateable.
❌ WRONG - Hardcoding Calculated Values
CODEBLOCK1
✅ CORRECT - Using Excel Formulas
CODEBLOCK2
This applies to ALL calculations - totals, percentages, ratios, differences, etc. The spreadsheet should be able to recalculate when source data changes.
Common Workflow
- 1. Choose tool: pandas for data, openpyxl for formulas/formatting
- Create/Load: Create new workbook or load existing file
- Modify: Add/edit data, formulas, and formatting
- Save: Write to file
- Recalculate formulas (MANDATORY IF USING FORMULAS): Use the recalc.py script
python recalc.py output.xlsx
- 6. Verify and fix any errors:
- The script returns JSON with error details
- If
status is
errors_found, check
error_summary for specific error types and locations
- Fix the identified errors and recalculate again
- Common errors to fix:
-
#REF!: Invalid cell references
-
#DIV/0!: Division by zero
-
#VALUE!: Wrong data type in formula
-
#NAME?: Unrecognized formula name
Creating new Excel files
CODEBLOCK4
Editing existing Excel files
CODEBLOCK5
Recalculating formulas
Excel files created or modified by openpyxl contain formulas as strings but not calculated values. Use the provided recalc.py script to recalculate formulas:
CODEBLOCK6
Example:
CODEBLOCK7
The script:
- - Automatically sets up LibreOffice macro on first run
- Recalculates all formulas in all sheets
- Scans ALL cells for Excel errors (#REF!, #DIV/0!, etc.)
- Returns JSON with detailed error locations and counts
- Works on both Linux and macOS
Formula Verification Checklist
Quick checks to ensure formulas work correctly:
Essential Verification
- - [ ] Test 2-3 sample references: Verify they pull correct values before building full model
- [ ] Column mapping: Confirm Excel columns match (e.g., column 64 = BL, not BK)
- [ ] Row offset: Remember Excel rows are 1-indexed (DataFrame row 5 = Excel row 6)
Common Pitfalls
- - [ ] NaN handling: Check for null values with INLINECODE9
- [ ] Far-right columns: FY data often in columns 50+
- [ ] Multiple matches: Search all occurrences, not just first
- [ ] Division by zero: Check denominators before using
/ in formulas (#DIV/0!) - [ ] Wrong references: Verify all cell references point to intended cells (#REF!)
- [ ] Cross-sheet references: Use correct format (Sheet1!A1) for linking sheets
Formula Testing Strategy
- - [ ] Start small: Test formulas on 2-3 cells before applying broadly
- [ ] Verify dependencies: Check all cells referenced in formulas exist
- [ ] Test edge cases: Include zero, negative, and very large values
Interpreting recalc.py Output
The script returns JSON with error details:
CODEBLOCK8
Best Practices
Library Selection
- - pandas: Best for data analysis, bulk operations, and simple data export
- openpyxl: Best for complex formatting, formulas, and Excel-specific features
Working with openpyxl
- - Cell indices are 1-based (row=1, column=1 refers to cell A1)
- Use
data_only=True to read calculated values: INLINECODE12 - Warning: If opened with
data_only=True and saved, formulas are replaced with values and permanently lost - For large files: Use
read_only=True for reading or write_only=True for writing - Formulas are preserved but not evaluated - use recalc.py to update values
Working with pandas
- - Specify data types to avoid inference issues: INLINECODE16
- For large files, read specific columns: INLINECODE17
- Handle dates properly: INLINECODE18
Code Style Guidelines
IMPORTANT: When generating Python code for Excel operations:
- - Write minimal, concise Python code without unnecessary comments
- Avoid verbose variable names and redundant operations
- Avoid unnecessary print statements
For Excel files themselves:
- - Add comments to cells with complex formulas or important assumptions
- Document data sources for hardcoded values
- Include notes for key calculations and model sections
输出要求
所有Excel文件
零公式错误
- - 每个Excel模型必须交付时零公式错误(#REF!、#DIV/0!、#VALUE!、#N/A、#NAME?)
保留现有模板(更新模板时)
- - 修改文件时,研究并精确匹配现有格式、样式和惯例
- 切勿对已有固定格式的文件强加标准化格式
- 现有模板惯例始终优先于这些指南
财务模型
颜色编码标准
除非用户或现有模板另有说明
行业标准颜色惯例
- - 蓝色文本(RGB: 0,0,255):硬编码输入,以及用户将针对不同场景更改的数字
- 黑色文本(RGB: 0,0,0):所有公式和计算
- 绿色文本(RGB: 0,128,0):从同一工作簿中其他工作表引用的链接
- 红色文本(RGB: 255,0,0):指向其他文件的外部链接
- 黄色背景(RGB: 255,255,0):需要关注的关键假设或需要更新的单元格
数字格式标准
必需格式规则
- - 年份:格式化为文本字符串(例如,2024而非2,024)
- 货币:使用$#,##0格式;始终在表头中指定单位(收入($mm))
- 零值:使用数字格式使所有零值显示为-,包括百分比(例如,$#,##0;($#,##0);-)
- 百分比:默认为0.0%格式(一位小数)
- 倍数:估值倍数(EV/EBITDA、市盈率)格式化为0.0x
- 负数:使用括号(123)而非负号-123
公式构建规则
假设放置
- - 将所有假设(增长率、利润率、倍数等)放置在单独的假设单元格中
- 在公式中使用单元格引用而非硬编码值
- 示例:使用=B5(1+$B$6)而非=B51.05
公式错误预防
- - 验证所有单元格引用是否正确
- 检查范围中的偏移错误
- 确保所有预测期间的公式一致
- 使用边界情况测试(零值、负数)
- 验证无意外循环引用
硬编码文档要求
- - 在单元格中或旁边添加注释(如果在表格末尾)。格式:来源:[系统/文档],[日期],[具体引用],[URL(如适用)]
- 示例:
- 来源:公司10-K年报,2024财年,第45页,收入附注,[SEC EDGAR URL]
- 来源:公司10-Q季报,2025年第二季度,附件99.1,[SEC EDGAR URL]
- 来源:彭博终端,2025年8月15日,AAPL US Equity
- 来源:FactSet,2025年8月20日,共识预期屏幕
XLSX创建、编辑和分析
概述
用户可能要求您创建、编辑或分析.xlsx文件的内容。针对不同任务,您可以使用不同的工具和工作流程。
重要要求
公式重新计算需要LibreOffice:您可以假设已安装LibreOffice,用于使用recalc.py脚本重新计算公式值。该脚本在首次运行时自动配置LibreOffice
读取和分析数据
使用pandas进行数据分析
对于数据分析、可视化和基本操作,使用
pandas,它提供强大的数据处理能力:
python
import pandas as pd
读取Excel
df = pd.read_excel(file.xlsx) # 默认:第一个工作表
all
sheets = pd.readexcel(file.xlsx, sheet_name=None) # 所有工作表作为字典
分析
df.head() # 预览数据
df.info() # 列信息
df.describe() # 统计信息
写入Excel
df.to_excel(output.xlsx, index=False)
Excel文件工作流程
关键:使用公式,而非硬编码值
始终使用Excel公式,而非在Python中计算值并硬编码。 这确保电子表格保持动态和可更新性。
❌ 错误 - 硬编码计算值
python
错误:在Python中计算并硬编码结果
total = df[Sales].sum()
sheet[B10] = total # 硬编码5000
错误:在Python中计算增长率
growth = (df.iloc[-1][Revenue] - df.iloc[0][Revenue]) / df.iloc[0][Revenue]
sheet[C5] = growth # 硬编码0.15
错误:Python计算平均值
avg = sum(values) / len(values)
sheet[D20] = avg # 硬编码42.5
✅ 正确 - 使用Excel公式
python
正确:让Excel计算总和
sheet[B10] = =SUM(B2:B9)
正确:增长率作为Excel公式
sheet[C5] = =(C4-C2)/C2
正确:使用Excel函数计算平均值
sheet[D20] = =AVERAGE(D2:D19)
这适用于所有计算——总计、百分比、比率、差异等。电子表格应能在源数据更改时重新计算。
常见工作流程
- 1. 选择工具:pandas用于数据,openpyxl用于公式/格式
- 创建/加载:创建新工作簿或加载现有文件
- 修改:添加/编辑数据、公式和格式
- 保存:写入文件
- 重新计算公式(如果使用公式则必须执行):使用recalc.py脚本
bash
python recalc.py output.xlsx
- 6. 验证并修复任何错误:
- 脚本返回包含错误详情的JSON
- 如果status为errors
found,检查errorsummary以了解具体错误类型和位置
- 修复已识别的错误并重新计算
- 需要修复的常见错误:
- #REF!:无效的单元格引用
- #DIV/0!:除以零
- #VALUE!:公式中数据类型错误
- #NAME?:无法识别的公式名称
创建新的Excel文件
python
使用openpyxl处理公式和格式
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill, Alignment
wb = Workbook()
sheet = wb.active
添加数据
sheet[A1] = Hello
sheet[B1] = World
sheet.append([Row, of, data])
添加公式
sheet[B2] = =SUM(A1:A10)
格式化
sheet[A1].font = Font(bold=True, color=FF0000)
sheet[A1].fill = PatternFill(solid, start_color=FFFF00)
sheet[A1].alignment = Alignment(horizontal=center)
列宽
sheet.column_dimensions[A].width = 20
wb.save(output.xlsx)
编辑现有的Excel文件
python
使用openpyxl保留公式和格式
from openpyxl import load_workbook
加载现有文件
wb = load_workbook(existing.xlsx)
sheet = wb.active # 或使用 wb[SheetName] 指定工作表
处理多个工作表
for sheet_name in wb.sheetnames:
sheet = wb[sheet_name]
print(f工作表:{sheet_name})
修改单元格
sheet[A1] = New Value
sheet.insert_rows(2) # 在第2行位置插入行
sheet.delete_cols(3) # 删除第3列
添加新工作表
new
sheet = wb.createsheet(NewSheet)
new_sheet[A1] = Data
wb.save(modified.xlsx)
重新计算公式
由openpyxl创建或修改的Excel文件包含公式作为字符串,但不包含计算值。使用提供的recalc.py脚本重新计算公式:
bash
python recalc.py file> [timeoutseconds]
示例:
bash
python recalc.py output.xlsx 30
该脚本:
- - 首次运行时自动设置LibreOffice宏
- 重新计算所有工作表中的所有公式
- 扫描所有单元格以查找Excel错误(#REF!、#DIV/0!等)
- 返回包含详细错误位置和计数的JSON
- 在Linux和macOS上均可运行
公式验证检查清单
确保公式正确工作的快速检查:
基本验证
- - [ ] 测试2-3个样本引用:在构建完整模型前验证它们提取正确的值
- [ ] 列映射:确认Excel列