Math Solver Skill
A comprehensive skill for solving math problems from images with formula extraction, rendering, and guided learning.
Quick Start
Use Case 1: Image to Solution
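User: "I have a photo of a math problem. Can you help me solve it?"
Flow: Image → OCR → LaTeX extraction → Formula PNG → Guided hints → Solution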
Use Case 2: LaTeX to PNG
User: "Render this LaTeX formula nicely"
Flow: LaTeX code → math-images skill → Custom-styled PNG
Use Case 3: Problem Batch Processing
User: "Extract all formulas from my homework photos"
Flow: Multiple images → OCR all → Extract LaTeX → Render PNG grid
Core Workflow
1. Image Recognition & OCR
When the user uploads math problem image(s):
- Use the PaddleOCR Document Parsing skill to extract text and formulas
- Identify mathematical expressions, layout, and problem structure
- Output structured markdown with extracted content
Supported formats: JPG, PNG, BMP, TIFF, PDF
2. LaTeX Extraction & Normalization
- Convert recognized formulas to proper LaTeX syntax
- Handle common variations (fractions, exponents, subscripts, Greek letters)
- Validate LaTeX syntax before rendering
- Preserve mathematical meaning and formatting
Examples of auto-corrected formats:
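Input: a/b + c/d → LaTeX: \frac{a}{b} + \frac{c}{d}
Input: x^2 + 2x + 1 → LaTeX: x^{2} + 2x + 1
Input: sqrt(2) + pi → LaTeX: \sqrt{2} + \pi
Input: ∑(i=1 to n) → LaTeX: \sum_{i=1}^{n}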
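As an illustration of this kind of auto-correction, here is a minimal Python sketch that rewrites a few plain-text patterns (a/b, x^2, sqrt(...), pi, and ∑(i=1 to n)) as LaTeX. The function name `normalize_to_latex` and the regular expressions are assumptions made for this example, not the skill's actual implementation. They cover only simple, non-nested operands.

```python
import re


def normalize_to_latex(text: str) -> str:
    """Rewrite a few plain-text math patterns as LaTeX (hypothetical helper)."""
    # ∑(i=1 to n) -> \sum_{i=1}^{n}
    text = re.sub(r"∑\((\w+)=(\w+) to (\w+)\)", r"\\sum_{\1=\2}^{\3}", text)
    # sqrt(expr) -> \sqrt{expr}; nested parentheses are not handled
    text = re.sub(r"sqrt\(([^()]*)\)", r"\\sqrt{\1}", text)
    # a/b -> \frac{a}{b} for simple alphanumeric operands only
    text = re.sub(r"\b(\w+)\s*/\s*(\w+)\b", r"\\frac{\1}{\2}", text)
    # x^2 -> x^{2} when the exponent is not already braced
    text = re.sub(r"\^(\w+)", r"^{\1}", text)
    # pi -> \pi, unless it is part of a longer word or already a command
    text = re.sub(r"(?<![\\\w])pi\b", r"\\pi", text)
    return text


print(normalize_to_latex("a/b + c/d"))
```

A production normalizer would more likely work from a parsed expression tree than from regular expressions, since patterns such as (a+b)/(c+d) are out of reach for the rules above.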
3. Formula Rendering to PNG
Use the math-images skill to convert LaTeX to high-quality PNG:
- Support multiple themes (light, dark, sepia, chalk)
- Configurable text sizes and colors
- Batch rendering for multiple formulas
- DPI optimization for different use cases
Output file naming: formula<problem number><formula ID>.png
4. Guided Solution Modes
Mode A: Socratic (Exploratory Learning)
Strategy: Ask guiding questions, don't reveal answers directly
Process:
1. Problem Analysis - "What does this problem ask you to find?"
2. Concept Check - "Which mathematical concept applies here?"
3. Method Hint - "What approach would you use?"
4. Progress Check - "Show me your work so far. What's the next step?"
5. Verification - "Does your answer make sense?"
When to use: Student wants to learn, not just get answers
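One way to picture the Socratic process is as an ordered list of prompts that the skill walks through one at a time. The sketch below assumes a hypothetical `next_socratic_prompt` helper and a simple step counter. The five questions are the ones listed for this mode, and nothing else about the skill's internals is implied.

```python
from typing import Optional

# The five Socratic steps and their guiding questions, in order.
SOCRATIC_STEPS = [
    ("Problem Analysis", "What does this problem ask you to find?"),
    ("Concept Check", "Which mathematical concept applies here?"),
    ("Method Hint", "What approach would you use?"),
    ("Progress Check", "Show me your work so far. What's the next step?"),
    ("Verification", "Does your answer make sense?"),
]


def next_socratic_prompt(completed_steps: int) -> Optional[str]:
    """Return the next guiding question, or None once all steps are done."""
    if completed_steps < 0:
        raise ValueError("completed_steps must be non-negative")
    if completed_steps >= len(SOCRATIC_STEPS):
        return None
    _, question = SOCRATIC_STEPS[completed_steps]
    return question


print(next_socratic_prompt(0))
```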
Mode B: Detailed Explanation
Strategy: Full step-by-step solution with reasoning
Process:
1. Problem Understanding - Restate the problem in clear terms
2. Formula Extraction - Identify relevant mathematical formulas
3. Step-by-Step Solution - Each step with LaTeX rendering and explanation
4. Verification - Check answer validity
5. Alternative Methods - Show other approaches if applicable
When to use: Student wants complete understanding
Mode C: Quick Answer (Minimum Guidance)
Strategy: Provide only the answer with a brief verification
Configuration Options
Visual Themes
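themes:
  light:
    background: white
    text: black
    accent: blue
  dark:
    background: "#1e1e1e"
    text: white
    accent: cyan
  sepia:
    background: "#f4f1de"
    text: "#2d2d2d"
    accent: "#d4a574"
  chalk:
    background: "#2c2c2c"
    text: "#e0e0e0"
    accent: "#ffeb3b"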
Rendering Options
formula_size: "medium" # small, medium, large
dpi: 300 # 150 (fast), 300 (quality), 600 (print)
background_transparent: false
include_explanation: true # Show formula explanation below
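The rendering options above could be validated before they reach the renderer. The following is a minimal sketch that assumes a hypothetical `RenderOptions` dataclass. Treating 150, 300, and 600 as the only accepted DPI values is an assumption drawn from the comment on the `dpi` line, not a documented constraint.

```python
from dataclasses import dataclass

ALLOWED_SIZES = ("small", "medium", "large")
ALLOWED_DPI = (150, 300, 600)  # assumed to be the only accepted values


@dataclass(frozen=True)
class RenderOptions:
    """Hypothetical container mirroring the four rendering options."""

    formula_size: str = "medium"
    dpi: int = 300
    background_transparent: bool = False
    include_explanation: bool = True

    def __post_init__(self) -> None:
        # Reject values outside the documented choices early.
        if self.formula_size not in ALLOWED_SIZES:
            raise ValueError(f"formula_size must be one of {ALLOWED_SIZES}")
        if self.dpi not in ALLOWED_DPI:
            raise ValueError(f"dpi must be one of {ALLOWED_DPI}")


print(RenderOptions(formula_size="large", dpi=600))
```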
Supported Mathematical Domains
Algebra
- Linear equations: ax + b = c
- Quadratic equations: ax^2 + bx + c = 0
- Polynomial operations
- Rational expressions
- Systems of equations
Geometry
- Area and perimeter formulas
- Trigonometric relationships
- Vector operations
- 3D transformations
Calculus
- Limits and continuity
- Derivatives and integrals
- Series and sequences
- Differential equations
Statistics
- Mean, variance, standard deviation
- Probability distributions
- Hypothesis testing
- Regression analysis
Linear Algebra
- Matrix operations
- Eigenvalues and eigenvectors
- Linear transformations
- Determinants
Input & Output Examples
Example 1: Simple Fraction Problem
Input Image: Photo of "Simplify: (1/2) + (1/3)"
Output:
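Extracted Problem
Simplify: (1/2) + (1/3)
LaTeX Formula
\frac{1}{2} + \frac{1}{3}
Rendered Formula
[PNG image showing the formula]
Guided Solution (Socratic Mode)
1. To add fractions, what do we need?
   → We need a common denominator
2. What is the least common denominator of 2 and 3?
   → 6
3. Rewrite each fraction with a denominator of 6...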
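For reference, the complete calculation that the guided steps for (1/2) + (1/3) lead toward can be written as a single LaTeX derivation, using 6 as the least common denominator:

```latex
\frac{1}{2} + \frac{1}{3}
  = \frac{1 \cdot 3}{2 \cdot 3} + \frac{1 \cdot 2}{3 \cdot 2}
  = \frac{3}{6} + \frac{2}{6}
  = \frac{5}{6}
```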
Example 2: Complex Expression
Input: Handwritten quadratic equation image
LaTeX Extraction:
\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
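Once the quadratic formula has been extracted, a solver can check candidate answers numerically. The sketch below evaluates (-b ± sqrt(b^2 - 4ac)) / (2a) with Python's standard `cmath` module, so complex roots are handled as well. The function name `quadratic_roots` is an assumption for this example.

```python
import cmath
from typing import Tuple


def quadratic_roots(a: float, b: float, c: float) -> Tuple[complex, complex]:
    """Return both roots of a*x^2 + b*x + c = 0 via the quadratic formula."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic equation")
    # cmath.sqrt accepts negative discriminants and returns a complex result.
    root_of_discriminant = cmath.sqrt(b * b - 4 * a * c)
    return (
        (-b + root_of_discriminant) / (2 * a),
        (-b - root_of_discriminant) / (2 * a),
    )


print(quadratic_roots(1, -3, 2))
```

For x^2 - 3x + 2 = 0 the roots are 2 and 1, returned as complex numbers with a zero imaginary part.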
Themed Renderings:
- Light theme PNG
- Dark theme PNG
- Chalkboard theme PNG
API Integration
Internal Integrations
- PaddleOCR Document Parsing - Image to text/formula extraction
- math-images - LaTeX to PNG rendering
- Claude API - Problem analysis and guidance generation
External Support
- Mathpix API (optional) - High-accuracy formula recognition
- Wolfram Alpha API (optional) - Answer verification
Limitations & Edge Cases
Known Limitations
1. Handwriting Quality - Works best with clear, print-like handwriting
2. Complex Diagrams - May struggle with embedded geometry diagrams
3. Incomplete Problems - Needs the full problem statement in the image
4. Multiple Languages - Optimized for English/Chinese; results for other languages may vary
Error Handling
- If OCR confidence is below 70%, ask the user to retake a clearer photo
- If the LaTeX syntax is invalid, suggest manual correction
- If the problem type is unsupported, suggest alternative approaches
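The three error-handling rules can be read as an ordered series of checks. Here is a minimal sketch under several assumptions: OCR confidence arrives as a fraction between 0 and 1, a brace-balance check stands in for full LaTeX validation, and the caller already knows whether the problem type is supported. The names `Action`, `braces_balanced`, and `next_action` are hypothetical.

```python
from enum import Enum

MIN_OCR_CONFIDENCE = 0.70  # the 70% threshold, expressed as a fraction


class Action(Enum):
    RETAKE_PHOTO = "ask the user to retake a clearer photo"
    MANUAL_CORRECTION = "suggest manual LaTeX correction"
    SUGGEST_ALTERNATIVE = "suggest alternative approaches"
    PROCEED = "continue to rendering and guidance"


def braces_balanced(latex: str) -> bool:
    """Cheap stand-in for syntax validation: check that { and } are balanced."""
    depth = 0
    i = 0
    while i < len(latex):
        ch = latex[i]
        if ch == "\\":
            i += 2  # skip the escaped character, e.g. \{ or \}
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return False
        i += 1
    return depth == 0


def next_action(ocr_confidence: float, latex: str, supported: bool) -> Action:
    """Apply the three rules in order and return the first that triggers."""
    if ocr_confidence < MIN_OCR_CONFIDENCE:
        return Action.RETAKE_PHOTO
    if not braces_balanced(latex):
        return Action.MANUAL_CORRECTION
    if not supported:
        return Action.SUGGEST_ALTERNATIVE
    return Action.PROCEED


print(next_action(0.92, r"\frac{1}{2}", True))
```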
User Preferences
When the user specifies preferences, save them and apply them consistently:
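user_preferences:
  solution_mode: detailed # or socratic or quick
  theme: dark
  language: en # or zh for Chinese
  include_diagrams: true
  stepdetaillevel: medium # high, medium, low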
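Saving and consistently applying preferences could look like the sketch below. It assumes preferences are held in a plain dictionary and uses the key names and values from this document's preference block (`solution_mode`, `theme`, `language`, `include_diagrams`, `stepdetaillevel`). The helper `update_preferences` and the decision to reject unknown keys are assumptions for illustration.

```python
from typing import Any, Dict

# Accepted values per key, taken from the options named in this document.
ALLOWED_VALUES = {
    "solution_mode": ("detailed", "socratic", "quick"),
    "theme": ("light", "dark", "sepia", "chalk"),
    "language": ("en", "zh"),
    "include_diagrams": (True, False),
    "stepdetaillevel": ("high", "medium", "low"),
}


def update_preferences(saved: Dict[str, Any], changes: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new preference dict with validated changes applied."""
    merged = dict(saved)  # copy so the saved preferences are not mutated
    for key, value in changes.items():
        if key not in ALLOWED_VALUES:
            raise KeyError(f"unknown preference: {key}")
        if value not in ALLOWED_VALUES[key]:
            raise ValueError(f"invalid value for {key}: {value!r}")
        merged[key] = value
    return merged


prefs = update_preferences({"solution_mode": "detailed"}, {"theme": "chalk"})
print(prefs)
```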
Troubleshooting
"OCR failed to recognize formula"
- Request a clearer image
- Try different angle/lighting
- Manually paste LaTeX if available
"PNG rendering looks wrong"
- Check LaTeX syntax validity
- Try different theme/size settings
- Use math-images skill directly for advanced options
"I don't understand the guidance"
- Switch to Detailed Explanation mode
- Ask for more specific hints
- Request step-by-step walkthrough
Next Steps
1. Quick Demo? Upload a math problem image to test
2. Configure Preferences? Choose solution style and theme
3. Batch Processing? Upload multiple problem images
4. Custom Styling? Specify color/size/format preferences
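Math Solver Skill
A comprehensive skill for solving math problems from images, with formula extraction, rendering, and guided learning.
Quick Start
Use Case 1: Image to Solution
User: "I have a photo of a math problem. Can you help me solve it?"
Flow: Image → OCR → LaTeX extraction → Formula PNG → Guided hints → Solution
Use Case 2: LaTeX to PNG
User: "Render this LaTeX formula nicely"
Flow: LaTeX code → math-images skill → Custom-styled PNG
Use Case 3: Problem Batch Processing
User: "Extract all formulas from my homework photos"
Flow: Multiple images → OCR all → Extract LaTeX → Render PNG grid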
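Core Workflow
1. Image Recognition & OCR
When the user uploads math problem image(s):
- Use the PaddleOCR Document Parsing skill to extract text and formulas
- Identify mathematical expressions, layout, and problem structure
- Output structured Markdown with the extracted content
Supported formats: JPG, PNG, BMP, TIFF, PDF
2. LaTeX Extraction & Normalization
- Convert recognized formulas to proper LaTeX syntax
- Handle common variations (fractions, exponents, subscripts, Greek letters)
- Validate LaTeX syntax before rendering
- Preserve mathematical meaning and formatting
Examples of auto-corrected formats:
Input: a/b + c/d → LaTeX: \frac{a}{b} + \frac{c}{d}
Input: x^2 + 2x + 1 → LaTeX: x^{2} + 2x + 1
Input: sqrt(2) + pi → LaTeX: \sqrt{2} + \pi
Input: ∑(i=1 to n) → LaTeX: \sum_{i=1}^{n}
3. Formula Rendering to PNG
Use the math-images skill to convert LaTeX to high-quality PNG:
- Support multiple themes (light, dark, sepia, chalk)
- Configurable text sizes and colors
- Batch rendering for multiple formulas
- DPI optimization for different use cases
Output file naming: formula<problem number><formula ID>.png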
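4. Guided Solution Modes
Mode A: Socratic (Exploratory Learning)
Strategy: Ask guiding questions, don't reveal answers directly
Process:
1. Problem Analysis - "What does this problem ask you to find?"
2. Concept Check - "Which mathematical concept applies here?"
3. Method Hint - "What approach would you use?"
4. Progress Check - "Show me your work so far. What's the next step?"
5. Verification - "Does your answer make sense?"
When to use: Student wants to learn, not just get answers
Mode B: Detailed Explanation
Strategy: Full step-by-step solution with reasoning
Process:
1. Problem Understanding - Restate the problem in clear terms
2. Formula Extraction - Identify relevant mathematical formulas
3. Step-by-Step Solution - Each step with LaTeX rendering and explanation
4. Verification - Check answer validity
5. Alternative Methods - Show other approaches if applicable
When to use: Student wants complete understanding
Mode C: Quick Answer (Minimum Guidance)
Strategy: Provide only the answer with a brief verification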
Configuration Options
Visual Themes
themes:
  light:
    background: white
    text: black
    accent: blue
  dark:
    background: "#1e1e1e"
    text: white
    accent: cyan
  sepia:
    background: "#f4f1de"
    text: "#2d2d2d"
    accent: "#d4a574"
  chalk:
    background: "#2c2c2c"
    text: "#e0e0e0"
    accent: "#ffeb3b"
Rendering Options
formula_size: medium # small, medium, large
dpi: 300 # 150 (fast), 300 (quality), 600 (print)
background_transparent: false
include_explanation: true # Show formula explanation below
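Supported Mathematical Domains
Algebra
- Linear equations: ax + b = c
- Quadratic equations: ax^2 + bx + c = 0
- Polynomial operations
- Rational expressions
- Systems of equations
Geometry
Calculus
Statistics
Linear Algebra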
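Input & Output Examples
Example 1: Simple Fraction Problem
Input Image: Photo of "Simplify: (1/2) + (1/3)"
Output:
Extracted Problem
Simplify: (1/2) + (1/3)
LaTeX Formula
\frac{1}{2} + \frac{1}{3}
Rendered Formula
[PNG image showing the formula]
Guided Solution (Socratic Mode)
1. To add fractions, what do we need?
   → We need a common denominator
2. What is the least common denominator of 2 and 3?
   → 6
3. Rewrite each fraction with a denominator of 6...
Example 2: Complex Expression
Input: Handwritten quadratic equation image
LaTeX Extraction:
\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
Themed Renderings: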
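API Integration
Internal Integrations
- PaddleOCR Document Parsing - Image to text/formula extraction
- math-images - LaTeX to PNG rendering
- Claude API - Problem analysis and guidance generation
External Support
- Mathpix API (optional) - High-accuracy formula recognition
- Wolfram Alpha API (optional) - Answer verification
Limitations & Edge Cases
Known Limitations
1. Handwriting Quality - Works best with clear, print-like handwriting
2. Complex Diagrams - May struggle with embedded geometry diagrams
3. Incomplete Problems - Needs the full problem statement in the image
4. Multiple Languages - Optimized for English/Chinese; results for other languages may vary
Error Handling
- If OCR confidence is below 70%, ask the user to retake a clearer photo
- If the LaTeX syntax is invalid, suggest manual correction
- If the problem type is unsupported, suggest alternative approaches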
User Preferences
When the user specifies preferences, save them and apply them consistently:
user_preferences:
  solution_mode: detailed # or socratic or quick
  theme: dark
  language: en # or zh for Chinese
  include_diagrams: true
  stepdetaillevel: medium # high, medium, low
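Troubleshooting
"OCR failed to recognize formula"
- Request a clearer image
- Try a different angle/lighting
- Manually paste LaTeX if available
"PNG rendering looks wrong"
- Check LaTeX syntax validity
- Try different theme/size settings
- Use the math-images skill directly for advanced options
"I don't understand the guidance"
- Switch to Detailed Explanation mode
- Ask for more specific hints
- Request a step-by-step walkthrough
Next Steps
1. Quick Demo? Upload a math problem image to test
2. Configure Preferences? Choose solution style and theme
3. Batch Processing? Upload multiple problem images
4. Custom Styling? Specify color/size/format preferences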