Instagram Recipe Extractor
Extract recipes from Instagram reels using a multi-layered approach:
1. Caption parsing: instant; check the description first
2. Audio transcription: Whisper (local, no API key)
3. Frame analysis: vision model for on-screen text
No Instagram login required. Works on public reels.
When to Use
- User sends an Instagram reel link
- User mentions "recipe from Instagram" or "save this reel"
- User wants to extract recipe details from a video post
How It Works (MANDATORY FLOW)
ALWAYS follow this complete flow — do not stop after caption if instructions are missing:
1. User sends an Instagram reel URL.
2. Extract metadata using yt-dlp (`--dump-json`).
3. Parse the caption for recipe details.
4. Check completeness: does the caption have BOTH ingredients AND instructions?
   - ✅ YES: Present the recipe.
   - ❌ NO (missing instructions or incomplete): Automatically proceed to audio transcription; do NOT stop or ask the user.
5. If audio transcription is needed:
   - Download video: `yt-dlp -o "/tmp/reel.mp4" "URL"`
   - Extract audio: `ffmpeg -y -i /tmp/reel.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 /tmp/reel.wav`
   - Transcribe: `whisper /tmp/reel.wav --model base --output_format txt --output_dir /tmp`
   - Merge caption ingredients with audio instructions.
6. Present a clean, formatted recipe (combining caption + audio as needed).
7. User decides what to do (save to notes, add to wishlist, etc.).
Completeness check heuristics:
- Has ingredients = contains 3+ quantity+item patterns (e.g., "1 cup flour", "2 lbs chicken")
- Has instructions = contains action verbs (blend, cook, bake, mix, pour, add) + sequence OR numbered steps
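The completeness heuristics above could be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the unit list, the sequence words, and the function names (`has_ingredients`, `has_instructions`, `is_complete`) are assumptions introduced here.

```python
import re

# Assumed unit vocabulary; extend it for the captions you actually see.
UNITS = r"(?:cups?|tbsp|tsp|oz|lbs?|g|kg|ml|cloves?)"

# A quantity followed by a unit and then an item word, e.g. "1 cup flour".
QUANTITY_ITEM = re.compile(rf"\b\d+(?:[./]\d+)?\s*{UNITS}\b\s+[a-z]", re.IGNORECASE)

# Action verbs taken from the heuristic; sequence words are an assumption.
ACTION_VERBS = re.compile(r"\b(?:blend|cook|bake|mix|pour|add)\b", re.IGNORECASE)
SEQUENCE_WORDS = re.compile(r"\b(?:first|then|next|after|finally)\b", re.IGNORECASE)

# A line that starts with "1." or "1)" followed by text.
NUMBERED_STEP = re.compile(r"^\s*\d+[.)]\s+\S", re.MULTILINE)


def has_ingredients(caption: str) -> bool:
    """True when the caption holds at least three quantity+item patterns."""
    return len(QUANTITY_ITEM.findall(caption)) >= 3


def has_instructions(caption: str) -> bool:
    """True for numbered steps, or an action verb plus a sequence word."""
    if NUMBERED_STEP.search(caption):
        return True
    return bool(ACTION_VERBS.search(caption) and SEQUENCE_WORDS.search(caption))


def is_complete(caption: str) -> bool:
    """Both halves present: the recipe can be shown without transcription."""
    return has_ingredients(caption) and has_instructions(caption)
```

When `is_complete` returns `False`, the flow would continue to audio transcription rather than stopping.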
Extraction Command
    yt-dlp --dump-json "https://www.instagram.com/reel/SHORTCODE/" 2>/dev/null
Key fields from JSON output:
- `description`: the caption containing the recipe
- `uploader`: creator's name
- `channel`: creator's handle
- `webpage_url`: original URL
- `like_count`: popularity indicator
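As a rough sketch, the JSON printed by `yt-dlp --dump-json` could be reduced to the fields of interest like this. It assumes the yt-dlp field names `description`, `uploader`, `channel`, `webpage_url`, and `like_count`; the helper name `extract_fields` and the output keys are hypothetical.

```python
import json


def extract_fields(raw_json: str) -> dict:
    """Pick the recipe-relevant fields out of yt-dlp's JSON output."""
    info = json.loads(raw_json)
    return {
        "caption": info.get("description") or "",  # may be missing or null
        "creator": info.get("uploader"),
        "handle": info.get("channel"),
        "url": info.get("webpage_url"),
        "likes": info.get("like_count"),
    }
```

Using `.get()` keeps the function from failing when a reel lacks one of the fields.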
Recipe Parsing
Look for these patterns in the caption:
Macros:
- "X Calories | Xg P | Xg C | Xg F"
- "Macros per serving"
- "Cal/Protein/Carbs/Fat"
Ingredients:
- Lines starting with quantities (1 cup, 2 tbsp, 24oz)
- Lines with measurement units
- Emoji bullet points (🥩 🌽 🧀 etc.)
Sections:
- "For the [component]:"
- "Ingredients:"
- "Instructions:"
- "Directions:"
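One possible way to match the macro line and the section headers listed above is with regular expressions. The sketch below only handles the pipe-separated "X Calories | Xg P | Xg C | Xg F" form; the names `parse_macros` and `find_sections` are illustrative assumptions.

```python
import re

# Matches e.g. "585 Calories | 56g P | 25g C | 28g F" (case-insensitive).
MACROS = re.compile(
    r"(?P<calories>\d+)\s*(?:calories|cal)\b\s*\|\s*"
    r"(?P<protein>\d+)\s*g\s*P\b\s*\|\s*"
    r"(?P<carbs>\d+)\s*g\s*C\b\s*\|\s*"
    r"(?P<fat>\d+)\s*g\s*F\b",
    re.IGNORECASE,
)

# A whole line such as "Ingredients:" or "For the sauce:".
SECTION = re.compile(
    r"^\s*(For the .+|Ingredients|Instructions|Directions):\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def parse_macros(caption: str):
    """Return the macros as integers, or None if no macro line is found."""
    match = MACROS.search(caption)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


def find_sections(caption: str) -> list:
    """Return section header names in the order they appear."""
    return [m.group(1) for m in SECTION.finditer(caption)]
```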
Output Format
Present extracted recipe cleanly:
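    [Recipe Name]
    From @[handle]
    Macros (per serving): X cal | Xg P | Xg C | Xg F

    Ingredients
    ...

    Instructions
    1. [Step 1]
    2. [Step 2]
    ...

    Source: [original URL]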
User Actions After Extraction
Let the user decide what to do:
- "Save to my recipes" → Save to Apple Notes (if meal-planner skill available)
- "Add to wishlist" → Save to `memory/recipe-wishlist.json`
- "Just show me" → Display only, no save
- "Plan this for next week" → Hand off to meal-planner skill
Wishlist Storage
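Optional storage for recipes the user wants to try later:
memory/recipe-wishlist.json:
    {
      "recipes": [
        {
          "name": "Recipe Name",
          "source": "instagram",
          "sourceUrl": "https://instagram.com/reel/...",
          "handle": "@creator",
          "addedDate": "2026-01-26",
          "tried": false,
          "macros": {
            "calories": 585,
            "protein": 56,
            "carbs": 25,
            "fat": 28,
            "servings": 3
          },
          "ingredients": [...],
          "instructions": [...]
        }
      ]
    }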
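A minimal sketch of appending a recipe to the wishlist file might look like this. It assumes a top-level `recipes` array whose entries carry `sourceUrl`, `addedDate`, and `tried`; the function name `add_to_wishlist` and the skip-duplicates-by-URL behaviour are assumptions, not part of the skill as documented.

```python
import json
from datetime import date
from pathlib import Path


def add_to_wishlist(recipe: dict, path: Path) -> dict:
    """Append a recipe to the wishlist JSON file and return the full data."""
    if path.exists():
        data = json.loads(path.read_text(encoding="utf-8"))
    else:
        data = {"recipes": []}

    # Defaults first, so values supplied in `recipe` take precedence.
    entry = {
        "source": "instagram",
        "addedDate": date.today().isoformat(),
        "tried": False,
        **recipe,
    }

    # Assumption: a reel already on the list (same sourceUrl) is not re-added.
    url = entry.get("sourceUrl")
    already_saved = url is not None and any(
        existing.get("sourceUrl") == url for existing in data["recipes"]
    )
    if not already_saved:
        data["recipes"].append(entry)

    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, indent=2, ensure_ascii=False), encoding="utf-8")
    return data
```

A call such as `add_to_wishlist({"name": "Chicken Bowl", "sourceUrl": url}, Path("memory/recipe-wishlist.json"))` would create the file on first use.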
Error Handling
If yt-dlp fails:
- Check that the URL is in a valid Instagram reel format
- May be a private account — inform user
- Suggest user paste caption text manually as fallback
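The URL check could be done before calling yt-dlp at all. The pattern below is an assumption about what counts as a reel link (`/reel/` or `/reels/` followed by a shortcode); Instagram may use other forms that it does not cover, and `reel_shortcode` is a hypothetical helper name.

```python
import re

# Assumed reel URL shape: https://www.instagram.com/reel/<shortcode>/[?query]
REEL_URL = re.compile(
    r"^https?://(?:www\.)?instagram\.com/(?:reel|reels)/"
    r"(?P<shortcode>[A-Za-z0-9_-]+)/?(?:\?.*)?$"
)


def reel_shortcode(url: str):
    """Return the reel shortcode, or None if the URL is not a reel link."""
    match = REEL_URL.match(url.strip())
    return match.group("shortcode") if match else None
```

A `None` result is a cue to ask the user for a corrected link or for the caption text pasted manually.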
If no recipe found in caption (IMPORTANT):
After extracting, scan the caption for recipe indicators:
- Ingredient quantities (numbers + units like oz, cups, tbsp, lbs)
- Recipe sections ("For the...", "Ingredients:", "Instructions:")
- Cooking verbs (bake, cook, sauté, mix, combine)
- Macro information (calories, protein, carbs, fat)
If none found, tell the user clearly:
"I pulled the caption but it doesn't look like the recipe is there — it might just be a teaser or the recipe is only shown in the video itself. Here's what the caption says:
[show caption]
A few options:
1. Check the comments; sometimes creators post recipes there
2. Check their bio link; it might lead to the full recipe
3. Describe what you saw in the video and I can help find a similar recipe"
Recipe detection heuristics:
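    HAS recipe if caption contains:
    - 3+ ingredient-like patterns (quantity + food item)
    - OR "recipe" + ingredient list
    - OR macros + ingredients
    - OR numbered/bulleted instructions

    NO recipe if caption is:
    - Mostly hashtags
    - Just a description/teaser
    - Under 100 characters
    - No quantities or measurements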
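A teaser-versus-recipe check along these lines might be sketched as below. The thresholds are assumptions chosen for illustration (captions under 100 characters or more than half hashtags count as "no recipe"; three quantities or a numbered step count as "has recipe"), and `looks_like_recipe` is a hypothetical name.

```python
import re

# A number followed by a common unit, e.g. "8 oz" or "2 cups".
QUANTITY = re.compile(
    r"\b\d+(?:[./]\d+)?\s*(?:cups?|tbsp|tsp|oz|lbs?|g|kg|ml)\b", re.IGNORECASE
)
NUMBERED_STEP = re.compile(r"^\s*\d+[.)]\s+\S", re.MULTILINE)


def looks_like_recipe(caption: str) -> bool:
    """Cheap screen for whether a caption is worth parsing as a recipe."""
    text = caption.strip()
    if len(text) < 100:  # too short to hold a recipe
        return False

    words = text.split()
    hashtags = [word for word in words if word.startswith("#")]
    if words and len(hashtags) / len(words) > 0.5:  # mostly hashtags
        return False

    return len(QUANTITY.findall(text)) >= 3 or bool(NUMBERED_STEP.search(text))
```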
Integration with meal-planner
The meal-planner skill can reference this skill:
- When planning meals, check wishlist for untried recipes
- Suggest wishlist recipes that match pantry items
- Mark recipes as "tried" after they're used in a meal plan
Audio Transcription (V2) — MANDATORY FALLBACK
When caption is missing instructions, ALWAYS transcribe the audio automatically. Do not stop and ask the user — just do it. This is the most common case since creators often put ingredients in captions but speak the instructions.
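Step 1: Download video
    yt-dlp -o "/tmp/reel.mp4" "https://instagram.com/reel/XXX"
Step 2: Extract audio
    ffmpeg -y -i /tmp/reel.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 /tmp/reel.wav
Step 3: Transcribe with Whisper
    whisper /tmp/reel.wav --model base --output_format txt --output_dir /tmp
Step 4: Parse transcript for recipe
Look for cooking instructions and ingredients mentioned verbally.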
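The download, extract, and transcribe steps could be chained from Python roughly as follows. This is a sketch that shells out to the `yt-dlp`, `ffmpeg`, and `whisper` commands used elsewhere in this document and assumes all three are on `PATH`; the function names are hypothetical, and it also assumes Whisper names its output after the input file (`reel.txt`).

```python
import subprocess
from pathlib import Path


def build_commands(url: str, workdir: str = "/tmp") -> list[list[str]]:
    """Build the three commands as argument lists (no shell quoting needed)."""
    video = f"{workdir}/reel.mp4"
    audio = f"{workdir}/reel.wav"
    return [
        ["yt-dlp", "-o", video, url],
        # 16 kHz mono PCM, the settings used in the steps above.
        ["ffmpeg", "-y", "-i", video, "-vn", "-acodec", "pcm_s16le",
         "-ar", "16000", "-ac", "1", audio],
        ["whisper", audio, "--model", "base",
         "--output_format", "txt", "--output_dir", workdir],
    ]


def transcribe_reel(url: str, workdir: str = "/tmp") -> str:
    """Run the pipeline and return the transcript text."""
    for command in build_commands(url, workdir):
        subprocess.run(command, check=True)  # raises if any step fails
    # Assumption: Whisper writes <input basename>.txt into the output dir.
    return Path(workdir, "reel.txt").read_text(encoding="utf-8")
```

Passing argument lists rather than a shell string avoids quoting problems with URLs that contain `?` or `&`.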
Inference for Missing Measurements
ALWAYS infer quantities when not provided. Never present a recipe without amounts — estimate based on context and standard package sizes.
Vague Language → Specific Amounts
| What they say | Infer |
|---|---|
| "some chicken" | ~1 lb |
| "a bit of garlic" | 2-3 cloves |
| "handful of spinach" | ~2 cups |
| "drizzle of oil" | 1-2 tbsp |
| "season to taste" | ½ tsp salt, ¼ tsp pepper |
| "splash of soy sauce" | 1-2 tbsp |
| "a few tablespoons" | 2-3 tbsp |
| "some rice" | 1 cup dry |
| "cheese on top" | ½ - 1 cup shredded |
| "diced onion" | 1 medium onion |
| "bell peppers" | 2 peppers |
Standard Package Sizes (when item mentioned without amount)
| Ingredient | Standard Package | Infer |
|---|---|---|
| Puff pastry | 17oz sheet | 1 sheet |
| Ground beef/turkey | 1 lb pack | 1 lb |
| Chicken breast | ~1.5 lb pack | 1.5 lbs |
| Sausage links | 14oz / 4-5 links | 1 package |
| Bacon | 12oz / 12 slices | ½ package (6 slices) |
| Shredded cheese | 8oz bag | 1-2 cups |
| Tortillas | 8-10 count | 1 package |
| Canned beans | 15oz can | 1 can |
| Broth/stock | 32oz carton | 1-2 cups |
| Pasta | 16oz box | 8oz (half box) |
| Rice | 2 lb bag | 1-2 cups dry |
Context-Aware Scaling
By recipe type:
- Stir fry for 2 → 1 lb protein, 4 cups veggies
- Soup/stew → 1.5-2 lbs protein, 4 cups broth
- Sheet pan meal → 1.5 lbs protein, 3-4 cups veggies
- Appetizers → smaller portions, estimate ~12-15 pieces per batch
By servings mentioned:
- "Serves 4" → Scale standard amounts for 4
- "Meal prep for the week" → Assume 5-8 servings
- No servings mentioned → Default to 4 servings
By protein target (if user has macro goals):
- 40-50g protein per serving → ~6-8oz cooked meat per portion
- Scale recipe protein accordingly
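The serving rules above might translate into code like this. It is only a sketch: picking 5 for "meal prep" (the low end of the 5-8 range) is an assumption, and `infer_servings` and `scale_amount` are hypothetical helper names.

```python
import re

DEFAULT_SERVINGS = 4  # used when the source mentions no serving count


def infer_servings(text: str) -> int:
    """Guess the serving count from caption or transcript text."""
    match = re.search(r"\bserves\s+(\d+)\b", text, re.IGNORECASE)
    if match:
        return int(match.group(1))
    if re.search(r"\bmeal prep\b", text, re.IGNORECASE):
        return 5  # assumption: low end of the 5-8 range
    return DEFAULT_SERVINGS


def scale_amount(amount: float, base_servings: int, target_servings: int) -> float:
    """Scale a quantity linearly from one serving count to another."""
    return amount * target_servings / base_servings
```

For example, 1 lb of protein sized for 4 servings becomes `scale_amount(1.0, 4, 6)`, i.e. 1.5 lb, for 6 servings.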
Output Format
Always present inferred amounts clearly:
CODEBLOCK7
Mark inferred quantities with (estimated) so user knows what came from the source vs inference.
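A small lookup can apply the inference table and tag the result so the user can tell estimates from sourced amounts. The sketch below uses a handful of rows from the vague-language table; the dictionary layout and the name `annotate` are assumptions.

```python
# phrase -> (estimated amount, item); a subset of the vague-language table.
VAGUE_AMOUNTS = {
    "some chicken": ("~1 lb", "chicken"),
    "a bit of garlic": ("2-3 cloves", "garlic"),
    "handful of spinach": ("~2 cups", "spinach"),
    "drizzle of oil": ("1-2 tbsp", "oil"),
    "splash of soy sauce": ("1-2 tbsp", "soy sauce"),
}


def annotate(phrase: str) -> str:
    """Replace a vague phrase with an estimate marked "(estimated)"."""
    entry = VAGUE_AMOUNTS.get(phrase.strip().lower())
    if entry is None:
        return phrase  # already specific, or not in the table
    amount, item = entry
    return f"{amount} {item} (estimated)"
```

Phrases that are not in the table pass through unchanged, so amounts that came from the source are never relabeled.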
Combined Extraction Flow
CODEBLOCK8
Frame Analysis
Extract key frames and analyze with vision model.
Extract frames:
CODEBLOCK9
Send to vision model:
Use Claude's image analysis to read each frame:
- Recipe cards / title screens
- Ingredient lists shown on screen
- Measurements in text overlays
- Step-by-step instructions displayed
Vision prompt:
CODEBLOCK10
Merge strategy:
- Audio transcript = primary source (spoken instructions)
- Frame analysis = supplement (exact measurements, recipe cards)
- Combine both, prefer specific measurements from visual over inferred from audio
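The merge strategy could be expressed as a small function. The input shape here is an assumption made for the example: each source is a dict with an `ingredients` mapping (name to amount, or `None` when no amount was given) and an `instructions` list; `merge_sources` is a hypothetical name.

```python
def merge_sources(audio: dict, frames: dict) -> dict:
    """Combine audio and frame results, preferring on-screen measurements."""
    # Start from the audio transcript, the primary source.
    ingredients = dict(audio.get("ingredients", {}))

    # Frame analysis supplements it: a measurement read from the screen
    # overrides an amount inferred from audio, and new items are added.
    for name, amount in frames.get("ingredients", {}).items():
        if amount is not None or name not in ingredients:
            ingredients[name] = amount

    # Spoken instructions win; fall back to on-screen steps if there are none.
    instructions = audio.get("instructions") or frames.get("instructions", [])
    return {"ingredients": ingredients, "instructions": instructions}
```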
Pinned Comment Detection
Scan caption for these phrases (case-insensitive):
- "recipe pinned"
- "pinned in comments"
- "check comments"
- "in the comments"
- "comment below"
- "recipe below"
- "full recipe in comments"
If detected, flag and notify user after extraction:
"Heads up — the creator said the recipe is pinned in the comments.
I got what I could from the audio, but yt-dlp can't access pinned comments
without login. If you want the exact recipe, copy the pinned comment and
send it to me — I'll format it properly."
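The phrase scan above amounts to a case-insensitive substring check. A minimal sketch, with `mentions_pinned_recipe` as a hypothetical function name:

```python
# Phrases from the list above, stored in lowercase.
PINNED_PHRASES = (
    "recipe pinned",
    "pinned in comments",
    "check comments",
    "in the comments",
    "comment below",
    "recipe below",
    "full recipe in comments",
)


def mentions_pinned_recipe(caption: str) -> bool:
    """True if the caption points the viewer to the comments for the recipe."""
    lowered = caption.lower()
    return any(phrase in lowered for phrase in PINNED_PHRASES)
```

When this returns `True`, the extraction still runs and the heads-up message is shown after the recipe.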
Requirements
- yt-dlp: INLINECODE11
- INLINECODE12: INLINECODE13
- INLINECODE14: pip3 install openai-whisper (runs locally, no API key)
- No Instagram login required for public reels