Instagram Recipe Extractor

Extract recipes from Instagram reels using a multi-layered approach:

1. Caption parsing: instant; check the description first
2. Audio transcription: Whisper (local, no API key)
3. Frame analysis: vision model for on-screen text

No Instagram login required. Works on public reels.

When to Use

- User sends an Instagram reel link
- User mentions "recipe from Instagram" or "save this reel"
- User wants to extract recipe details from a video post

How It Works (MANDATORY FLOW)

ALWAYS follow this complete flow — do not stop after caption if instructions are missing:

1. User sends Instagram reel URL
2. Extract metadata using `yt-dlp` (`--dump-json`)
3. Parse the caption for recipe details
4. Check completeness: does the caption have BOTH ingredients AND instructions?
   - ✅ YES: Present the recipe
   - ❌ NO (missing instructions or incomplete): Automatically proceed to audio transcription; do NOT stop or ask the user
5. If audio transcription is needed:
   - Download video: `yt-dlp -o "/tmp/reel.mp4" "URL"`
   - Extract audio: `ffmpeg -y -i /tmp/reel.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 /tmp/reel.wav`
   - Transcribe: `whisper /tmp/reel.wav --model base --output_format txt --output_dir /tmp`
   - Merge caption ingredients with audio instructions
6. Present clean, formatted recipe (combining caption + audio as needed)
7. User decides what to do (save to notes, add to wishlist, etc.)

Completeness check heuristics:

- Has ingredients = contains 3+ quantity+item patterns (e.g., "1 cup flour", "2 lbs chicken")
- Has instructions = contains action verbs (blend, cook, bake, mix, pour, add) + sequence OR numbered steps
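
The completeness heuristics can be sketched in Python as below. This is only an illustration: the unit list, the sequence words, and the helper names (`has_ingredients`, `has_instructions`, `needs_audio`) are assumptions, not part of the skill itself.

```python
import re

# Assumed unit list; extend it for the captions you actually see.
UNITS = r"(?:cups?|tbsp|tsp|oz|lbs?|g|kg|ml|cloves?|slices?)"
# A quantity (integer, decimal, or simple fraction), a unit, then an item word.
QUANTITY_ITEM = re.compile(rf"\b\d+(?:[./]\d+)?\s*{UNITS}\b\s+\w+", re.IGNORECASE)
ACTION_VERBS = re.compile(r"\b(?:blend|cook|bake|mix|pour|add)\b", re.IGNORECASE)
# Assumed sequence words standing in for "sequence" in the heuristic.
SEQUENCE_WORDS = re.compile(r"\b(?:first|then|next|after|finally)\b", re.IGNORECASE)
NUMBERED_STEP = re.compile(r"^\s*\d+[.)]\s+\S", re.MULTILINE)


def has_ingredients(caption: str) -> bool:
    """True when the caption holds three or more quantity+item patterns."""
    return len(QUANTITY_ITEM.findall(caption)) >= 3


def has_instructions(caption: str) -> bool:
    """True for numbered steps, or action verbs combined with sequence words."""
    if NUMBERED_STEP.search(caption):
        return True
    return bool(ACTION_VERBS.search(caption) and SEQUENCE_WORDS.search(caption))


def needs_audio(caption: str) -> bool:
    """Audio transcription is needed unless both parts are present."""
    return not (has_ingredients(caption) and has_instructions(caption))
```

A caption listing only "1 cup flour", "2 lbs chicken" and "3 tbsp butter" passes the ingredient check but not the instruction check, so `needs_audio` returns `True` and the flow continues to transcription.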

Extraction Command

    yt-dlp --dump-json "https://www.instagram.com/reel/SHORTCODE/" 2>/dev/null

Key fields from JSON output:

- `description`: the caption containing the recipe
- `uploader`: creator's name
- `channel`: creator's handle
- `webpage_url`: original URL
- `like_count`: popularity indicator
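
As a rough sketch, the metadata step could be wrapped in Python like this. It assumes `yt-dlp` is on the PATH; the helper names are hypothetical, and the five keys are the yt-dlp info fields `description`, `uploader`, `channel`, `webpage_url`, and `like_count`.

```python
import json
import subprocess


def dump_json_command(url: str) -> list[str]:
    """Build the yt-dlp metadata invocation for one reel URL."""
    return ["yt-dlp", "--dump-json", url]


def pick_fields(info: dict) -> dict:
    """Keep only the fields the skill uses; missing keys become None."""
    keys = ("description", "uploader", "channel", "webpage_url", "like_count")
    return {key: info.get(key) for key in keys}


def fetch_metadata(url: str) -> dict:
    """Run yt-dlp and parse its JSON output (raises if yt-dlp fails)."""
    result = subprocess.run(
        dump_json_command(url),
        capture_output=True,
        text=True,
        check=True,
    )
    return pick_fields(json.loads(result.stdout))
```

Because `check=True` raises `subprocess.CalledProcessError` on a non-zero exit, a caller can catch that exception to trigger the error handling described later.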

Recipe Parsing

Look for these patterns in the caption:

Macros:

- "X Calories | Xg P | Xg C | Xg F"
- "Macros per serving"
- "Cal/Protein/Carbs/Fat"

Ingredients:

- Lines starting with quantities (1 cup, 2 tbsp, 24oz)
- Lines with measurement units
- Emoji bullet points (🥩 🌽 🧀 etc.)

Sections:

- "For the [component]:"
- "Ingredients:"
- "Instructions:"
- "Directions:"

Output Format

Present extracted recipe cleanly:

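    [Recipe Name]

    From @[handle]

    Macros (per serving): X cal | Xg P | Xg C | Xg F

    Ingredients

    - [ingredient 1]
    - [ingredient 2]
    - ...

    Instructions

    1. [step 1]
    2. [step 2]
    3. ...

    Source: [original URL]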

User Actions After Extraction

Let the user decide what to do:

- "Save to my recipes" → Save to Apple Notes (if meal-planner skill available)
- "Add to wishlist" → Save to `memory/recipe-wishlist.json`
- "Just show me" → Display only, no save
- "Plan this for next week" → Hand off to meal-planner skill

Wishlist Storage

Optional storage for recipes user wants to try later:

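memory/recipe-wishlist.json:

    {
      "recipes": [
        {
          "name": "Recipe Name",
          "source": "instagram",
          "sourceUrl": "https://instagram.com/reel/...",
          "handle": "@creator",
          "addedDate": "2026-01-26",
          "tried": false,
          "macros": {
            "calories": 585,
            "protein": 56,
            "carbs": 25,
            "fat": 28,
            "servings": 3
          },
          "ingredients": [...],
          "instructions": [...]
        }
      ]
    }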
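
One possible way to append an entry to the wishlist file is sketched below. The helper `add_to_wishlist` is hypothetical; the field names (`source`, `addedDate`, `tried`) follow the wishlist schema used by this skill, and the defaults are assumptions.

```python
import json
from datetime import date
from pathlib import Path


def add_to_wishlist(path: Path, recipe: dict) -> dict:
    """Append a recipe to the wishlist file, creating the file if needed."""
    if path.exists():
        data = json.loads(path.read_text(encoding="utf-8"))
    else:
        data = {"recipes": []}
    entry = {
        "source": "instagram",
        "addedDate": date.today().isoformat(),
        "tried": False,
        **recipe,  # caller-supplied fields override the defaults above
    }
    data["recipes"].append(entry)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(
        json.dumps(data, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return entry
```

For example, `add_to_wishlist(Path("memory/recipe-wishlist.json"), {"name": "Chicken bowl", "handle": "@creator"})` stores the recipe with `tried` set to `false`.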

Error Handling

If yt-dlp fails:

- Check if URL is valid Instagram reel format
- May be a private account: inform user
- Suggest user paste caption text manually as fallback
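
The URL format check could look like the sketch below. The accepted path shapes (`/reel/`, `/reels/`, `/p/`) and the shortcode character set are assumptions about Instagram URLs, not something this skill specifies.

```python
import re
from typing import Optional

# Assumed URL shapes: an instagram.com host, a reel-style path, and a shortcode.
REEL_URL = re.compile(
    r"^https?://(?:www\.)?instagram\.com/(?:reel|reels|p)/([A-Za-z0-9_-]+)/?(?:\?.*)?$"
)


def reel_shortcode(url: str) -> Optional[str]:
    """Return the shortcode for a reel-style URL, or None if it does not match."""
    match = REEL_URL.match(url.strip())
    return match.group(1) if match else None
```

A `None` result is a cue to ask the user for a corrected link before calling yt-dlp at all.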

If no recipe found in caption (IMPORTANT):

After extracting, scan the caption for recipe indicators:

- Ingredient quantities (numbers + units like oz, cups, tbsp, lbs)
- Recipe sections ("For the...", "Ingredients:", "Instructions:")
- Cooking verbs (bake, cook, sauté, mix, combine)
- Macro information (calories, protein, carbs, fat)

If none found, tell the user clearly:

"I pulled the caption but it doesn't look like the recipe is there — it might just be a teaser or the recipe is only shown in the video itself. Here's what the caption says:

[show caption]

A few options:

1. Check the comments: sometimes creators post recipes there
2. Check their bio link: it might lead to the full recipe
3. Describe what you saw in the video and I can help find a similar recipe"

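Recipe detection heuristics:

    Has recipe if caption contains:
    - 3+ ingredient-like patterns (quantity + food item)
    - OR "recipe" + ingredient list
    - OR macros + ingredients
    - OR numbered/bulleted instructions

    No recipe if caption is:
    - Mostly hashtags
    - Just a description/teaser
    - Under 100 characters
    - No quantities or measurements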
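
A minimal sketch of recipe detection follows, covering the teaser signals (short caption, mostly hashtags, no quantities) and the macro line format "X Calories | Xg P | Xg C | Xg F". The regexes, the "more than half the words are hashtags" threshold, and the choice to apply the negative signals first are all assumptions.

```python
import re
from typing import Optional

QUANTITY = re.compile(
    r"\b\d+(?:[./]\d+)?\s*(?:cups?|tbsp|tsp|oz|lbs?|g)\b", re.IGNORECASE
)
# Matches lines such as "585 Calories | 56g P | 25g C | 28g F".
MACROS = re.compile(
    r"(\d+)\s*cal(?:ories)?\s*\|\s*(\d+)\s*g\s*P\s*\|\s*(\d+)\s*g\s*C\s*\|\s*(\d+)\s*g\s*F",
    re.IGNORECASE,
)
NUMBERED_OR_BULLETED = re.compile(r"^\s*(?:\d+[.)]|[-•*])\s+\S", re.MULTILINE)


def parse_macros(caption: str) -> Optional[dict]:
    """Extract calories/protein/carbs/fat from a pipe-separated macro line."""
    match = MACROS.search(caption)
    if not match:
        return None
    calories, protein, carbs, fat = (int(value) for value in match.groups())
    return {"calories": calories, "protein": protein, "carbs": carbs, "fat": fat}


def has_recipe(caption: str) -> bool:
    """Apply the 'no recipe' signals first, then the 'has recipe' signals."""
    body = MACROS.sub("", caption)  # keep macro grams out of the ingredient count
    quantities = len(QUANTITY.findall(body))
    words = caption.split()
    hashtags = sum(1 for word in words if word.startswith("#"))
    if len(caption) < 100 or quantities == 0 or hashtags * 2 > len(words):
        return False
    return (
        quantities >= 3
        or parse_macros(caption) is not None
        or NUMBERED_OR_BULLETED.search(caption) is not None
        or "recipe" in caption.lower()
    )
```

Removing the macro line before counting quantities prevents "56g P" and similar tokens from being mistaken for ingredients.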

Integration with meal-planner

The meal-planner skill can reference this skill:

- When planning meals, check wishlist for untried recipes
- Suggest wishlist recipes that match pantry items
- Mark recipes as "tried" after they're used in a meal plan

Audio Transcription (V2) — MANDATORY FALLBACK

When caption is missing instructions, ALWAYS transcribe the audio automatically. Do not stop and ask the user — just do it. This is the most common case since creators often put ingredients in captions but speak the instructions.

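Step 1: Download video

    yt-dlp -o /tmp/reel.mp4 "https://instagram.com/reel/XXX"

Step 2: Extract audio

    ffmpeg -i /tmp/reel.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 /tmp/reel.wav

Step 3: Transcribe with Whisper

    whisper /tmp/reel.wav --model base --output_format txt --output_dir /tmp

Step 4: Parse transcript for recipe
Look for spoken cooking instructions and any ingredients mentioned verbally.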
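
The download, extract, and transcribe steps can be chained from Python as sketched here. The helper names are hypothetical, all three tools are assumed to be on the PATH, and the transcript path assumes Whisper names its text output after the input file (`reel.wav` to `reel.txt`).

```python
import subprocess
from pathlib import Path


def transcription_commands(url: str, workdir: str = "/tmp") -> list[list[str]]:
    """Return the yt-dlp, ffmpeg, and whisper invocations in order."""
    video = f"{workdir}/reel.mp4"
    audio = f"{workdir}/reel.wav"
    return [
        ["yt-dlp", "-o", video, url],
        # 16 kHz mono PCM audio, overwriting any previous file (-y).
        ["ffmpeg", "-y", "-i", video, "-vn", "-acodec", "pcm_s16le",
         "-ar", "16000", "-ac", "1", audio],
        ["whisper", audio, "--model", "base",
         "--output_format", "txt", "--output_dir", workdir],
    ]


def transcribe(url: str, workdir: str = "/tmp") -> str:
    """Run each step in sequence and return the transcript text."""
    for command in transcription_commands(url, workdir):
        subprocess.run(command, check=True)
    # Assumed output name: Whisper's txt writer reuses the input basename.
    return Path(workdir, "reel.txt").read_text(encoding="utf-8")
```

Keeping the command list separate from execution makes the steps easy to log or dry-run before touching the network.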

Inference for Missing Measurements

ALWAYS infer quantities when not provided. Never present a recipe without amounts — estimate based on context and standard package sizes.

Vague Language → Specific Amounts

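| What they say | Infer |
|---|---|
| "some chicken" | ~1 lb |
| "a bit of garlic" | 2-3 cloves |
| "a handful of spinach" | ~2 cups |
| "a drizzle of oil" | 1-2 tbsp |
| "season to taste" | ½ tsp salt, ¼ tsp pepper |
| "a splash of soy sauce" | 1-2 tbsp |
| "a few tablespoons" | 2-3 tbsp |
| "some rice" | 1 cup dry |
| "cheese on top" | ½ - 1 cup shredded |
| "diced onion" | 1 medium onion |
| "bell peppers" | 2 peppers |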

Standard Package Sizes (when item mentioned without amount)

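| Ingredient | Standard Package | Infer |
|---|---|---|
| Puff pastry | 17oz sheet | 1 sheet |
| Ground beef/turkey | 1 lb package | 1 lb |
| Chicken breast | ~1.5 lb package | 1.5 lbs |
| Sausage | 14oz / 4-5 links | 1 package |
| Bacon | 12oz / 12 slices | ½ package (6 slices) |
| Shredded cheese | 8oz bag | 1-2 |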

Context-Aware Scaling

By recipe type:

- Stir fry for 2 → 1 lb protein, 4 cups veggies
- Soup/stew → 1.5-2 lbs protein, 4 cups broth
- Sheet pan meal → 1.5 lbs protein, 3-4 cups veggies
- Appetizers → smaller portions, estimate ~12-15 pieces per batch

By servings mentioned:

- "Serves 4" → Scale standard amounts for 4
- "Meal prep for the week" → Assume 5-8 servings
- No servings mentioned → Default to 4 servings

By protein target (if user has macro goals):

- 40-50g protein per serving → ~6-8oz cooked meat per portion
- Scale recipe protein accordingly
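
The scaling rules reduce to simple arithmetic, sketched below. The 7 oz figure is an assumed midpoint of the 6-8 oz range, and both helper names are hypothetical.

```python
from typing import Optional

DEFAULT_SERVINGS = 4  # used when the source mentions no serving count
# Assumed midpoint of the 6-8 oz cooked-meat range for a 40-50 g protein target.
COOKED_OZ_PER_SERVING = 7.0


def scale_amount(amount: float, base_servings: int,
                 target_servings: Optional[int] = None) -> float:
    """Scale an amount linearly from base_servings to target_servings."""
    target = target_servings or DEFAULT_SERVINGS
    return amount * target / base_servings


def cooked_meat_needed(servings: Optional[int] = None) -> float:
    """Total cooked meat in ounces for the protein target across all servings."""
    return (servings or DEFAULT_SERVINGS) * COOKED_OZ_PER_SERVING
```

For example, the "stir fry for 2" baseline of 1 lb protein becomes `scale_amount(1.0, 2, 4) == 2.0` lb for four servings.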

Output Format

Always present inferred amounts clearly:
CODEBLOCK7

Mark inferred quantities with (estimated) so user knows what came from the source vs inference.
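
One way to keep track of which amounts were inferred is to carry a flag on each ingredient and add the marker only when rendering. The `Ingredient` dataclass and both functions below are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass
class Ingredient:
    name: str
    amount: str
    inferred: bool = False  # True when the amount did not come from the source


def format_ingredient(item: Ingredient) -> str:
    """Render one ingredient line, tagging inferred amounts."""
    suffix = " (estimated)" if item.inferred else ""
    return f"- {item.amount} {item.name}{suffix}"


def format_ingredients(items: list[Ingredient]) -> str:
    """Render the whole ingredient list, one line per item."""
    return "\n".join(format_ingredient(item) for item in items)
```

With this sketch, `format_ingredient(Ingredient("chicken", "~1 lb", inferred=True))` yields `- ~1 lb chicken (estimated)`, while source-provided amounts render without the marker.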

Combined Extraction Flow

CODEBLOCK8

Frame Analysis

Extract key frames and analyze with vision model.

Extract frames:
CODEBLOCK9
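
A hedged sketch of the frame extraction step is shown below. The original command is not reproduced here, so the sampling interval (one frame every 3 seconds), the JPEG quality setting, and the output naming are all assumptions; `fps` and `-q:v` are standard ffmpeg options.

```python
def frame_command(video: str = "/tmp/reel.mp4", every_seconds: int = 3,
                  out_dir: str = "/tmp") -> list[str]:
    """Build an ffmpeg call that saves one JPEG every `every_seconds` seconds."""
    return [
        "ffmpeg", "-y", "-i", video,
        "-vf", f"fps=1/{every_seconds}",  # sample rate expressed as frames per second
        "-q:v", "2",                      # high JPEG quality so overlay text stays legible
        f"{out_dir}/frame_%03d.jpg",
    ]
```

The resulting list can be passed to `subprocess.run(..., check=True)`; the numbered files are then sent to the vision model one by one.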

Send to vision model:
Use Claude's image analysis to read each frame:

- Recipe cards / title screens
- Ingredient lists shown on screen
- Measurements in text overlays
- Step-by-step instructions displayed

Vision prompt:
CODEBLOCK10

Merge strategy:

- Audio transcript = primary source (spoken instructions)
- Frame analysis = supplement (exact measurements, recipe cards)
- Combine both; prefer specific measurements from visual over inferred from audio
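
The merge strategy might be expressed as follows. The data shapes are assumptions: audio ingredients carry an `inferred` flag, and frame results map ingredient names to the amount read on screen.

```python
def merge_ingredients(audio: dict[str, dict], frames: dict[str, str]) -> dict[str, dict]:
    """Merge audio-derived ingredients with amounts read from video frames.

    audio maps name -> {"amount": str, "inferred": bool}.
    frames maps name -> amount string seen on screen.
    """
    merged = {name: dict(info) for name, info in audio.items()}
    for name, amount in frames.items():
        current = merged.get(name)
        # A visual measurement fills a gap or replaces an inferred amount,
        # but an amount the creator actually spoke stays as the primary source.
        if current is None or current["inferred"]:
            merged[name] = {"amount": amount, "inferred": False}
    return merged
```

An inferred "~1 lb" of chicken from audio would be replaced by "1.5 lbs" read from a recipe card, while a spoken "1 cup" of rice is left untouched.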

Pinned Comment Detection

Scan caption for these phrases (case-insensitive):

- "recipe pinned"
- "pinned in comments"
- "check comments"
- "in the comments"
- "comment below"
- "recipe below"
- "full recipe in comments"

If detected, flag and notify user after extraction:

"Heads up — the creator said the recipe is pinned in the comments.
I got what I could from the audio, but yt-dlp can't access pinned comments
without login. If you want the exact recipe, copy the pinned comment and
send it to me — I'll format it properly."
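
The phrase scan is a plain case-insensitive substring check. A minimal sketch, with `mentions_pinned_recipe` as a hypothetical helper name:

```python
PINNED_PHRASES = (
    "recipe pinned",
    "pinned in comments",
    "check comments",
    "in the comments",
    "comment below",
    "recipe below",
    "full recipe in comments",
)


def mentions_pinned_recipe(caption: str) -> bool:
    """True when the caption points the viewer to the comments for the recipe."""
    lowered = caption.lower()
    return any(phrase in lowered for phrase in PINNED_PHRASES)
```

When this returns `True`, the notification above is shown after extraction rather than in place of it.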

Requirements

- yt-dlp: INLINECODE11
- ffmpeg: INLINECODE13
- whisper: pip3 install openai-whisper (runs locally, no API key)
- No Instagram login required for public reels

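Instagram Recipe Extractor

Extract recipes from Instagram Reels using a multi-layered approach:

1. Caption parsing: instant; check the description first
2. Audio transcription: Whisper (runs locally, no API key)
3. Frame analysis: vision model to read on-screen text

No Instagram login required. Works on public Reels.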

When to Use

- User sends an Instagram Reel link
- User mentions a recipe from Instagram or asks to save a Reel
- User wants to extract recipe details from a video post

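How It Works (MANDATORY FLOW)

ALWAYS follow this complete flow; do not stop after the caption if instructions are missing:

1. User sends Instagram Reel URL
2. Extract metadata using `yt-dlp` (`--dump-json`)
3. Parse the caption for recipe details
4. Check completeness: does the caption contain both ingredients and instructions?
   - ✅ YES: Present the recipe
   - ❌ NO (missing instructions or incomplete): Automatically proceed to audio transcription; do not stop or ask the user
5. If audio transcription is needed:
   - Download video: `yt-dlp -o /tmp/reel.mp4 URL`
   - Extract audio: `ffmpeg -y -i /tmp/reel.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 /tmp/reel.wav`
   - Transcribe: `whisper /tmp/reel.wav --model base --output_format txt --output_dir /tmp`
   - Merge caption ingredients with audio instructions
6. Present a clean, formatted recipe (combining caption and audio as needed)
7. User decides what to do next (save to notes, add to wishlist, etc.)

Completeness check heuristics:

- Has ingredients = contains 3+ quantity+item patterns (e.g., 1 cup flour, 2 lbs chicken)
- Has instructions = contains action verbs (blend, cook, bake, mix, pour, add) + sequence OR numbered steps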

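Extraction Command

    yt-dlp --dump-json "https://www.instagram.com/reel/SHORTCODE/" 2>/dev/null

Key fields from JSON output:

- `description`: the caption containing the recipe
- `uploader`: creator's name
- `channel`: creator's handle
- `webpage_url`: original URL
- `like_count`: popularity indicator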

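Recipe Parsing

Look for these patterns in the caption:

Macros:

- X Calories | Xg P | Xg C | Xg F
- Macros per serving
- Cal/Protein/Carbs/Fat

Ingredients:

- Lines starting with quantities (1 cup, 2 tbsp, 24oz)
- Lines with measurement units
- Emoji bullet points (🥩 🌽 🧀 etc.)

Sections:

- For the [component]:
- Ingredients:
- Instructions:
- Directions: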

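Output Format

Present the extracted recipe cleanly:

    [Recipe Name]

    From @[handle]

    Macros (per serving): X cal | Xg P | Xg C | Xg F

    Ingredients

    - [ingredient 1]
    - [ingredient 2]
    - ...

    Instructions

    1. [step 1]
    2. [step 2]
    3. ...

    Source: [original URL]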

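User Actions After Extraction

Let the user decide what to do:

- Save to my recipes → Save to Apple Notes (if meal-planner skill available)
- Add to wishlist → Save to `memory/recipe-wishlist.json`
- Just show me → Display only, no save
- Plan this for next week → Hand off to meal-planner skill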

Wishlist Storage

Optional storage for recipes the user wants to try later:

memory/recipe-wishlist.json:

    {
      "recipes": [
        {
          "name": "Recipe Name",
          "source": "instagram",
          "sourceUrl": "https://instagram.com/reel/...",
          "handle": "@creator",
          "addedDate": "2026-01-26",
          "tried": false,
          "macros": {
            "calories": 585,
            "protein": 56,
            "carbs": 25,
            "fat": 28,
            "servings": 3
          },
          "ingredients": [...],
          "instructions": [...]
        }
      ]
    }

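Error Handling

If yt-dlp fails:

- Check if URL is valid Instagram Reel format
- May be a private account: inform user
- Suggest user paste caption text manually as fallback

If no recipe found in caption (IMPORTANT):

After extracting, scan the caption for recipe indicators:

- Ingredient quantities (numbers + units like oz, cups, tbsp, lbs)
- Recipe sections ("For the...", "Ingredients:", "Instructions:")
- Cooking verbs (bake, cook, sauté, mix, combine)
- Macro information (calories, protein, carbs, fat)

If none found, tell the user clearly:

"I pulled the caption but it doesn't look like the recipe is there. It might just be a teaser, or the recipe is only shown in the video itself. Here's what the caption says:

[show caption]

A few options:

1. Check the comments: sometimes creators post recipes there
2. Check their bio link: it might lead to the full recipe
3. Describe what you saw in the video and I can help find a similar recipe"

Recipe detection heuristics:

Has recipe if caption contains:

- 3+ ingredient-like patterns (quantity + food item)
- OR "recipe" + ingredient list
- OR macros + ingredients
- OR numbered/bulleted instructions

No recipe if caption is:

- Mostly hashtags
- Just a description/teaser
- Under 100 characters
- No quantities or measurements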

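Integration with meal-planner

The meal-planner skill can reference this skill:

- When planning meals, check wishlist for untried recipes
- Suggest wishlist recipes that match pantry items
- Mark recipes as "tried" after they're used in a meal plan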

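Audio Transcription (V2): MANDATORY FALLBACK

When the caption is missing instructions, ALWAYS transcribe the audio automatically. Do not stop and ask the user; just do it. This is the most common case, since creators often put ingredients in the caption but speak the instructions.

Step 1: Download video

    yt-dlp -o /tmp/reel.mp4 "https://instagram.com/reel/XXX"

Step 2: Extract audio

    ffmpeg -i /tmp/reel.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 /tmp/reel.wav

Step 3: Transcribe with Whisper

    whisper /tmp/reel.wav --model base --output_format txt --output_dir /tmp

Step 4: Parse transcript for recipe
Look for spoken cooking instructions and any ingredients mentioned verbally.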

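Inference for Missing Measurements

ALWAYS infer quantities when they are not provided. Never present a recipe without amounts; estimate based on context and standard package sizes.

Vague Language → Specific Amounts

| What they say | Infer |
|---|---|
| "some chicken" | ~1 lb |
| "a bit of garlic" | 2-3 cloves |
| "a handful of spinach" | ~2 cups |
| "a drizzle of oil" | 1-2 tbsp |
| "season to taste" | ½ tsp salt, ¼ tsp pepper |
| "a splash of soy sauce" | 1-2 tbsp |
| "a few tablespoons" | 2-3 tbsp |
| "some rice" | 1 cup dry |
| "cheese on top" | ½ - 1 cup shredded |
| "diced onion" | 1 medium onion |
| "bell peppers" | 2 peppers |

Standard Package Sizes (when item mentioned without amount)

| Ingredient | Standard Package | Infer |
|---|---|---|
| Puff pastry | 17oz sheet | 1 sheet |
| Ground beef/turkey | 1 lb package | 1 lb |
| Chicken breast | ~1.5 lb package | 1.5 lbs |
| Sausage | 14oz / 4-5 links | 1 package |
| Bacon | 12oz / 12 slices | ½ package (6 slices) |
| Shredded cheese | 8oz bag | 1-2 |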

