DJ Set Ripper
Extract tracklists from DJ sets and download each track individually.
⚠️ Legal Notice: This skill is intended for downloading music you have the right to access — purchases, free releases, creative commons, etc. Respect copyright laws in your jurisdiction. The author is not responsible for misuse.
Dependencies
Same as dj-mp3-sourcer (yt-dlp, ffmpeg/ffprobe, spotdl). No additional dependencies.
Workflow
1. Extract Page Content
Fetch the set URL and extract raw text (description, metadata, comments):
YouTube:
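```bash
# Dump the video metadata as JSON and print only the description (<URL> is the set URL)
yt-dlp --dump-json "<URL>" | jq -r .description
```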
SoundCloud / Mixcloud:
Use web_fetch to grab the page content in markdown mode.
1001Tracklists:
Use web_fetch — this source has the most structured data. Prefer it when available.
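The source-specific choice above can be sketched as a small dispatcher. This is a minimal sketch, not part of the skill: the function name `pick_extractor` and the host lists are assumptions made for illustration.

```python
from urllib.parse import urlparse

# Hypothetical host lists; extend them to match the sources you actually use.
YT_DLP_HOSTS = {"youtube.com", "youtu.be", "music.youtube.com"}
WEB_FETCH_HOSTS = {"soundcloud.com", "mixcloud.com", "1001tracklists.com"}


def pick_extractor(url: str) -> str:
    """Return "yt-dlp", "web_fetch", or "unknown" for a set URL."""
    host = (urlparse(url).hostname or "").lower()
    # Drop common subdomain prefixes so www./m. variants map to the same host.
    for prefix in ("www.", "m."):
        if host.startswith(prefix):
            host = host[len(prefix):]
    if host in YT_DLP_HOSTS:
        return "yt-dlp"
    if host in WEB_FETCH_HOSTS:
        return "web_fetch"
    return "unknown"
```

An "unknown" result is a reasonable point to fall back to web_fetch or to ask the user for the tracklist.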
2. Parse the Tracklist (LLM-Powered)
Feed the raw page content to the model with this prompt structure:
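```
Extract all tracks from this DJ set description. Return a JSON array of objects:
[{"number": 1, "timestamp": "0:00", "artist": "Artist Name", "title": "Track Title (Remix Name)"}]

Rules:
- Keep the mix/remix name in the title (e.g. Original Mix, Extended Mix, Remix)
- If a track is marked "ID - ID" or "ID", set both artist and title to "ID"
- If there is only a timestamp with no track information, skip it
- Normalize artist names (fix ALL CAPS, etc.)
- If there is no timestamp, set timestamp to null
- Number tracks sequentially starting from 1

Raw content:
{description_text}
```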
If parsing returns zero tracks, inform the user the tracklist couldn't be extracted and suggest:
- Checking 1001Tracklists manually
- Pasting the tracklist directly
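Model output is not guaranteed to be clean JSON, so it helps to validate it before downloading anything. The sketch below is one possible approach, assuming the response contains a single JSON array somewhere in the text; `parse_tracklist` is a hypothetical helper name.

```python
import json


def parse_tracklist(model_output: str) -> list[dict]:
    """Extract the JSON array from the model response and renumber the tracks.

    Returns an empty list when nothing usable is found, which is the
    "zero tracks" case described above.
    """
    start = model_output.find("[")
    end = model_output.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        raw = json.loads(model_output[start:end + 1])
    except json.JSONDecodeError:
        return []

    tracks = []
    for item in raw:
        if not isinstance(item, dict):
            continue
        artist = str(item.get("artist") or "").strip()
        title = str(item.get("title") or "").strip()
        if not artist or not title:
            continue  # timestamp-only rows carry no track information
        tracks.append({
            "number": len(tracks) + 1,  # sequential numbering starting from 1
            "timestamp": item.get("timestamp"),  # may be None
            "artist": artist,
            "title": title,
        })
    return tracks
```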
3. Download Each Track
For each parsed track (skipping any with artist AND title = "ID"):
- Use the dj-mp3-sourcer workflow: search sources in priority order, prefer extended mixes, and download or surface purchase links
- Use sessions_spawn to parallelize downloads (batches of 3-5 at a time to avoid rate limits)
- Save files to: `~/Downloads/{set-name}/`
Set name is derived from the mix title (sanitized for filesystem).
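The skill does not spell out the sanitization rules, so the following is only a sketch of one conservative choice: strip characters that are invalid on common filesystems and collapse whitespace. The function name and the 120-character limit are assumptions.

```python
import re


def sanitize_set_name(title: str, max_len: int = 120) -> str:
    """Turn a mix title into a folder name that is safe on common filesystems."""
    # Replace reserved characters (Windows is the strictest) and control chars.
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', " ", title)
    # Collapse whitespace runs, then trim spaces and dots from both ends.
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    return cleaned[:max_len].rstrip(" .") or "untitled-set"
```

For example, `sanitize_set_name("Argy @ Tomorrowland 2024: Main Stage / Full Set")` yields `Argy @ Tomorrowland 2024 Main Stage Full Set`.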
4. Optionally Download the Full Mix
Ask the user if they also want the full mix downloaded. If yes:
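```bash
# Extract audio as best-quality mp3, embed thumbnail and metadata (<URL> is the set URL)
yt-dlp -x --audio-format mp3 --audio-quality 0 \
  --embed-thumbnail --add-metadata \
  -o "$HOME/Downloads/{set-name}/{set-name} [Full Mix].%(ext)s" \
  "<URL>"
```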
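When the command is launched from code rather than a shell, passing it as an argument list avoids quoting problems with spaces and brackets in the output path. A minimal sketch, assuming the same yt-dlp flags as the full-mix command in this step; `full_mix_command` and its `root` parameter are hypothetical.

```python
from pathlib import Path


def full_mix_command(url: str, set_name: str, root: str = "~/Downloads") -> list[str]:
    """Build the yt-dlp argument list for downloading the full mix as mp3."""
    out_dir = Path(root).expanduser() / set_name
    # %(ext)s is filled in by yt-dlp; set_name should already be sanitized.
    template = str(out_dir / f"{set_name} [Full Mix].%(ext)s")
    return [
        "yt-dlp", "-x", "--audio-format", "mp3", "--audio-quality", "0",
        "--embed-thumbnail", "--add-metadata",
        "-o", template,
        url,
    ]
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` without going through a shell.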
5. Normalize Filenames
After all downloads complete (not per-batch — wait for every sub-agent to finish), run the normalization script once:
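```bash
# 1. Write the parsed tracklist to JSON
cat > /tmp/tracklist.json << 'EOF'
[{"artist": "Artist", "title": "Title"}, ...]
EOF

# 2. Run normalization
scripts/normalize-filenames.sh "$HOME/Downloads/{set-name}" /tmp/tracklist.json
```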
This fuzzy-matches each mp3 to a tracklist entry and renames it to a clean Artist - Title.mp3. It handles NA - prefixes, (Official Video) junk, wrong artist credits, label names, etc.
Critical: Run this in the parent agent after all batches return — do NOT rely on sub-agents to rename. The parsed tracklist is the source of truth for filenames.
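The internals of scripts/normalize-filenames.sh are not shown here, so the following is not that script. It is a hedged illustration of the fuzzy-matching idea using Python's standard `difflib`; the junk patterns, the 0.6 threshold, and the function names are all assumptions.

```python
import re
from difflib import SequenceMatcher

# Assumed junk patterns: "(Official ...)"-style suffixes and a leading "NA - ".
JUNK = re.compile(
    r"\((official[^)]*|lyrics?[^)]*|audio|visuali[sz]er)\)|^na\s*-\s*",
    re.IGNORECASE,
)


def clean(filename: str) -> str:
    """Lowercase the file stem and strip known junk before comparing."""
    stem = re.sub(r"\.mp3$", "", filename, flags=re.IGNORECASE)
    stem = JUNK.sub(" ", stem)
    return re.sub(r"\s+", " ", stem).strip().lower()


def best_match(filename: str, tracklist: list[dict], threshold: float = 0.6):
    """Return the clean target filename, or None if nothing is close enough."""
    cleaned = clean(filename)
    best, best_score = None, 0.0
    for track in tracklist:
        target = f"{track['artist']} - {track['title']}".lower()
        score = SequenceMatcher(None, cleaned, target).ratio()
        if score > best_score:
            best, best_score = track, score
    if best is None or best_score < threshold:
        return None
    return f"{best['artist']} - {best['title']}.mp3"
```

Returning `None` below the threshold keeps an unrelated file from being renamed to the wrong track; such files are better left untouched and flagged in the log.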
6. Generate the Log File
Create ~/Downloads/{set-name}/{timestamp}.log with format:
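```
DJ Set Ripper Log
=================
Set: {set title}
URL: {original url}
Date: {ISO timestamp}
Tracks found: {total}

 #  | Artist        | Title                  | Status          | Source   | Bitrate | Size  | File/Link
----|---------------|------------------------|-----------------|----------|---------|-------|----------
 01 | Argy          | Aria (Original Mix)    | ✅ Downloaded   | spotdl   | 320k    | 8.2MB | Argy - Aria (Original Mix).mp3
 02 | ID            | ID                     | ⬛ Unidentified | -        | -       | -     | -
 03 | Massano       | Odyssey                | ✅ Downloaded   | youtube  | 271k    | 6.5MB | Massano - Odyssey.mp3
 04 | Boris Brejcha | Gravity (Extended Mix) | 🛒 Purchase     | beatport | -       | -     | https://...
 05 | Some Bootleg  | Unreleased VIP         | ❌ Not found    | -        | -       | -     | -

Summary: 2 downloaded, 1 purchase link, 1 not found, 1 unidentified
Total size: ~XXM (tracks) + XXM (full mix)
Full mix: ✅ Downloaded → {set-name} [Full Mix].mp3

Notes:
- Bitrate via ffprobe -v quiet -show_entries format=bit_rate -of csv=p=0
- File size via ls -lh
```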
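Computing the summary line from the per-track results, rather than writing it by hand, keeps the counts consistent with the table. A minimal sketch: the status keys (`downloaded`, `purchase`, `not_found`, `unidentified`) and the row fields are assumptions, and rows are unpadded for brevity.

```python
from collections import Counter


def format_row(number: int, row: dict) -> str:
    """Render one log row; missing fields are shown as "-"."""
    cells = [f"{number:02d}", row["artist"], row["title"], row["status"]]
    cells += [row.get(key) or "-" for key in ("source", "bitrate", "size", "file")]
    return " | ".join(cells)


def summarize(rows: list[dict]) -> str:
    """Build the summary line from the status of every row."""
    counts = Counter(row["status"] for row in rows)
    links = "purchase link" if counts["purchase"] == 1 else "purchase links"
    return (
        f"Summary: {counts['downloaded']} downloaded, "
        f"{counts['purchase']} {links}, "
        f"{counts['not_found']} not found, "
        f"{counts['unidentified']} unidentified"
    )
```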
Edge Cases
- No tracklist in description: check 1001Tracklists via web_search: `{set title} site:1001tracklists.com`
- "ID - ID" tracks: log as unidentified, don't attempt download
- Bootlegs / mashups: search anyway, but expect failures. Log as not found with a note
- B2B sets: multiple artists in the set title, handle gracefully
- Duplicate tracks: deduplicate by artist+title before downloading
- Very long sets (50+ tracks): batch in groups of 5, report progress as batches complete
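Three of the cases above (unidentified tracks, duplicates, batching) reduce to simple list processing before any download starts. The sketch below shows one way to build the download queue; the helper names are hypothetical and the case-insensitive comparison is an assumption about what counts as a duplicate.

```python
def is_unidentified(track: dict) -> bool:
    """True for "ID - ID" entries, which are logged but never downloaded."""
    return track["artist"].strip().upper() == "ID" and track["title"].strip().upper() == "ID"


def dedupe(tracks: list[dict]) -> list[dict]:
    """Keep the first occurrence of each artist+title pair, ignoring case."""
    seen, unique = set(), []
    for track in tracks:
        key = (track["artist"].strip().casefold(), track["title"].strip().casefold())
        if key not in seen:
            seen.add(key)
            unique.append(track)
    return unique


def download_queue(tracks: list[dict]) -> list[dict]:
    """Tracks that should actually be downloaded, in set order."""
    return dedupe([t for t in tracks if not is_unidentified(t)])


def batches(items: list, size: int = 5):
    """Yield consecutive groups of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Filtering unidentified entries before deduplication matters: every "ID - ID" row still appears in the log, but none of them reaches the queue.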
Configuration
| Setting | Default | Notes |
|---|---|---|
| Output directory | `~/Downloads/{set-name}/` | Per-set subfolder |
| Format | mp3 320k | Via dj-mp3-sourcer |
| Download full mix | ask user | Can be set to always/never |
| Free only mode | true | Passed through to dj-mp3-sourcer (skip paid sources, use spotdl/yt-dlp only) |
| Parallel downloads | 5 | Max concurrent track downloads |
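The skill does not define a configuration file format, so the following is only a sketch of how these settings could be held in code with the defaults from the table. The class and field names are assumptions.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class RipperConfig:
    """Hypothetical container for the settings listed in the table above."""
    output_root: str = "~/Downloads"
    audio_format: str = "mp3"
    bitrate: str = "320k"
    download_full_mix: str = "ask"  # "ask", "always", or "never"
    free_only: bool = True
    parallel_downloads: int = 5

    def __post_init__(self):
        if self.download_full_mix not in {"ask", "always", "never"}:
            raise ValueError("download_full_mix must be ask, always, or never")
        if self.parallel_downloads < 1:
            raise ValueError("parallel_downloads must be at least 1")

    def set_dir(self, set_name: str) -> Path:
        """Per-set output folder under the output root."""
        return Path(self.output_root).expanduser() / set_name
```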
DJ Set Ripper
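Extract tracklists from DJ sets and download each track individually.
⚠️ Legal Notice: This skill is intended for downloading music you have the right to access, such as purchases, free releases, and Creative Commons works. Respect the copyright laws in your jurisdiction. The author is not responsible for misuse.
Dependencies
Same as dj-mp3-sourcer (yt-dlp, ffmpeg/ffprobe, spotdl). No additional dependencies.
Workflow
1. Extract Page Content
Fetch the set URL and extract raw text (description, metadata, comments):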
YouTube:
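```bash
# Dump the video metadata as JSON and print only the description (<URL> is the set URL)
yt-dlp --dump-json "<URL>" | jq -r .description
```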
SoundCloud / Mixcloud:
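Use web_fetch to grab the page content in markdown mode.
1001Tracklists:
Use web_fetch. This source has the most structured data; prefer it when available.
2. Parse the Tracklist (LLM-Powered)
Feed the raw page content to the model with this prompt structure: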
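```
Extract all tracks from this DJ set description. Return a JSON array of objects:
[{"number": 1, "timestamp": "0:00", "artist": "Artist Name", "title": "Track Title (Remix Name)"}]

Rules:
- Keep the mix/remix name in the title (e.g. Original Mix, Extended Mix, Remix)
- If a track is marked "ID - ID" or "ID", set both artist and title to "ID"
- If there is only a timestamp with no track information, skip it
- Normalize artist names (fix ALL CAPS, etc.)
- If there is no timestamp, set timestamp to null
- Number tracks sequentially starting from 1

Raw content:
{description_text}
```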
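If parsing returns zero tracks, inform the user that the tracklist couldn't be extracted and suggest:
- Checking 1001Tracklists manually
- Pasting the tracklist directly
3. Download Each Track
For each parsed track (skipping any where both artist and title are "ID"):
- Use the dj-mp3-sourcer workflow: search sources in priority order, prefer extended mixes, and download or surface purchase links
- Use sessions_spawn to parallelize downloads (batches of 3-5 at a time to avoid rate limits)
- Save files to: `~/Downloads/{set-name}/`
The set name is derived from the mix title (sanitized for the filesystem).
4. Optionally Download the Full Mix
Ask the user whether they also want the full mix downloaded. If yes: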
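```bash
# Extract audio as best-quality mp3, embed thumbnail and metadata (<URL> is the set URL)
yt-dlp -x --audio-format mp3 --audio-quality 0 \
  --embed-thumbnail --add-metadata \
  -o "$HOME/Downloads/{set-name}/{set-name} [Full Mix].%(ext)s" \
  "<URL>"
```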
5. Normalize Filenames
After all downloads complete (not per batch; wait for every sub-agent to finish), run the normalization script once:
```bash
# 1. Write the parsed tracklist to JSON
cat > /tmp/tracklist.json << 'EOF'
[{"artist": "Artist", "title": "Title"}, ...]
EOF

# 2. Run normalization
scripts/normalize-filenames.sh "$HOME/Downloads/{set-name}" /tmp/tracklist.json
```
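This script fuzzy-matches each mp3 to a tracklist entry and renames it to a clean Artist - Title.mp3. It handles NA - prefixes, (Official Video) junk, wrong artist credits, label names, etc.
Critical: Run this in the parent agent after all batches return. Do NOT rely on sub-agents to rename. The parsed tracklist is the source of truth for filenames.
6. Generate the Log File
Create ~/Downloads/{set-name}/{timestamp}.log with this format: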
```
DJ Set Ripper Log
=================
Set: {set title}
URL: {original url}
Date: {ISO timestamp}
Tracks found: {total}

 #  | Artist        | Title                  | Status          | Source   | Bitrate | Size  | File/Link
----|---------------|------------------------|-----------------|----------|---------|-------|----------
 01 | Argy          | Aria (Original Mix)    | ✅ Downloaded   | spotdl   | 320k    | 8.2MB | Argy - Aria (Original Mix).mp3
 02 | ID            | ID                     | ⬛ Unidentified | -        | -       | -     | -
 03 | Massano       | Odyssey                | ✅ Downloaded   | youtube  | 271k    | 6.5MB | Massano - Odyssey.mp3
 04 | Boris Brejcha | Gravity (Extended Mix) | 🛒 Purchase     | beatport | -       | -     | https://...
 05 | Some Bootleg  | Unreleased VIP         | ❌ Not found    | -        | -       | -     | -

Summary: 2 downloaded, 1 purchase link, 1 not found, 1 unidentified
Total size: ~XXM (tracks) + XXM (full mix)
Full mix: ✅ Downloaded → {set-name} [Full Mix].mp3

Notes:
- Bitrate via ffprobe -v quiet -show_entries format=bit_rate -of csv=p=0
- File size via ls -lh
```
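Edge Cases
- No tracklist in description: check 1001Tracklists via web_search: `{set title} site:1001tracklists.com`
- "ID - ID" tracks: log as unidentified, don't attempt download
- Bootlegs / mashups: search anyway, but expect failures. Log as not found with a note
- B2B sets: multiple artists in the set title, handle gracefully
- Duplicate tracks: deduplicate by artist+title before downloading
- Very long sets (50+ tracks): batch in groups of 5, report progress as batches complete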
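Configuration
| Setting | Default | Notes |
|---|---|---|
| Output directory | `~/Downloads/{set-name}/` | Per-set subfolder |
| Format | mp3 320k | Via dj-mp3-sourcer |
| Download full mix | ask user | Can be set to always/never |
| Free only mode | true | Passed through to dj-mp3-sourcer (skip paid sources, use spotdl/yt-dlp only) |
| Parallel downloads | 5 | Max concurrent track downloads |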