Audio Script Writer
Overview
A content transformation tool that converts written medical and scientific materials into professionally structured audio scripts suitable for podcasts, educational videos, audiobooks, and voiceover narration.
Key Capabilities:
- Format Conversion: Research papers → podcast scripts
- Spoken Word Optimization: Sentence restructuring for listening
- Pronunciation Guides: Medical terminology phonetic spelling
- Timing Estimation: Duration calculations for production planning
- Multi-Format Output: Podcast, video, lecture, audiobook templates
- Voice Direction: Tone, pace, and emphasis cues for narrators
When to Use
✅ Use this skill when:
- Creating medical education podcasts from journal articles
- Converting conference presentations to video scripts
- Developing audiobook versions of medical textbooks
- Scripting patient education audio materials
- Producing research summary videos for social media
- Adapting written case reports for audio case studies
- Creating voiceover scripts for e-learning modules
❌ Do NOT use when:
- Live presentation without a script → Use improvisation
- Highly visual content (surgery videos) → Use visual-focused tools
- Interactive audio (Q&A format) → Use dialogue scripting tools
- Music or sound design planning → Use audio production software
- Voice recording itself → This creates scripts, not audio
Integration:
- Upstream: abstract-summarizer (content condensation), lay-summary-gen (patient-friendly language)
- Downstream: medical-translation (multi-language scripts), voice-cloning-tool (AI narration)
Core Capabilities
1. Spoken Word Transformation
Convert written text to conversational audio style:
CODEBLOCK0
Transformation Rules:
| Written Style | Audio Style | Example |
|---|---|---|
| "Furthermore" | "Plus" | Less formal transitions |
| "et al." | "and their colleagues" | Expand abbreviations |
| Numbers in text | Spoken numbers | "15%" → "15 percent" |
| Long sentences | 15-20 word max | Break into digestible chunks |
| Passive voice | Active voice | "was observed" → "we saw" |
| Citations | Omit or footnote | "(Smith et al., 2024)" → [reference tone] |
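A few of these rules can be approximated with plain regular expressions. The sketch below is illustrative only: `to_spoken` and `TRANSITIONS` are hypothetical names, not part of the tool, and the citation pattern assumes author-year citations in parentheses.

```python
import re

# Hypothetical rule table built from the rows above; not the tool's actual rules.
TRANSITIONS = {"Furthermore": "Plus"}


def to_spoken(text: str) -> str:
    """Apply a handful of written-to-audio rules with regexes."""
    # Drop parenthetical author-year citations such as "(Smith et al., 2024)".
    text = re.sub(r"\s*\([A-Z][^()]*\b\d{4}[a-z]?\)", "", text)
    # Expand "et al." into its spoken form.
    text = re.sub(r"\bet al\.", "and their colleagues", text)
    # Read percentages aloud: "15%" becomes "15 percent".
    text = re.sub(r"(\d+(?:\.\d+)?)\s*%", r"\1 percent", text)
    # Swap formal transitions for conversational ones.
    for formal, casual in TRANSITIONS.items():
        text = re.sub(rf"\b{re.escape(formal)}\b", casual, text)
    return text


sample = "Furthermore, mortality fell by 15% (Smith et al., 2024)."
print(to_spoken(sample))  # Plus, mortality fell by 15 percent.
```

Sentence splitting and passive-to-active rewriting need real language processing and are deliberately left out of this sketch.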
2. Pronunciation Guide Generation
Create phonetic spelling for medical terms:
CODEBLOCK1
Guide Elements:
- Phonetic Spelling: IPA or simplified phonetics
- Syllable Breaks: hy-per-ten-sion
- Emphasis Marking: Primary stress (CAPS), secondary stress
- Alternative Pronunciations: Regional variations (UK vs US)
- Sound-Alikes: "rhymes with..." for difficult terms
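As a rough illustration of syllable breaks combined with CAPS for primary stress, the sketch below formats a guide entry from hand-entered data. `Term` and `format_guide` are hypothetical names, and the syllable and stress data are supplied by the caller rather than derived automatically.

```python
from dataclasses import dataclass


@dataclass
class Term:
    word: str
    syllables: list[str]  # simplified respelling, one entry per syllable
    stress: int           # index of the syllable carrying primary stress


def format_guide(term: Term) -> str:
    """Join syllables with hyphens and upper-case the stressed one."""
    parts = [
        s.upper() if i == term.stress else s.lower()
        for i, s in enumerate(term.syllables)
    ]
    return f"{term.word}: {'-'.join(parts)}"


# Hand-entered example data (an assumption of this sketch).
terms = [
    Term("Metformin", ["met", "for", "min"], 1),
    Term("Hypertension", ["hy", "per", "ten", "sion"], 2),
]
for term in terms:
    print(format_guide(term))
# Metformin: met-FOR-min
# Hypertension: hy-per-TEN-sion
```

Secondary stress and regional variants would need extra fields; they are omitted here to keep the example short.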
3. Timing and Pacing
Calculate speaking duration and mark pacing cues:
CODEBLOCK2
Speaking Rates:
| Style | WPM | Use Case |
|---|---|---|
| Slow/Educational | 120-130 | Patient education, complex topics |
| Conversational | 140-160 | Podcasts, general audience |
| Fast/News | 170-190 | Time-constrained content |
| Variable | Varies | Dynamic pacing with pauses |
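The duration arithmetic behind these rates is simply word count divided by words per minute. A minimal sketch follows, assuming the midpoint of each range and assuming narrator cues are written in square brackets; the function names are hypothetical and pause time is not counted.

```python
import re

# Midpoints of the WPM ranges in the table above; choosing the midpoint is
# an assumption of this sketch, not a documented default of the tool.
WPM = {"slow": 125, "conversational": 150, "fast": 180}


def count_spoken_words(script: str) -> int:
    """Count words the narrator says, skipping [BRACKETED] cues."""
    without_cues = re.sub(r"\[[^\]]*\]", " ", script)
    return len(without_cues.split())


def estimate_minutes(script: str, style: str = "conversational") -> float:
    """Estimated speaking time in minutes at the chosen rate."""
    return count_spoken_words(script) / WPM[style]


def word_budget(minutes: float, style: str = "conversational") -> int:
    """Roughly how many words fit into a target duration."""
    return int(minutes * WPM[style])


draft = "So what exactly happens in diabetes? [PAUSE 2s] Well, it all starts here."
print(count_spoken_words(draft))  # 11
print(word_budget(15))            # 2250
```

Under these assumptions a 15-minute conversational episode has room for about 2,250 spoken words.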
Pacing Cues:
CODEBLOCK3
4. Multi-Format Templates
Generate scripts for different audio formats:
CODEBLOCK4
Format Types:
| Format | Characteristics | Best For |
|---|---|---|
| Podcast | Conversational, segments, ads | Long-form content, interviews |
| Video | Visual cues, B-roll notes | YouTube, educational platforms |
| Lecture | Structured, Q&A breaks | Online courses, training |
| Audiobook | Chapter markers, consistent tone | Textbooks, memoirs |
| News | Tight, factual, quick | Research briefs, updates |
Common Patterns
Pattern 1: Research Paper to Podcast
Scenario: Convert published study to 15-minute podcast episode.
CODEBLOCK5
Structure:
CODEBLOCK6
Pattern 2: Medical Lecture Recording
Scenario: Convert lecture notes to video script for online course.
CODEBLOCK7
Lecture Elements:
- Learning objectives at start
- Periodic comprehension checks
- Break reminders
- Transition phrases between topics
- Summary and key takeaways
Pattern 3: Patient Education Audio
Scenario: Create audio guide for diabetes management.
CODEBLOCK8
Patient Script Features:
- Simple language (avoid medical jargon)
- Empathetic tone
- Clear action steps
- Reassuring statements
- Repetition of key points
Pattern 4: Conference Presentation to Video
Scenario: Adapt live presentation to YouTube video format.
CODEBLOCK9
YouTube Optimization:
- Hook in the first 30 seconds
- Engagement questions for comments
- Call to action (subscribe, like)
- Timestamp markers for chapters
- B-roll suggestions for visual interest
Complete Workflow Example
From research paper to published podcast:
CODEBLOCK10
Python API:
CODEBLOCK11
Quality Checklist
Content Quality:
- [ ] Written content accurate and current
- [ ] Sources cited (even if not spoken)
- [ ] Medical facts verified by expert
- [ ] Appropriate for target audience level
- [ ] No confidential patient information
Audio Optimization:
- [ ] Sentences 15-20 words maximum
- [ ] Abbreviations expanded on first use
- [ ] Complex terms have pronunciation guides
- [ ] Active voice preferred over passive
- [ ] Transitions smooth and conversational
Production Quality:
- [ ] Timing realistic for content density
- [ ] Pacing cues appropriate for subject
- [ ] Music/sound cues marked clearly
- [ ] Pronunciation guide comprehensive
- [ ] Script formatted for easy reading
Before Recording:
- [ ] CRITICAL: Script read aloud for flow
- [ ] Difficult pronunciations practiced
- [ ] Timing tested with stopwatch
- [ ] Technical terms confirmed with subject expert
- [ ] Copyright cleared for any quoted material
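The sentence-length item in the Audio Optimization checklist lends itself to an automated check. The sketch below is one possible approach, not part of the tool: `long_sentences` is a hypothetical helper, and its naive split on end punctuation will be confused by abbreviations such as "Dr." or "e.g.".

```python
import re

MAX_WORDS = 20  # upper end of the 15-20 word guideline


def long_sentences(script: str, max_words: int = MAX_WORDS) -> list[str]:
    """Return the sentences that exceed the word limit."""
    # Naive split: end punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    return [s for s in sentences if len(s.split()) > max_words]


script = "Short one. " + " ".join(["word"] * 25) + "."
for sentence in long_sentences(script):
    print(f"{len(sentence.split())} words: {sentence[:30]}...")
```

Flagged sentences still need a human rewrite; the check only points at candidates.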
Common Pitfalls
Content Issues:
- ❌ Too dense → Information overload for listeners
  - ✅ Break complex topics into multiple episodes
- ❌ Visual dependencies → "As shown in Figure 3..."
  - ✅ Describe visuals or omit visual-dependent content
- ❌ Citation overload → Every sentence has a reference
  - ✅ Save citations for show notes, not narration
Audio Issues:
- ❌ Written-style language → "Furthermore, the aforementioned..."
  - ✅ Conversational: "Plus, this thing we talked about..."
- ❌ No pauses → Relentless information delivery
  - ✅ Build in breathing room; let points sink in
- ❌ Ignoring pronunciation → Mispronounced medical terms
  - ✅ Research and practice all technical terms
Production Issues:
- ❌ Underestimating time → 10 minutes of script takes 12+ to record
  - ✅ Add 20% buffer for retakes and natural pacing
- ❌ Complex sentence structures → Tongue twisters for narrator
  - ✅ Short sentences; avoid nested clauses
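The 20% buffer is a one-line calculation. The helper below is a hypothetical illustration of that arithmetic, with the buffer exposed as a parameter for narrators who need more retakes.

```python
def recording_minutes(script_minutes: float, buffer: float = 0.20) -> float:
    """Scripted duration plus a proportional buffer for retakes and pacing."""
    return round(script_minutes * (1 + buffer), 1)


print(recording_minutes(10))  # 12.0
print(recording_minutes(45))  # 54.0
```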
References
Available in references/ directory:
- audio_writing_best_practices.md - Broadcast writing guidelines
- INLINECODE6 - Common terms phonetics
- INLINECODE7 - Industry format standards
- INLINECODE8 - Inclusive audio content
- INLINECODE9 - YouTube, Spotify, Apple specs
- INLINECODE10 - Narrator health and performance
Scripts
Located in scripts/ directory:
- main.py - CLI interface for script conversion
- INLINECODE13 - Core text-to-audio transformation
- INLINECODE14 - Medical terminology phonetics
- INLINECODE15 - Duration calculation and pacing
- INLINECODE16 - Podcast, video, lecture templates
- INLINECODE17 - Narrator cues and direction
- INLINECODE18 - Alternative format generation
Limitations
- Voice Performance: Script is text only; actual delivery varies by narrator
- Accent Variations: Pronunciation guides may not match all dialects
- Cultural Context: Humor and references may not translate across cultures
- Copyright: Cannot use copyrighted material without permission
- Technical Accuracy: Does not verify medical content (input-dependent)
- Live Elements: Cannot script unscripted interviews or Q&A
Parameters
| Parameter | Type | Default | Required | Description |
|---|---|---|---|---|
| INLINECODE19, INLINECODE20 | string | - | No | Input text file path |
| INLINECODE21, INLINECODE22 | string | - | No | Output JSON file path (default: stdout) |
| --text | string | - | No | Direct text input (alternative to --input) |
| --duration, -d | int | 5 | No | Target duration in minutes |
| --pace, -p | string | normal | No | Speaking pace (slow, normal, fast) |
| --style, -s | string | conversational | No | Script style (conversational, formal, educational) |
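For readers who want to see how such a parameter table maps onto a command-line parser, here is a minimal `argparse` sketch. It is not the actual contents of main.py: the flag names `--input` and `--output` are assumed from the descriptions (the first two rows of the table are elided), and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the parameter table (illustrative only)."""
    parser = argparse.ArgumentParser(description="Convert text into an audio script")
    # Assumed flag names; the table's first two rows are elided in the source.
    parser.add_argument("--input", help="Input text file path")
    parser.add_argument("--output", help="Output JSON file path (default: stdout)")
    parser.add_argument("--text", help="Direct text input (alternative to --input)")
    parser.add_argument("--duration", "-d", type=int, default=5,
                        help="Target duration in minutes")
    parser.add_argument("--pace", "-p", default="normal",
                        choices=["slow", "normal", "fast"],
                        help="Speaking pace")
    parser.add_argument("--style", "-s", default="conversational",
                        choices=["conversational", "formal", "educational"],
                        help="Script style")
    return parser


args = build_parser().parse_args(["--text", "Hello", "-d", "10"])
print(args.duration, args.pace, args.style)  # 10 normal conversational
```

Using `choices` makes the parser reject an unsupported pace or style before any conversion work starts.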
Usage
Basic Usage
CODEBLOCK12
Risk Assessment
| Risk Indicator | Assessment | Level |
|---|---|---|
| Code Execution | Python script executed locally | Low |
| Network Access | No external API calls | Low |
| File System Access | Read input files, write output files | Low |
| Instruction Tampering | Standard prompt guidelines | Low |
| Data Exposure | Output saved only to specified location | Low |
Security Checklist
- [x] No hardcoded credentials or API keys
- [x] No unauthorized file system access
- [x] Output does not expose sensitive information
- [x] Prompt injection protections in place
- [x] Input validation for file paths
- [x] Output directory restricted to workspace
- [x] Script execution in sandboxed environment
Prerequisites
CODEBLOCK13
Evaluation Criteria
Success Metrics
- [x] Successfully converts text to audio-optimized script
- [x] Expands abbreviations and converts numbers to words
- [x] Calculates estimated duration based on word count
- [x] Applies style-specific formatting
- [x] Provides pronunciation notes for medical terms
Test Cases
1. Basic Conversion: Convert text file → Returns audio script with metadata
2. Abbreviation Handling: Text with "e.g., i.e., etc." → All expanded in output
3. Number Conversion: Input with "1 in 4" → Output with "one in four"
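The abbreviation and number cases can be expressed as small executable checks. The helpers below are stand-ins written for this illustration, not the tool's own functions; the spoken expansions chosen for each abbreviation and the single-digit limit are assumptions.

```python
import re

# Assumed spoken forms; the tool's real dictionary may differ.
ABBREVIATIONS = {"e.g.": "for example", "i.e.": "that is", "etc.": "and so on"}
ONES = ["zero", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine"]


def expand_abbreviations(text: str) -> str:
    """Replace each known abbreviation with its spoken form."""
    for abbr, spoken in ABBREVIATIONS.items():
        text = text.replace(abbr, spoken)
    return text


def spell_ratio(text: str) -> str:
    """Spell out single-digit ratios such as "1 in 4"."""
    return re.sub(
        r"\b(\d) in (\d)\b",
        lambda m: f"{ONES[int(m[1])]} in {ONES[int(m[2])]}",
        text,
    )


assert expand_abbreviations("Use statins, e.g., atorvastatin.") == \
    "Use statins, for example, atorvastatin."
assert spell_ratio("About 1 in 4 adults") == "About one in four adults"
print("all checks passed")
```

Multi-digit numbers are left untouched by `spell_ratio`; handling them would need a fuller number-to-words routine.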
Lifecycle Status
- Current Stage: Draft
- Next Review Date: 2026-03-06
- Known Issues: None
- Planned Improvements:
  - Add support for custom abbreviation dictionaries
  - Integrate with text-to-speech engines
  - Add multilingual support
🎙️ Pro Tip: The best audio scripts sound natural when spoken. Always read your script aloud before finalizing—if you stumble over a sentence, your narrator will too. Revise for the ear, not the eye.
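Audio Script Writer
Overview
A content transformation tool that converts written medical and scientific materials into professionally structured audio scripts suitable for podcasts, educational videos, audiobooks, and voiceover narration.
Key Capabilities:
- Format Conversion: Research papers → podcast scripts
- Spoken Word Optimization: Sentence restructuring for listening
- Pronunciation Guides: Medical terminology phonetic spelling
- Timing Estimation: Duration calculations for production planning
- Multi-Format Output: Podcast, video, lecture, audiobook templates
- Voice Direction: Tone, pace, and emphasis cues for narrators
When to Use
✅ Use this skill when:
- Creating medical education podcasts from journal articles
- Converting conference presentations to video scripts
- Developing audiobook versions of medical textbooks
- Scripting patient education audio materials
- Producing research summary videos for social media
- Adapting written case reports for audio case studies
- Creating voiceover scripts for e-learning modules
❌ Do NOT use when:
- Live presentation without a script → Use improvisation
- Highly visual content (surgery videos) → Use visual-focused tools
- Interactive audio (Q&A format) → Use dialogue scripting tools
- Music or sound design planning → Use audio production software
- Voice recording itself → This creates scripts, not audio
Integration:
- Upstream: abstract-summarizer (content condensation), lay-summary-gen (patient-friendly language)
- Downstream: medical-translation (multi-language scripts), voice-cloning-tool (AI narration)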
Core Capabilities
1. Spoken Word Transformation
Convert written text to conversational audio style:
```python
from scripts.audio_writer import AudioScriptWriter

writer = AudioScriptWriter()

# Convert written content (research_paper holds the source text)
script = writer.convert_to_audio(
    source_text=research_paper,
    format="podcast",  # podcast, video, lecture, audiobook
    target_audience="medical_students",
    duration_minutes=15
)

print(script.spoken_text)
# Converts: "The pathophysiology of diabetes mellitus involves..."
# Into: "So what exactly happens in diabetes? Well, it all starts when..."
```
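Transformation Rules:
| Written Style | Audio Style | Example |
|---|---|---|
| "Furthermore" | "Plus" | Less formal transitions |
| "et al." | "and their colleagues" | Expand abbreviations |
| Numbers in text | Spoken numbers | "15%" → "15 percent" |
| Long sentences | 15-20 word max | Break into digestible chunks |
| Passive voice | Active voice | "was observed" → "we saw" |
| Citations | Omit or footnote | "(Smith et al., 2024)" → [reference tone] |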
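2. Pronunciation Guide Generation
Create phonetic spelling for medical terms:
```python
# Generate a pronunciation guide
pronunciation = writer.create_pronunciation_guide(
    text=script,
    include_phonetic=True,
    include_syllables=True
)

# Output:
# Hyperlipidemia: hi-per-lip-i-DEE-mee-uh
# Metformin: met-FOR-min
# Atherosclerosis: ath-er-oh-skleh-ROH-sis
```
Guide Elements:
- Phonetic Spelling: IPA or simplified phonetics
- Syllable Breaks: hy-per-ten-sion
- Emphasis Marking: Primary stress (CAPS), secondary stress
- Alternative Pronunciations: Regional variations (UK vs US)
- Sound-Alikes: "rhymes with..." for difficult terms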
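3. Timing and Pacing
Calculate speaking duration and mark pacing cues:
```python
# Analyze timing
timing = writer.calculate_timing(
    script=script,
    speaking_rate="conversational",  # slow, conversational, fast
    include_pauses=True
)

print(f"Estimated duration: {timing.duration_minutes} minutes")
print(f"Word count: {timing.word_count}")
print(f"Speaking rate: {timing.words_per_minute} WPM")
```
Speaking Rates:
| Style | WPM | Use Case |
|---|---|---|
| Slow/Educational | 120-130 | Patient education, complex topics |
| Conversational | 140-160 | Podcasts, general audience |
| Fast/News | 170-190 | Time-constrained content |
| Variable | Varies | Dynamic pacing with pauses |
Pacing Cues:
```
[BREATH] - Brief pause for the narrator
[PAUSE 2s] - Two-second pause for emphasis
[SLOW] - Slow down for a key point
[FAST] - Increase energy/excitement
[BEAT] - Dramatic pause
```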
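4. Multi-Format Templates
Generate scripts for different audio formats:
```python
# Podcast episode
podcast = writer.create_podcast_script(
    content=article,
    episode_format="interview",  # solo, interview, panel
    include_intro_music=True,
    ad_breaks=[5, 12]  # minutes
)

# Educational video
video = writer.create_video_script(
    content=lecture_slides,
    visual_cues=True,  # mark where the visuals change
    b_roll_notes=True  # suggest supplementary footage
)
```
Format Types:
| Format | Characteristics | Best For |
|---|---|---|
| Podcast | Conversational, segments, ads | Long-form content, interviews |
| Video | Visual cues, B-roll notes | YouTube, educational platforms |
| Lecture | Structured, Q&A breaks | Online courses, training |
| Audiobook | Chapter markers, consistent tone | Textbooks, memoirs |
| News | Tight, factual, quick | Research briefs, updates |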
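Common Patterns
Pattern 1: Research Paper to Podcast
Scenario: Convert published study to 15-minute podcast episode.
```bash
# Convert the paper to a podcast script
python scripts/main.py \
  --input paper.pdf \
  --format podcast \
  --duration 15 \
  --style conversational \
  --include-intro-outro \
  --output podcast_script.txt

# Generate a pronunciation guide
python scripts/main.py \
  --input podcast_script.txt \
  --generate-pronunciation \
  --output pronunciation_guide.txt
```
Structure:
```
[INTRO MUSIC - 5 seconds]
HOST: Welcome to "Medical Research Today." I'm your host...
[BREATH]
HOST: Today we're diving into a fascinating study on...
[PAUSE]
HOST: So what did the researchers find? Well...
[BREATH]
HOST: Dr. Smith, one of the study's authors, explains...
[SOUND BITE: Interview clip]
...
[OUTRO MUSIC]
```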
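Pattern 2: Medical Lecture Recording
Scenario: Convert lecture notes to video script for online course.
```python
# Create a lecture script
lecture = writer.create_lecture_script(
    notes=lecture_content,
    duration=45,  # minutes
    break_intervals=[15, 30],  # student breaks (minutes)
    interaction_points=True  # "Pause and think..." prompts
)

# Add visual cues
script = writer.add_visual_cues(
    script=lecture,
    slide_transitions=True,
    animation_notes=True
)
```
Lecture Elements:
- Learning objectives at start
- Periodic comprehension checks
- Break reminders
- Transition phrases between topics
- Summary and key takeaways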
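Pattern 3: Patient Education Audio
Scenario: Create audio guide for diabetes management.
```python
# Patient-friendly script
patient_script = writer.create_patient_script(
    medical_content=diabetes_guide,
    reading_level=6,  # sixth-grade level
    empathetic_tone=True,
    key_points_highlighted=True
)

# Slow, clear pacing
patient_script.adjust_pacing(
    wpm=130,
    pause_after_sentences=1.5  # seconds
)
```
Patient Script Features:
- Simple language (avoid medical jargon)
- Empathetic tone
- Clear action steps
- Reassuring statements
- Repetition of key points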
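Pattern 4: Conference Presentation to Video
Scenario: Adapt live presentation to YouTube video format.
```bash
# Convert the presentation transcript
python scripts/main.py \
  --input presentation_transcript.txt \
  --format video \
  --platform youtube \
  --include-hooks true \
  --engagement-cues true \
  --output youtube_script.txt
```
YouTube Optimization:
- Hook in the first 30 seconds
- Engagement questions for comments
- Call to action (subscribe, like)
- Timestamp markers for chapters
- B-roll suggestions for visual interest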
Complete Workflow Example
From research paper to