Audio Script Writer

Overview

Content transformation tool that converts written medical and scientific materials into professionally structured audio scripts suitable for podcasts, educational videos, audiobooks, and voiceover narration.

Key Capabilities:

- Format Conversion: Research papers → podcast scripts
Spoken Word Optimization: Sentence restructuring for listening
Pronunciation Guides: Medical terminology phonetic spelling
Timing Estimation: Duration calculations for production planning
Multi-Format Output: Podcast, video, lecture, audiobook templates
Voice Direction: Tone, pace, and emphasis cues for narrators

When to Use

✅ Use this skill when:

- Creating medical education podcasts from journal articles
Converting conference presentations to video scripts
Developing audiobook versions of medical textbooks
Scripting patient education audio materials
Producing research summary videos for social media
Adapting written case reports for audio case studies
Creating voiceover scripts for e-learning modules

❌ Do NOT use when:

- Live presentation without script → Use improvisation
Highly visual content (surgery videos) → Use visual-focused tools
Interactive audio (Q&A format) → Use dialogue scripting tools
Music or sound design planning → Use audio production software
Voice recording itself → This creates scripts, not audio

Integration:

- Upstream: abstract-summarizer (content condensation), lay-summary-gen (patient-friendly language)
Downstream: medical-translation (multi-language scripts), voice-cloning-tool (AI narration)

Core Capabilities

1. Spoken Word Transformation

Convert written text to conversational audio style:

CODEBLOCK0

Transformation Rules:

Written Style	Audio Style	Example
"Furthermore"	"Plus"	Less formal transitions
" et al."

2. Pronunciation Guide Generation

Create phonetic spelling for medical terms:

CODEBLOCK1

Guide Elements:

- Phonetic Spelling: IPA or simplified phonetics
Syllable Breaks: hy-per-ten-sion
Emphasis Marking: Primary stress (CAPS), secondary stress
Alternative Pronunciations: Regional variations (UK vs US)
Sound-Alikes: "rhymes with..." for difficult terms

3. Timing and Pacing

Calculate speaking duration and mark pacing cues:

CODEBLOCK2

Speaking Rates:

Style	WPM	Use Case
Slow/Educational	120-130	Patient education, complex topics
Conversational

Pacing Cues:
CODEBLOCK3

4. Multi-Format Templates

Generate scripts for different audio formats:

CODEBLOCK4

Format Types:

Format	Characteristics	Best For
Podcast	Conversational, segments, ads	Long-form content, interviews
Video

Common Patterns

Pattern 1: Research Paper to Podcast

Scenario: Convert published study to 15-minute podcast episode.

CODEBLOCK5

Structure:
CODEBLOCK6

Pattern 2: Medical Lecture Recording

Scenario: Convert lecture notes to video script for online course.

CODEBLOCK7

Lecture Elements:

- Learning objectives at start
Periodic comprehension checks
Break reminders
Transition phrases between topics
Summary and key takeaways

Pattern 3: Patient Education Audio

Scenario: Create audio guide for diabetes management.

CODEBLOCK8

Patient Script Features:

- Simple language (avoid medical jargon)
Empathetic tone
Clear action steps
Reassuring statements
Repetition of key points

Pattern 4: Conference Presentation to Video

Scenario: Adapt live presentation to YouTube video format.

CODEBLOCK9

YouTube Optimization:

- Hook in first 30 seconds
Engagement questions for comments
Call to action (subscribe, like)
Timestamp markers for chapters
B-roll suggestions for visual interest

Complete Workflow Example

From research paper to published podcast:

CODEBLOCK10

Python API:

CODEBLOCK11

Quality Checklist

Content Quality:

- [ ] Written content accurate and current
[ ] Sources cited (even if not spoken)
[ ] Medical facts verified by expert
[ ] Appropriate for target audience level
[ ] No confidential patient information

Audio Optimization:

- [ ] Sentences 15-20 words maximum
[ ] Abbreviations expanded on first use
[ ] Complex terms have pronunciation guides
[ ] Active voice preferred over passive
[ ] Transitions smooth and conversational

Production Quality:

- [ ] Timing realistic for content density
[ ] Pacing cues appropriate for subject
[ ] Music/sound cues marked clearly
[ ] Pronunciation guide comprehensive
[ ] Script formatted for easy reading

Before Recording:

- [ ] CRITICAL: Script read aloud for flow
[ ] Difficult pronunciations practiced
[ ] Timing tested with stopwatch
[ ] Technical terms confirmed with subject expert
[ ] Copyright cleared for any quoted material

Common Pitfalls

Content Issues:

- ❌ Too dense → Information overload for listeners

- ✅ Break complex topics into multiple episodes

- ❌ Visual dependencies → "As shown in Figure 3..."

- ✅ Describe visuals or omit visual-dependent content

- ❌ Citation overload → Every sentence has reference

- ✅ Save citations for show notes, not narration

Audio Issues:

- ❌ Written-style language → "Furthermore, the aforementioned..."

- ✅ Conversational: "Plus, this thing we talked about..."

- ❌ No pauses → Relentless information delivery

- ✅ Build in breathing room; let points sink in

- ❌ Ignoring pronunciation → Mispronounced medical terms

- ✅ Research and practice all technical terms

Production Issues:

- ❌ Underestimating time → 10 minutes of script takes 12+ to record

- ✅ Add 20% buffer for retakes and natural pacing

- ❌ Complex sentence structures → Tongue twisters for narrator

- ✅ Short sentences; avoid nested clauses

References

Available in references/ directory:

- audio_writing_best_practices.md - Broadcast writing guidelines
INLINECODE6 - Common terms phonetics
INLINECODE7 - Industry format standards
INLINECODE8 - Inclusive audio content
INLINECODE9 - YouTube, Spotify, Apple specs
INLINECODE10 - Narrator health and performance

Scripts

Located in scripts/ directory:

- main.py - CLI interface for script conversion
INLINECODE13 - Core text-to-audio transformation
INLINECODE14 - Medical terminology phonetics
INLINECODE15 - Duration calculation and pacing
INLINECODE16 - Podcast, video, lecture templates
INLINECODE17 - Narrator cues and direction
INLINECODE18 - Alternative format generation

Limitations

- Voice Performance: Script is text only; actual delivery varies by narrator
Accent Variations: Pronunciation guides may not match all dialects
Cultural Context: Humor and references may not translate across cultures
Copyright: Cannot use copyrighted material without permission
Technical Accuracy: Does not verify medical content (input-dependent)
Live Elements: Cannot script unscripted interviews or Q&A

Parameters

Parameter	Type	Default	Required	Description
INLINECODE19, INLINECODE20	string	-	No	Input text file path
INLINECODE21, INLINECODE22

Usage

Basic Usage

CODEBLOCK12

Risk Assessment

Risk Indicator	Assessment	Level
Code Execution	Python script executed locally	Low
Network Access

Security Checklist

- [x] No hardcoded credentials or API keys
[x] No unauthorized file system access
[x] Output does not expose sensitive information
[x] Prompt injection protections in place
[x] Input validation for file paths
[x] Output directory restricted to workspace
[x] Script execution in sandboxed environment

Prerequisites

CODEBLOCK13

Evaluation Criteria

Success Metrics

- [x] Successfully converts text to audio-optimized script
[x] Expands abbreviations and converts numbers to words
[x] Calculates estimated duration based on word count
[x] Applies style-specific formatting
[x] Provides pronunciation notes for medical terms

Test Cases

1. Basic Conversion: Convert text file → Returns audio script with metadata
Abbreviation Handling: Text with "e.g., i.e., etc." → All expanded in output
Number Conversion: Input with "1 in 4" → Output with "one in four"

Lifecycle Status

- Current Stage: Draft
Next Review Date: 2026-03-06
Known Issues: None
Planned Improvements:

- Add support for custom abbreviation dictionaries - Integrate with text-to-speech engines - Add multilingual support

🎙️ Pro Tip: The best audio scripts sound natural when spoken. Always read your script aloud before finalizing—if you stumble over a sentence, your narrator will too. Revise for the ear, not the eye.

音频脚本撰写器

概述

内容转换工具，可将书面医学和科学材料转化为专业结构的音频脚本，适用于播客、教育视频、有声读物和配音旁白。

核心能力：

- 格式转换：研究论文 → 播客脚本
口语优化：为听力重构句子结构
发音指南：医学术语音标拼写
时长估算：为制作规划计算持续时间
多格式输出：播客、视频、讲座、有声读物模板
语音指导：为叙述者提供语调、节奏和强调提示

使用时机

✅ 使用此技能的场景：

- 根据期刊文章制作医学教育播客
将会议演示转换为视频脚本
开发医学教科书的有声读物版本
编写患者教育音频材料脚本
为社交媒体制作研究摘要视频
将书面病例报告改编为音频案例研究
为电子学习模块创建配音脚本

❌ 请勿使用此技能的场景：

- 无脚本的现场演示 → 使用即兴发挥
高度视觉化内容（手术视频） → 使用视觉聚焦工具
互动音频（问答形式） → 使用对话脚本工具
音乐或音效设计规划 → 使用音频制作软件
语音录制本身 → 此工具创建脚本，而非音频

集成：

- 上游：摘要生成器（内容浓缩）、通俗摘要生成器（患者友好语言）
下游：医学翻译（多语言脚本）、语音克隆工具（AI旁白）

核心能力

1. 口语转换

将书面文本转换为对话式音频风格：

python
from scripts.audio_writer import AudioScriptWriter

writer = AudioScriptWriter()

转换书面内容

script = writer.converttoaudio( sourcetext=researchpaper, format=podcast, # podcast, video, lecture, audiobook targetaudience=medicalstudents, duration_minutes=15 )

print(script.spoken_text)

转换：The pathophysiology of diabetes mellitus involves...

为：So what exactly happens in diabetes? Well, it all starts when...

转换规则：

书面风格	音频风格	示例
Furthermore	Plus	减少正式过渡词
et al.

2. 发音指南生成

为医学术语创建音标拼写：

python

生成发音指南

pronunciation = writer.createpronunciationguide(
text=script,
include_phonetic=True,
include_syllables=True
)

输出：

Hyperlipidemia: hi-per-lip-i-DEE-mee-uh

Metformin: met-FOR-min

Atherosclerosis: ath-er-oh-skleh-ROH-sis

指南元素：

- 音标拼写：国际音标或简化音标
音节划分：hy-per-ten-sion
重音标记：主重音（大写）、次重音
替代发音：地区变体（英式 vs 美式）
谐音提示：难词与...押韵

3. 时长与节奏

计算说话时长并标记节奏提示：

python

分析时长

timing = writer.calculate_timing(
script=script,
speaking_rate=conversational, # slow, conversational, fast
include_pauses=True
)

print(f预计时长：{timing.duration_minutes} 分钟)
print(f字数：{timing.word_count})
print(f语速：{timing.wordsperminute} 词/分钟)

语速：

风格	词/分钟	使用场景
慢速/教育型	120-130	患者教育、复杂主题
对话型

140-160 | 播客、普通受众 |
| 快速/新闻型 | 170-190 | 时间受限内容 |
| 可变型 | 变化 | 带停顿的动态节奏 |

节奏提示：

[呼吸] - 叙述者短暂停顿
[停顿2秒] - 两秒停顿以示强调
[放慢] - 关键点降低语速
[加快] - 增加能量/兴奋度
[节拍] - 戏剧性停顿

4. 多格式模板

为不同音频格式生成脚本：

python

播客剧集

podcast = writer.createpodcastscript(
content=article,
episode_format=interview, # solo, interview, panel
includeintromusic=True,
ad_breaks=[5, 12] # 分钟
)

教育视频

video = writer.createvideoscript( content=lecture_slides, visual_cues=True, # 标记视觉变化位置 brollnotes=True # 建议补充素材 )

格式类型：

格式	特点	最适合
播客	对话式、分段、广告	长内容、访谈
视频

常见模式

模式1：研究论文转播客

场景：将已发表研究转换为15分钟播客剧集。

bash

将论文转换为播客脚本

python scripts/main.py \
--input paper.pdf \
--format podcast \
--duration 15 \
--style conversational \
--include-intro-outro \
--output podcast_script.txt

生成发音指南

python scripts/main.py \ --input podcast_script.txt \ --generate-pronunciation \ --output pronunciation_guide.txt

结构：

[开场音乐5秒]

主持人：欢迎收听《今日医学研究》。我是主持人...

[呼吸]

主持人：今天我们将深入探讨一项关于...的精彩研究

[停顿]

主持人：那么研究人员发现了什么？嗯...

[呼吸]

主持人：研究作者之一史密斯博士解释说...

[录音片段：采访剪辑]

...

[结束音乐]

模式2：医学讲座录制

场景：将讲座笔记转换为在线课程视频脚本。

python

创建讲座脚本

lecture = writer.createlecturescript(
notes=lecture_content,
duration=45, # 分钟
break_intervals=[15, 30], # 学生休息时间（分钟）
interaction_points=True # 暂停思考...提示
)

添加视觉提示

script = writer.addvisualcues( script=lecture, slide_transitions=True, animation_notes=True )

讲座元素：

- 开头列出学习目标
定期理解检查
休息提醒
主题间过渡短语
总结和关键要点

模式3：患者教育音频

场景：创建糖尿病管理音频指南。

python

患者友好脚本

patientscript = writer.createpatient_script(
medicalcontent=diabetesguide,
reading_level=6, # 六年级水平
empathetic_tone=True,
keypointshighlighted=True
)

慢速、清晰节奏

patientscript.adjustpacing( wpm=130, pauseaftersentences=1.5 # 秒 )

患者脚本特点：

- 简单语言（避免医学术语）
共情语调
清晰行动步骤
安抚性陈述
关键点重复

模式4：会议演示转视频

场景：将现场演示改编为YouTube视频格式。

bash

转换演示脚本

python scripts/main.py \
--input presentation_transcript.txt \
--format video \
--platform youtube \
--include-hooks true \
--engagement-cues true \
--output youtube_script.txt

YouTube优化：

- 前30秒设置钩子
评论区的互动问题
行动号召（订阅、点赞）
章节时间戳标记
视觉兴趣的补充素材建议

完整工作流程示例

从研究论文到

audio-script-writer音频脚本撰写