iOS Keyboard Extension Limitations
When building iOS custom keyboards with voice/audio features, these are the hard limitations discovered through the PolyVoice project.
🔴 Hard Limitations (Cannot be worked around)
1. Microphone Access — DISALLOWED
Keyboard extensions cannot access the microphone.
- `AVAudioRecorder` will fail with a permission error
- `SFSpeechRecognizer` is unavailable
- No Siri integration from keyboard context
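Why: Apple's security model. Keyboards run in a restricted sandbox, because a keyboard with microphone access could eavesdrop on audio the same way a keylogger captures keystrokes.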
2. Open Other Apps — BLOCKED
Keyboards cannot programmatically open the main app or any other app.
- `UIApplication.shared.open()` cannot be called (`UIApplication.shared` is unavailable in app extensions)
- URL schemes don't work (`myapp://`)
- `extensionContext?.open()` does not work from a keyboard
Why: Prevents malicious keyboards from launching apps without user consent.
3. Memory Limit — ~50MB
Keyboard extensions have strict memory limits (~30-60MB).
- Extension is terminated silently if the limit is exceeded
- No crash log, just disappears
- Heavy audio processing = instant death
Mitigation:
- Record at 16kHz mono (not 44.1kHz)
- Use 32kbps bitrate max
- Immediate file cleanup after processing
- 60-second max recording hard limit
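To see how these settings relate to a ~50MB ceiling, the arithmetic can be sketched as below. The 16-bit sample depth and the stereo 44.1kHz baseline are assumptions for illustration; the list above only fixes the sample rate, channel count, bitrate, and 60-second cap.

```swift
/// Bytes needed to hold uncompressed PCM audio.
func pcmBytes(sampleRate: Int, channels: Int, bitsPerSample: Int, seconds: Int) -> Int {
    return sampleRate * channels * (bitsPerSample / 8) * seconds
}

/// Bytes produced by a constant-bitrate encoder (bitrate in bits per second).
func encodedBytes(bitrate: Int, seconds: Int) -> Int {
    return bitrate * seconds / 8
}

let maxSeconds = 60  // hard recording cap from the mitigation list

// Assumed 16-bit samples; the stereo 44.1kHz figure is only a comparison baseline.
let lowRatePCM = pcmBytes(sampleRate: 16_000, channels: 1, bitsPerSample: 16, seconds: maxSeconds)
let cdQualityPCM = pcmBytes(sampleRate: 44_100, channels: 2, bitsPerSample: 16, seconds: maxSeconds)
let encoded = encodedBytes(bitrate: 32_000, seconds: maxSeconds)

print(lowRatePCM)    // 1920000  (~1.9 MB)
print(cdQualityPCM)  // 10584000 (~10.6 MB)
print(encoded)       // 240000   (~240 KB)
```

Under these assumptions a full 60-second clip stays well below the limit when encoded, while buffering raw high-rate audio takes a noticeable share of it before any processing starts.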
4. No Persistent Storage
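Standard `UserDefaults` is not usable for state the main app needs to see; only App Group storage is shared.
- Values written to standard `UserDefaults` in the extension are not visible to the main app
- Must use `UserDefaults(suiteName: "group.com.company.app")`
- Requires the App Group capability in both targets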
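A minimal sketch of reading and writing through the shared suite. The group identifier is the placeholder used above, and the `preferredLanguage` key and `sharedDefaults()` helper are hypothetical names chosen for illustration.

```swift
import Foundation

// Placeholder App Group identifier; replace with the group configured on both targets.
let appGroupID = "group.com.company.app"

/// Returns the shared defaults store. The initializer is failable,
/// so callers should be prepared for nil.
func sharedDefaults() -> UserDefaults? {
    return UserDefaults(suiteName: appGroupID)
}

// Write from one target (hypothetical key name)...
sharedDefaults()?.set("en-US", forKey: "preferredLanguage")

// ...and read it back from the other.
let language = sharedDefaults()?.string(forKey: "preferredLanguage")
print(language ?? "not set")
```

Both the keyboard and the main app would call the same helper, so the suite name lives in one place.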
5. Network Requires "Full Access"
API calls fail without user enabling "Allow Full Access" in Settings.
- User must explicitly enable: Settings → General → Keyboard → Keyboards → [Keyboard Name] → Allow Full Access
- Most users won't do this
- The keyboard UI has little room to prompt for it or explain why it is needed
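One way to avoid failing API calls is to check for Full Access before attempting them. The sketch below keeps the decision in a pure function so it can be exercised outside UIKit; `NetworkGate` and `networkGate` are hypothetical names, and in a real keyboard the flag would come from `hasFullAccess` on the `UIInputViewController` subclass (iOS 11+).

```swift
/// Outcome of deciding whether the keyboard may attempt a network call.
enum NetworkGate: Equatable {
    case allowed
    case blocked(message: String)
}

/// Pure decision helper (hypothetical). In a keyboard, pass `self.hasFullAccess`
/// from the UIInputViewController subclass.
func networkGate(hasFullAccess: Bool, keyboardName: String) -> NetworkGate {
    guard hasFullAccess else {
        return .blocked(message: "Enable Allow Full Access for \(keyboardName) in Settings to use network features.")
    }
    return .allowed
}

// Example: Full Access not granted, so skip the API call and show the hint instead.
switch networkGate(hasFullAccess: false, keyboardName: "PolyVoice") {
case .allowed:
    print("OK to call the API")
case .blocked(let message):
    print(message)
}
```

This does not remove the friction, but it turns a silent network failure into a short hint the keyboard can display.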
🟡 Partial Workarounds (User friction)
The "Open App" Workaround
Goal: Let user tap a button to open main app for recording.
Attempt:
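```swift
// This does not work from a keyboard extension
if let url = URL(string: "myapp://record") {
    extensionContext?.open(url, completionHandler: nil)
}
```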
Reality: Opening a URL needs `UIApplication.shared.open()`, which only works outside the extension context, so a keyboard cannot call it.
The Manual Switch Pattern
What actually works (with friction):
1. User taps button in keyboard → Shows alert: "Open PolyVoice to record?"
2. User manually switches to main app (Home button, swipe, etc.)
3. Main app detects active session (via App Groups / shared state)
4. Main app auto-records on appear
5. Auto-stops on silence (2 seconds)
6. Auto-copies to clipboard
7. User manually switches back to target app
8. Keyboard auto-pastes on reappear
User flow:
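```
Keyboard → tap mic → [manual: switch to app] → app auto-records →
[manual: switch back] → keyboard auto-pastes
```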
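The shared-state step, where the main app detects an active session via App Groups, could look roughly like this. The `RecordingRequest` type, the key name, and the two-minute freshness window are all assumptions for illustration, not details from the PolyVoice project.

```swift
import Foundation

/// Request written by the keyboard and read by the main app (hypothetical shape).
struct RecordingRequest: Codable {
    let id: UUID
    let createdAt: Date
}

let requestKey = "pendingRecordingRequest"   // hypothetical key name
let maxAge: TimeInterval = 120               // assumed: ignore requests older than 2 minutes

/// Keyboard side: store a request in the shared defaults.
func postRequest(_ request: RecordingRequest, to defaults: UserDefaults) {
    if let data = try? JSONEncoder().encode(request) {
        defaults.set(data, forKey: requestKey)
    }
}

/// App side: return the pending request if it is still fresh, clearing it either way.
func takePendingRequest(from defaults: UserDefaults, now: Date = Date()) -> RecordingRequest? {
    defer { defaults.removeObject(forKey: requestKey) }
    guard let data = defaults.data(forKey: requestKey),
          let request = try? JSONDecoder().decode(RecordingRequest.self, from: data),
          now.timeIntervalSince(request.createdAt) <= maxAge else {
        return nil
    }
    return request
}

// Both targets would open the same App Group suite (identifier is a placeholder).
// The fallback to .standard only lets this sketch run outside an app.
let defaults = UserDefaults(suiteName: "group.com.company.app") ?? .standard
let request = RecordingRequest(id: UUID(), createdAt: Date())
postRequest(request, to: defaults)
let received = takePendingRequest(from: defaults)
print(received?.id == request.id)
```

Clearing the request on read means a stale tap in the keyboard cannot trigger a recording the next time the app is opened for an unrelated reason.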
Friction points:
- Two manual app switches
- Context switching breaks flow
- Users forget to return
- Clipboard may be overwritten
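The auto-stop step of the pattern (stop after 2 seconds of silence) can be sketched as a small state machine over level readings. The -40 dB threshold and the `SilenceDetector` type are assumptions; on device the readings might come from `AVAudioRecorder` metering (`isMeteringEnabled`, `updateMeters()`, `averagePower(forChannel:)`) in the main app.

```swift
/// Tracks how long the input level has stayed below a threshold.
struct SilenceDetector {
    let thresholdDB: Double      // assumed cutoff in dB
    let requiredSilence: Double  // seconds of continuous silence before stopping
    private var silentSince: Double? = nil

    init(thresholdDB: Double = -40, requiredSilence: Double = 2) {
        self.thresholdDB = thresholdDB
        self.requiredSilence = requiredSilence
    }

    /// Feed one level reading; returns true once silence has lasted long enough.
    mutating func shouldStop(levelDB: Double, at time: Double) -> Bool {
        if levelDB >= thresholdDB {
            silentSince = nil          // speech detected, reset the timer
            return false
        }
        let start = silentSince ?? time
        silentSince = start
        return time - start >= requiredSilence
    }
}

var detector = SilenceDetector()
// Simulated readings every 0.5 s: speech first, then silence starting at 1.0 s.
let readings: [(time: Double, level: Double)] = [
    (0.0, -20), (0.5, -18), (1.0, -55), (1.5, -60), (2.0, -58), (2.5, -57), (3.0, -59)
]
var stopTime: Double? = nil
for reading in readings {
    if detector.shouldStop(levelDB: reading.level, at: reading.time) {
        stopTime = reading.time
        break
    }
}
print(stopTime ?? -1)  // 3.0
```

Any reading at or above the threshold resets the timer, so short pauses between words do not end the recording.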
🟢 Alternative Architectures
Option 1: Share Extension (Better for Audio)
Use Share Sheet instead of keyboard.
- Full app capabilities
- Can record audio
- Can process and return text
Limitation: Not a keyboard — user must open share sheet per text field.
Option 2: Full App Mode
Don't use keyboard extension — use main app only.
- User opens app
- Records dictation
- Copies result
- Switches to target app
- Pastes manually
Benefit: No memory limits, full mic access, reliable.
Cost: More friction than keyboard.
Option 3: Siri Shortcuts Integration
Provide Siri Shortcuts for voice-to-text.
- "Hey Siri, dictate with PolyVoice"
- Returns text to current app
- Fully supported by Apple
Limitation: Not instant, requires Siri setup.
📊 Decision Matrix
| Approach | Mic Access | Memory | User Friction | Apple Approved |
|---|---|---|---|---|
| Keyboard extension | ❌ No | ⚠️ 50MB | Low (if no audio) | ✅ Yes |
| Keyboard + audio workaround | ❌ No | ⚠️ 50MB | 🔴 High | ✅ Yes |
| Share extension | ✅ Yes | ✅ Full | 🟡 Medium | ✅ Yes |
| Full app only | ✅ Yes | ✅ Full | 🟡 Medium | ✅ Yes |
| Siri Shortcuts | ✅ Yes | ✅ Full | 🟡 Medium | ✅ Yes |
🎯 Recommendation
For voice dictation/AI transcription:
1. Don't build a keyboard extension: the limitations make it frustrating
2. Use a Share Extension: Apple-supported, full capabilities
3. Or a full app: simplest to build, most reliable
4. Add Shortcuts: for power users who want speed
For non-audio keyboards (emoji, translation, etc.):
Keyboard extension works great. Just avoid audio features.
📚 References
- Apple's official docs: https://developer.apple.com/documentation/uikit/keyboards_and_input/creating_a_custom_keyboard
- Custom Keyboard Programming Guide (WWDC sessions)
- PolyVoice project learnings (~/Projects/polyvoice-keyboard/)