autowriter — Automated Writing System
autoresearch's core is an agent loop: modify code → run → evaluate → keep/discard → loop.
autowriter maps this paradigm to writing, embedding "de-AI" into the loop itself — not post-processing after writing, but writing, purifying, evaluating, rewriting in every single iteration.
Design Philosophy
Three core principles from autoresearch, mapped to writing:
| autoresearch Principle | Writing Mapping | Mechanism |
|---|---|---|
| Automated loop | write → humanize → evaluate → rewrite loop | Agent Loop |
| Quantified evaluation | 6-dimension scoring function (with "human feel" dimension) | Phase 2 |
| Failure transparency | draft log records every discarded version | Draft Log |
Plus humanizer's core insight: De-AI is not post-processing polish, it's part of writing quality. The evaluation function detects AI patterns directly — rewrite if not passing, rewrite again, until clean.
Agent Loop (Core Flow)
CODEBLOCK0
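A minimal sketch of the loop's control flow, assuming the three phases are passed in as plain callables. The names `run_loop`, `write`, `humanize`, and `evaluate` are hypothetical; the 80-point keep threshold is the one shown in the loop diagram of this document, and the plateau-based early stop described under Phase 3 is left out here.

```python
from typing import Callable, Dict, List, Optional, Tuple

Notes = Dict[str, float]  # per-dimension scores carried into the next round


def run_loop(
    write: Callable[[Optional[str], Optional[Notes]], str],
    humanize: Callable[[str], str],
    evaluate: Callable[[str], Tuple[float, Notes]],
    max_iterations: int,
    threshold: float = 80.0,
) -> Tuple[str, List[float]]:
    """Run write -> humanize -> evaluate until a draft passes or rounds run out."""
    best_draft, best_score = "", float("-inf")
    history: List[float] = []
    draft: Optional[str] = None
    notes: Optional[Notes] = None

    for _ in range(max_iterations):
        # Phase 1 + 1.5: (re)write with last round's notes, then de-AI pass.
        draft = humanize(write(draft, notes))
        # Phase 2: composite score plus per-dimension annotations.
        score, notes = evaluate(draft)
        history.append(score)
        # Only the best draft is held in memory; discarded drafts are dropped.
        if score > best_score:
            best_draft, best_score = draft, score
        # Phase 3: keep the draft once it clears the threshold.
        if score >= threshold:
            break

    return best_draft, history
```

Passing the phases in as arguments keeps the sketch testable with stub functions; a real agent would replace them with model calls.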
--depth Knob
One parameter controls everything. No other knobs exposed.
| depth | Words | Technical Detail | Iterations | Use Case |
|---|---|---|---|---|
| 1 | 1500-2000 | Intuition-focused, minimal formulas | 2 | Quick takes, social posts |
| 2 | 2500-3500 | Code + data, moderate formulas | 3 | Standard blog articles |
| 3 | 4500-6000 | Deep technical + experimental data | 4 | Deep dives, paper explainers |
| 4 | 8000+ | Full tech stack, includes derivations | 5 | Tutorials, surveys |
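The table above can be read as a lookup from `--depth` to a fixed profile. The sketch below encodes it that way; `DepthProfile` and `profile_for` are hypothetical names, and the values are copied from the table.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class DepthProfile:
    min_words: int
    max_words: Optional[int]  # None means open-ended ("8000+")
    detail: str
    iterations: int
    use_case: str


DEPTH_PROFILES: Dict[int, DepthProfile] = {
    1: DepthProfile(1500, 2000, "intuition-focused, minimal formulas", 2,
                    "quick takes, social posts"),
    2: DepthProfile(2500, 3500, "code + data, moderate formulas", 3,
                    "standard blog articles"),
    3: DepthProfile(4500, 6000, "deep technical + experimental data", 4,
                    "deep dives, paper explainers"),
    4: DepthProfile(8000, None, "full tech stack, includes derivations", 5,
                    "tutorials, surveys"),
}


def profile_for(depth: int) -> DepthProfile:
    """Return the profile for a --depth value, rejecting anything outside 1-4."""
    if depth not in DEPTH_PROFILES:
        raise ValueError(f"--depth must be one of {sorted(DEPTH_PROFILES)}, got {depth}")
    return DEPTH_PROFILES[depth]
```

Because depth is the only knob, word count and iteration budget are never set independently; everything downstream reads from the one profile.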
Phase 0: Research
Facts first, then write. autoresearch reads train.py first — same principle.
Actions (user supplies all source material — this skill does not make network requests):
1. Paper → Read the user-provided PDF/URL/clipboard text, extract core contribution, method, experimental data, limitations
2. Tech topic → Read the user-provided references, notes, or local files, extract key facts
3. Project → Read user-provided source/docs within the workspace, extract architecture, design decisions, key code
Output: research_facts.md in the current workspace directory — structured fact checklist (not an outline, not "what goes in paragraph 1")
Important: If the user has not provided source material, ask them to supply it. Do not search the web or access files outside the workspace.
Phase 1: Write / Rewrite
First round: Initial draft
- Write directly from research_facts.md, don't overthink structure
- Write backwards: core discovery/code first, background later
- Allow bad writing — a draft is raw material for evaluation
- Built-in human feel constraints (see "Iron Rules"), but don't spend time polishing
Subsequent rounds: Targeted rewrite
- Carry self-evaluation annotations from previous round
- Only fix lowest-scoring dimensions, don't rewrite everything
- Each round must show substantial change
Phase 1.5: De-AI (Humanize Pass)
This is the key step where the humanizer mechanism is embedded into the loop. It is not post-processing, but a mandatory checkpoint in every iteration.
Execution
Run AI pattern scan on Phase 1 output, check and rewrite each item:
Scan checklist (fast scan, not line-by-line):
1. Filler phrases: remove opening bromides and emphasis crutches
   - Kill: "It's worth noting," "As we all know," "Obviously," "Undoubtedly," "In this era of X"
   - Kill: "To achieve this goal" → "To do this"
   - Kill: Rhetorical questions ("So the question becomes...")
2. Overemphasis: check for exaggerated significance
   - Kill: "marks," "witnesses," "crucial," "indelible"
   - Kill: "Not only... but also...," "This isn't just... it's..."
3. AI vocabulary blacklist: replace with direct expression
   - "Furthermore" → delete or use direct connection
   - "Delve into" → "analyze" / "look at"
   - "Demonstrates" → "shows" / delete
   - "Dynamic," "rich," "profound" → specific description or delete
   - "Ever-evolving landscape" → specific context
4. Structural patterns: break formulas
   - Rule of three → use two or four items instead
   - Bold heading + colon list → blend into paragraphs
   - Dash reveal → use direct statement
   - Generic positive ending → specific next step or limitation
5. Voice injection: add human touch
   - Have opinions, don't just report facts
   - Admit uncertainty ("I'm not sure," "Honestly")
   - Mix sentence lengths (Short. Then a longer one that unfolds.)
   - Allow tangents and half-formed thoughts
Speed control
The de-AI scan must be fast. It is not line-by-line proofreading; budget about 5 minutes per pass:
- Run blacklist keyword grep first (10 seconds)
- Then fix structural issues (2 minutes)
- Finally inject voice (2 minutes)
Don't pursue perfection. Phase 2's evaluation function catches residual AI traces — if it doesn't pass, next round will handle it.
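The blacklist keyword grep in the first step could be sketched as follows. The phrase list is a subset taken from the scan checklist; `BLACKLIST` and `scan_blacklist` are hypothetical names, and matching is case-insensitive on whole phrases only (a curly apostrophe in "It’s" would not match the straight one used here).

```python
import re
from typing import Dict

# Subset of the phrases named in the scan checklist.
BLACKLIST = [
    "it's worth noting",
    "as we all know",
    "obviously",
    "undoubtedly",
    "furthermore",
    "delve into",
    "demonstrates",
    "ever-evolving landscape",
]


def scan_blacklist(text: str) -> Dict[str, int]:
    """Count occurrences of each blacklisted phrase; phrases with no hits are omitted."""
    hits: Dict[str, int] = {}
    for phrase in BLACKLIST:
        # \b keeps "obviously" from matching inside a longer word.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        count = len(re.findall(pattern, text, flags=re.IGNORECASE))
        if count:
            hits[phrase] = count
    return hits
```

An empty result is what the 90+ band of the human feel dimension calls "zero AI blacklist hits". Structural patterns such as rule-of-three lists are not covered by a keyword scan and still need the second step.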
Phase 2: Self-Evaluation
6-dimension quantitative evaluation function. Each dimension 0-100.
| Dimension | Weight | 90+ Standard | Below 50 |
|---|---|---|---|
| Information density | 20% | Nearly every sentence carries new info | Heavy padding, transitions, repetition |
| Code/data ratio | 20% | Every core claim backed by code or data | Pure prose, no verifiable evidence |
| Failure showcase | 15% | Includes "what didn't work" and specific reasons | Only shows success paths |
| Conciseness | 15% | No paragraph removable without losing information | 30%+ content can be deleted |
| Actionability | 15% | Reader can immediately verify after reading | Reader knows but can't act |
| Human feel | 15% | Sounds like a real person, has opinions and emotion | AI-scented, formulaic structure |
Human feel dimension scoring
| Score | Standard |
|---|---|
| 90+ | Unique voice and personal opinions; varied sentence length; zero AI blacklist hits; no rule-of-three / negative parallelism |
| 70-89 | Mostly natural, occasional AI traces acceptable; has opinions but not sharp enough |
| 50-69 | Formulaic structure, visible AI patterns; flat tone, no personality |
| Below 50 | Heavy AI vocabulary, rule-of-three, dash reveals, promotional language |
Composite score formula
CODEBLOCK1
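As a worked illustration of the weighting, here is a small sketch that combines six 0-100 dimension scores using the weights from the Phase 2 table. The dictionary keys and the function name `composite_score` are hypothetical; rounding to two decimals is an assumption made only to keep the output tidy.

```python
from typing import Dict

# Weights from the Phase 2 table; they sum to 1.0.
WEIGHTS: Dict[str, float] = {
    "information_density": 0.20,
    "code_data_ratio": 0.20,
    "failure_showcase": 0.15,
    "conciseness": 0.15,
    "actionability": 0.15,
    "human_feel": 0.15,
}


def composite_score(scores: Dict[str, float]) -> float:
    """Weighted sum of the six dimension scores, each expected in 0-100."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= scores[name] <= 100:
            raise ValueError(f"{name} must be in 0-100, got {scores[name]}")
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return round(total, 2)
```

For example, scores of 90, 70, 60, 80, 75, and 85 (in table order) give 18 + 14 + 9 + 12 + 11.25 + 12.75 = 77.0, which falls just short of the 80-point keep threshold.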
Self-evaluation output format
CODEBLOCK2
Phase 3: Decision
CODEBLOCK3
Early termination
- Two consecutive rounds with score difference < 5 → stop, take the higher-scoring version
- Max iterations reached → stop, take the highest-scoring version
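The two stop conditions above can be sketched as a pair of helpers operating on the list of composite scores so far. `should_stop` and `best_round` are hypothetical names; the 5-point plateau margin comes from the first rule.

```python
from typing import List


def should_stop(scores: List[float], max_iterations: int, min_gain: float = 5.0) -> bool:
    """True when the iteration budget is spent or the last two rounds have plateaued."""
    if len(scores) >= max_iterations:
        return True
    if len(scores) >= 2 and abs(scores[-1] - scores[-2]) < min_gain:
        return True
    return False


def best_round(scores: List[float]) -> int:
    """Zero-based index of the highest-scoring round (earliest round wins ties)."""
    if not scores:
        raise ValueError("no rounds have been scored yet")
    return max(range(len(scores)), key=scores.__getitem__)
```

Note that the plateau check uses the absolute difference, so a round that drops by less than 5 points also stops the loop; the higher-scoring version is still the one returned.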
Draft Log
Append after each evaluation, equivalent to autoresearch's results.tsv:
CODEBLOCK4
Draft log stays at the end of the article or as an attachment. Fully transparent, no secret recipe.
Storage limit: Only retain the current draft and the final draft_log summary. Discarded intermediate drafts are NOT saved to disk — only their scores and fix actions are recorded in the log table. This prevents accumulation of potentially sensitive content.
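One way to keep the log append-only while storing scores and fix actions but never draft text is sketched below. The column set, the file name, and `append_log_row` are assumptions for illustration, not the log format used by this skill.

```python
from pathlib import Path

# Hypothetical columns; only scores and fix actions are logged, never draft text.
HEADER = (
    "| Round | Score | Lowest dimension | Fix action | Decision |\n"
    "|---|---|---|---|---|\n"
)


def _cell(value: object) -> str:
    """Escape pipes so free-text cells cannot break the table."""
    return str(value).replace("|", "\\|")


def append_log_row(log_path: str, round_no: int, score: float,
                   lowest_dim: str, fix_action: str, decision: str) -> str:
    """Append one evaluation result to the draft log, creating the table if needed."""
    path = Path(log_path)
    if not path.exists():
        path.write_text(HEADER, encoding="utf-8")
    row = (f"| {round_no} | {score:.1f} | {_cell(lowest_dim)} "
           f"| {_cell(fix_action)} | {_cell(decision)} |\n")
    with path.open("a", encoding="utf-8") as fh:
        fh.write(row)
    return row
```

A call such as `append_log_row("draft_log.md", 1, 62.5, "human_feel", "cut blacklist phrases", "rewrite")` adds a single table row and leaves earlier rows untouched.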
Iron Rules
Enforced on every write/rewrite. These rules fuse Karpathy style with humanizer principles:
Five Iron Rules
1. Show Don't Tell: put code/data, not prose descriptions of effects
2. One thing per paragraph: delete any paragraph and information is lost
3. Experiments first: no claims without code/data/search results backing them
4. Record failures: every article must include at least one "what didn't work"
5. Zero filler: kill all filler phrases, rhetorical questions, universal summary sentences
Language rules (built-in de-AI)
Use: First person, specific numbers, code snippets, colloquial tech language, admitting ignorance, mixed sentence lengths, opinionated reactions
Don't use:
- "This article will introduce," "As we all know," "It's worth noting," "In this era of X"
- "Furthermore," "Delve into," "Demonstrates," "Dynamic," "Ever-evolving landscape"
- Rhetorical questions ("Why does this matter?" → just say why)
- Negative parallelism ("Not only... but also...")
- Rule-of-three lists (use two or four items)
- Bold heading + colon lists (blend into paragraphs)
- Generic positive endings ("What's exciting is..." → specific next step)
Voice injection
- Have opinions. "Honestly I think this direction is flawed" > "This direction has certain limitations"
- Admit complexity. "I tried three approaches, first two bombed" > "After multiple experimental validations"
- Allow tangents. Real thinking isn't linear.
- Mixed rhythm. Short sentences. Then a longer one that unfolds slowly, with a turn, and lands.
Article Structure Selection
Automatically chosen based on --depth and content type, not forced into templates:
depth 1-2 (concise output)
- Opening: one-sentence conclusion (result first)
- Core: code/data + the single most important finding
- Closing: limitation + one-sentence summary
depth 3-4 (deep output)
- Opening: one-sentence conclusion
- Background: why this matters (<=3 sentences)
- Body: minimal runnable example → expand step by step → experimental data
- Failure: what didn't work + why
- Closing: code links + limitations
Structure is a result, not a constraint.
Skill Integration
- agent-browser: If the user has already gathered research material via agent-browser, autowriter reads those results (workspace files only)
- WeChat article style guide: For WeChat publishing format requirements
No humanizer-zh post-processing needed. De-AI is built-in.
This skill does not initiate network requests. All source material must be user-provided.
Further Reading
- autoresearch design philosophy → INLINECODE2
- Karpathy code style → writing style mapping → INLINECODE3
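autowriter: Automated Writing System
The core of autoresearch is an agent loop: modify code → run → evaluate → keep/discard → loop.
autowriter maps this paradigm onto writing and embeds de-AI into the loop itself: not post-processing after writing, but writing, purifying, evaluating, and rewriting in every single iteration.
Design Philosophy
Three core principles from autoresearch, mapped to writing:
| autoresearch Principle | Writing Mapping | Mechanism |
|---|---|---|
| Automated loop | write → humanize → evaluate → rewrite loop | Agent Loop |
| Quantified evaluation | 6-dimension scoring function (with a human feel dimension) | Phase 2 |
| Failure transparency | Draft log records every discarded version | Draft Log |
Plus the humanizer's core insight: de-AI is not post-processing polish, it is part of writing quality. The evaluation function detects AI patterns directly: if the draft does not pass, rewrite, and rewrite again, until it is clean.
Agent Loop (Core Flow)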
    ┌──────────────────────────────────────────────────┐
    │  Input: topic/paper/project + --depth N          │
    │  (N=1 quick, N=2 standard, N=3 deep, N=4 survey) │
    └─────────────────────────┬────────────────────────┘
                              ▼
                    ┌────────────────────┐
                    │ Phase 0: Research  │  read user-provided material
                    └─────────┬──────────┘
                              ▼
              ┌── Loop start (max N rounds) ──┐
              │                               │
              │   ┌───────────────────────┐   │
              │   │ Phase 1: Write        │   │
              │   │ Generate full draft   │   │
              │   │ (built-in human-feel  │   │
              │   │  constraints)         │   │
              │   └───────────┬───────────┘   │
              │               ▼               │
              │   ┌───────────────────────┐   │
              │   │ Phase 1.5: De-AI      │   │
              │   │ Scan and rewrite      │   │
              │   │ AI patterns           │   │
              │   └───────────┬───────────┘   │
              │               ▼               │
              │   ┌───────────────────────┐   │
              │   │ Phase 2: Evaluate     │   │
              │   │ 6-dimension scoring   │   │
              │   └───────────┬───────────┘   │
              │               ▼               │
              │   ┌───────────────────────┐   │
              │   │ Phase 3: Decide       │   │
              │   │ score >= 80 → keep    │   │
              │   │ score < 80 → rewrite  │   │
              │   │   with annotations    │   │
              │   └───────────────────────┘   │
              │                               │
              └───────────────┬───────────────┘
                              ▼
                  ┌──────────────────────┐
                  │ Output final article │
                  │ + log                │
                  └──────────────────────┘
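--depth Knob
One parameter controls everything. No other knobs are exposed.
| depth | Words | Technical Detail | Iterations | Use Case |
|---|---|---|---|---|
| 1 | 1500-2000 | Intuition-focused, very few formulas | 2 | Quick commentary, social media posts |
| 2 | 2500-3500 | Code + data, moderate formulas | 3 | Standard blog articles |
| 3 | 4500-6000 | Deep technical + experimental data | 4 | Deep dives, paper explainers |
| 4 | 8000+ | Full tech stack, includes derivations | 5 | Tutorials, surveys |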
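Phase 0: Research
Facts first, then write. autoresearch reads train.py first; the same principle applies here.
Actions (the user supplies all source material; this skill does not make network requests):
1. Paper → Read the user-provided PDF/URL/clipboard text, extract core contribution, method, experimental data, limitations
2. Tech topic → Read the user-provided references, notes, or local files, extract key facts
3. Project → Read user-provided source/docs within the workspace, extract architecture, design decisions, key code
Output: research_facts.md in the current working directory, a structured fact checklist (not an outline, not "what goes in paragraph 1")
Important: If the user has not provided source material, ask them to supply it. Do not search the web or access files outside the workspace.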
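Phase 1: Write / Rewrite
First round: Initial draft
- Write directly from research_facts.md, don't overthink structure
- Write backwards: core discovery/code first, background later
- Allow bad writing: a draft is raw material for evaluation
- Built-in human feel constraints (see Iron Rules), but don't spend time polishing
Subsequent rounds: Targeted rewrite
- Carry the self-evaluation annotations from the previous round
- Only fix the lowest-scoring dimensions, don't rewrite everything
- Each round must show substantial change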
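Phase 1.5: De-AI (Humanize Pass)
This is the key step where the humanizer mechanism is embedded into the loop. It is not post-processing, but a mandatory checkpoint in every iteration.
Execution
Run an AI pattern scan on the Phase 1 output, then check and rewrite each item:
Scan checklist (fast scan, not line-by-line):
1. Filler phrases: remove opening bromides and emphasis crutches
   - Kill: "值得注意的是" (it's worth noting), "众所周知" (as we all know), "显然" (obviously), "毫无疑问" (undoubtedly), "在这个X时代" (in this era of X)
   - Kill: "为了实现这一目标" (to achieve this goal) → "为此" (to do this)
   - Kill: Rhetorical questions ("那么问题就变成了……", so the question becomes...)
2. Overemphasis: check for exaggerated significance
   - Kill: "标志着" (marks), "见证了" (witnesses), "关键的" (crucial), "不可磨灭的" (indelible)
   - Kill: "不仅……而且……" (not only... but also...), "这不仅仅是……更是……" (this isn't just... it's...)
3. AI vocabulary blacklist: replace with direct expression
   - "此外" (furthermore) → delete or use direct connection
   - "深入探讨" (delve into) → "分析" (analyze) / "查看" (look at)
   - "表明" (demonstrates) → "显示" (shows) / delete
   - "动态的" (dynamic), "丰富的" (rich), "深刻的" (profound) → specific description or delete
   - "不断演变的格局" (ever-evolving landscape) → specific context
4. Structural patterns: break formulas
   - Rule of three → use two or four items instead
   - Bold heading + colon list → blend into paragraphs
   - Dash reveal → use direct statement
   - Generic positive ending → specific next step or limitation
5. Voice injection: add human touch
   - Have opinions, don't just report facts
   - Admit uncertainty ("我不确定", I'm not sure; "老实说", honestly)
   - Mix sentence lengths (Short. Then a longer one that unfolds.)
   - Allow tangents and half-formed thoughts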
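Speed control
The de-AI scan must be fast. It is not line-by-line proofreading; budget 5 minutes per pass:
- Run the blacklist keyword grep first (10 seconds)
- Then fix structural issues (2 minutes)
- Finally inject voice (2 minutes)
Don't pursue perfection. Phase 2's evaluation function catches residual AI traces; if the draft doesn't pass, the next round will handle it.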
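Phase 2: Self-Evaluation
6-dimension quantitative evaluation function. Each dimension is scored 0-100.
| Dimension | Weight | 90+ Standard | Below 50 |
|---|---|---|---|
| Information density | 20% | Nearly every sentence carries new information | Heavy padding, transitions, repetition |
| Code/data ratio | 20% | Every core claim backed by code or data | Pure prose, no verifiable evidence |
| Failure showcase | 15% | Includes what didn't work and the specific reasons | Only shows success paths |
| Conciseness | 15% | No paragraph can be removed without losing information | 30%+ of the content can be deleted |
| Actionability | 15% | Reader can verify immediately after reading | Reader knows but can't act |
| Human feel | 15% | Sounds like a real person, has opinions and emotion | AI-scented, formulaic structure |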
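Human feel dimension scoring
| Score | Standard |
|---|---|
| 90+ | Unique voice and personal opinions; varied sentence length; zero AI blacklist hits; no rule-of-three / negative parallelism |
| 70-89 | Mostly natural, occasional AI traces acceptable; has opinions but not sharp enough |
| 50-69 | Formulaic structure, visible AI patterns; flat tone, no personality |
| Below 50 | Heavy AI vocabulary, rule-of-three, dash reveals, promotional language |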
Composite score formula
score = information_density * 0.20 + code_data_ratio * 0.20 + failure_showcase * 0.15
+