human_test() — Real Human Feedback for AI Products

AI agents cannot judge human perception, emotion, or usability. This skill lets you call real humans to test any product URL and get structured feedback back.

What it does

1. You call human_test() with a product URL or description (the URL is optional, so this also works for mobile apps, desktop software, etc.)
2. AI auto-generates a structured test plan
3. Real human testers claim the task on the web platform
4. Each tester records their screen and microphone (up to 15 min) while completing a guided feedback flow: first impression, task steps, NPS rating
5. AI extracts key frames from each recording and uses vision AI to analyze usability issues, then aggregates all feedback into a structured report with severity-ranked findings

Setup

Option A: Hosted (zero setup)

Use the hosted version at https://human-test.work — no installation needed. Register to get an API key, then skip to Create a test task below using BASE_URL=https://human-test.work.

Option B: Self-hosted (auto-install)

human_test() can run locally. Before creating a task, check if the server is reachable:

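```bash
# Assumes BASE_URL is set in the shell, e.g. BASE_URL=http://localhost:3000
curl -s "$BASE_URL/api/config"
```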

If the server is not running, install and start it:

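```bash
npm i -g humantest-app
cd /tmp && humantest init --non-interactive && cd humantest && humantest start
```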

This auto-detects AI API keys from your environment (ANTHROPIC_API_KEY, OPENAI_API_KEY, DEEPSEEK_API_KEY, or GEMINI_API_KEY), creates a local SQLite database, builds the app, and starts it on port 3000.

A default admin user is created automatically — no registration needed.

Set BASE_URL: Ask the user once for their preferred base URL. Default: `http://localhost:3000`

Quick start

Create a test task

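```bash
# Assumes BASE_URL is set in the shell
curl -X POST "$BASE_URL/api/skill/human-test" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-product.com",
    "focus": "Test the onboarding flow",
    "maxTesters": 5,
    "creator": "agent-name"
  }'
```

Response:
```json
{
  "taskId": "cm...",
  "status": "OPEN",
  "testPlan": { "steps": [...], "nps": true, "estimatedMinutes": 10 }
}
```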

Check progress and get the report

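```bash
# Replace <taskId> with the taskId returned when the task was created
curl "$BASE_URL/api/skill/status/<taskId>"
```

Response (when completed):
```json
{
  "taskId": "cm...",
  "status": "COMPLETED",
  "submittedCount": 5,
  "report": "## Executive Summary\n...",
  "reportStatus": "COMPLETED",
  "codeFixStatus": "COMPLETED",
  "codeFixPrUrl": "https://github.com/user/repo/pull/1"
}
```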

Note for agents: If repoUrl was provided, code fix generation starts automatically after the report is ready — no need to trigger it manually. Keep polling until codeFixStatus is COMPLETED or FAILED, or use codeFixWebhookUrl to get notified.
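The polling rule in this note can be sketched as a small loop. This is a minimal sketch under stated assumptions, not the platform's own client: `fetch_status` is a hypothetical callable that returns the parsed JSON from `/api/skill/status/<taskId>`, and the 30-second interval and one-hour timeout are arbitrary defaults chosen for illustration.

```python
import time
from typing import Any, Callable, Dict

# Terminal values for codeFixStatus, as named in the note above
TERMINAL_CODE_FIX = {"COMPLETED", "FAILED"}


def is_done(status: Dict[str, Any], expects_code_fix: bool) -> bool:
    """Return True once the report (and, if requested, the code fix) is final."""
    if status.get("status") != "COMPLETED":
        return False
    if not expects_code_fix:
        return True
    return status.get("codeFixStatus") in TERMINAL_CODE_FIX


def wait_for_result(
    fetch_status: Callable[[], Dict[str, Any]],  # hypothetical: one GET of the status endpoint
    expects_code_fix: bool,
    interval_s: float = 30.0,   # assumed polling interval
    timeout_s: float = 3600.0,  # assumed overall timeout
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the task is done, or raise TimeoutError after timeout_s."""
    waited = 0.0
    while True:
        status = fetch_status()
        if is_done(status, expects_code_fix):
            return status
        if waited >= timeout_s:
            raise TimeoutError("task did not finish in time")
        sleep(interval_s)
        waited += interval_s
```

Pass `expects_code_fix=True` only when the task was created with `repoUrl`; otherwise the loop returns as soon as `status` is `COMPLETED`.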

Parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| `url` | No | none | Product URL to test (optional; leave empty for mobile apps or non-web products) |
| `title` | No | Auto from hostname | Task title |
| `focus` | No | none | What testers should focus on |
| `maxTesters` | No | 5 | Number of testers (1-50) |
| `estimatedMinutes` | No | 10 | Expected test duration |
| `creator` | No | admin | Name of the agent/user creating the task (auto-creates a user if needed) |
| `webhookUrl` | No | none | HTTPS URL to receive the report on completion |
| `codeFixWebhookUrl` | No | none | HTTPS URL to receive code fix results on completion |
| `repoUrl` | No | none | GitHub repo URL for code-level fix suggestions |
| `repoBranch` | No | repo default | Branch to analyze (only used with `repoUrl`) |
| `locale` | No | en | Report language: `en` (English) or `zh` (Chinese) |
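An agent may want to check a request body against these parameters before sending it. The sketch below is one way to do that in Python. The function name `build_task_payload` and the client-side checks are assumptions for illustration; the source does not say whether or how the server validates these fields.

```python
from typing import Any, Dict, Optional
from urllib.parse import urlparse


def build_task_payload(
    url: Optional[str] = None,
    focus: Optional[str] = None,
    max_testers: int = 5,
    estimated_minutes: int = 10,
    creator: str = "admin",
    webhook_url: Optional[str] = None,
    locale: str = "en",
) -> Dict[str, Any]:
    """Build a JSON-ready body using the documented parameter names and defaults."""
    if not 1 <= max_testers <= 50:
        raise ValueError("maxTesters must be between 1 and 50")
    if locale not in ("en", "zh"):
        raise ValueError("locale must be 'en' or 'zh'")
    if webhook_url is not None and urlparse(webhook_url).scheme != "https":
        raise ValueError("webhookUrl must be an HTTPS URL")

    payload: Dict[str, Any] = {
        "maxTesters": max_testers,
        "estimatedMinutes": estimated_minutes,
        "creator": creator,
        "locale": locale,
    }
    # Optional fields are omitted entirely when not provided
    optional = {"url": url, "focus": focus, "webhookUrl": webhook_url}
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload
```

The returned dict can be serialized with `json.dumps` and sent as the body of the POST to `/api/skill/human-test`.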

Async webhooks

There are two separate webhooks for the two stages:

Report webhook (`webhookUrl`)

If you provide a webhookUrl, the platform will POST the report to that URL when it's ready:

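```json
{
  "event": "report",
  "taskId": "...",
  "status": "COMPLETED",
  "title": "Test: example.com",
  "targetUrl": "https://example.com",
  "report": "## Executive Summary\n...",
  "completedAt": "2026-03-02T12:00:00Z"
}
```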

Code fix webhook (`codeFixWebhookUrl`)

If you provide a codeFixWebhookUrl, the platform will POST the code fix result when done:

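```json
{
  "event": "code_fix",
  "taskId": "...",
  "status": "COMPLETED",
  "title": "Test: example.com",
  "targetUrl": "https://example.com",
  "codeFixStatus": "COMPLETED",
  "codeFixPrUrl": "https://github.com/user/repo/pull/1",
  "completedAt": "2026-03-02T12:30:00Z"
}
```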
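A receiver that uses one endpoint for both stages could tell the two webhooks apart by the `event` field (`report` or `code_fix`) in the documented payloads. The following is a minimal sketch of that dispatch. `handle_webhook` and the shape of its return value are hypothetical, and it assumes the request body is the JSON payload itself with no signature or wrapper.

```python
import json
from typing import Any, Dict


def handle_webhook(body: bytes) -> Dict[str, Any]:
    """Decode a webhook body and normalize it according to its event type."""
    payload = json.loads(body)
    event = payload.get("event")
    if event == "report":
        # Stage 1: the markdown report is ready
        return {
            "kind": "report",
            "taskId": payload["taskId"],
            "report": payload.get("report", ""),
        }
    if event == "code_fix":
        # Stage 2: code fix generation finished (COMPLETED or FAILED)
        return {
            "kind": "code_fix",
            "taskId": payload["taskId"],
            "codeFixStatus": payload.get("codeFixStatus"),
            "prUrl": payload.get("codeFixPrUrl"),  # absent when no PR was opened
        }
    raise ValueError(f"unexpected event: {event!r}")
```

Any web framework can call this from its POST handler; keeping the dispatch in a plain function makes it easy to test without a running server.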

Report format (structured for AI agents)

The report is returned as a markdown string in the report field. It uses a consistent, machine-parseable structure designed for AI agents to read and act on directly — for example, to automatically file issues, create PRs, or prioritize a fix backlog.

Section structure

Every report contains these exact sections in order:

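```markdown
## Metadata
| Field | Value |
|-------|-------|
| Product | ... |
| URL | ... |
| Testers | N |
| Avg NPS | X.X/10 |

## Executive Summary
(3-5 sentences, most important findings first)

## Issues

### [CRITICAL] Issue title
- **Evidence:** (specific testers and observations)
- **Impact:** (effect on users)
- **Recommendation:** (actionable fix)

### [MAJOR] Issue title
- **Evidence:** ...
- **Impact:** ...
- **Recommendation:** ...

### [MINOR] Issue title
...

## Positive Highlights
(what works well)

## NPS Analysis
(score distribution, interpretation)

## Recommendations
- **P0 (fix immediately):** ... (references issues)
- **P1 (fix this sprint):** ...
- **P2 (next sprint):** ...
- **P3 (backlog):** ...
```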

Parsing tips for agents

- Severity levels: [CRITICAL], [MAJOR], [MINOR], always in brackets in issue headers
- Priority tags: P0, P1, P2, P3, in the Recommendations section
- Each issue has 3 fields: Evidence, Impact, Recommendation, always with bolded labels
- Metadata table: always the first section, machine-readable key-value pairs
- NPS scores: appear in Metadata (average) and NPS Analysis (per-tester breakdown)
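These tips translate fairly directly into a line-based parser. The sketch below is one possible approach rather than an official one. It assumes issue headers are markdown headings of any level containing the bracketed severity, and that field labels are written as `**Evidence:**`, `**Impact:**` and `**Recommendation:**`; the exact heading level and punctuation are assumptions, so the patterns are kept deliberately loose.

```python
import re
from typing import Dict, List, Optional

# Heading of any level with a bracketed severity, e.g. "### [MAJOR] Title"
ISSUE_HEADER = re.compile(r"^#+\s*\[(CRITICAL|MAJOR|MINOR)\]\s*(.+)$")
# Bolded field label with an optional bullet, colon inside or outside the bold
FIELD = re.compile(r"^[-*]?\s*\*\*(Evidence|Impact|Recommendation):?\*\*:?\s*(.*)$")


def parse_issues(report: str) -> List[Dict[str, str]]:
    """Extract issues (severity, title, evidence, impact, recommendation) from a report."""
    issues: List[Dict[str, str]] = []
    current: Optional[Dict[str, str]] = None
    for raw in report.splitlines():
        line = raw.strip()
        header = ISSUE_HEADER.match(line)
        if header:
            current = {"severity": header.group(1), "title": header.group(2).strip()}
            issues.append(current)
            continue
        if line.startswith("## "):
            current = None  # a new top-level section ends the current issue
            continue
        field = FIELD.match(line)
        if field and current is not None:
            current[field.group(1).lower()] = field.group(2).strip()
    return issues
```

Each returned dict carries `severity` and `title`, plus whichever of `evidence`, `impact` and `recommendation` were present, so a missing field shows up as a missing key rather than an empty string.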

Agent auto-fix workflow

The structured report format is designed for a closed-loop workflow: your agent calls human_test(), receives the report, and automatically fixes the issues found — no human intervention needed after testing.

Recommended flow

1. Call human_test() with your product URL (include webhookUrl to get notified)
2. Wait for the report (poll /api/skill/status/<taskId> or receive webhook)
3. Parse the ## Issues section: each issue has [SEVERITY], Evidence, Impact, and Recommendation
4. For [CRITICAL] and [MAJOR] issues, use the Recommendation field to generate targeted code fixes
5. Create commits or PRs for each fix
6. (Optional) Call human_test() again to verify the fixes

Each issue's Evidence tells you what went wrong, Impact tells you why it matters, and Recommendation tells you exactly what to fix. This gives your agent enough context to write a targeted fix without guessing.
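The step that selects issues for fixing can be sketched as a small planning function. It assumes issues have already been parsed into dicts with `severity`, `title`, `evidence`, `impact` and `recommendation` keys; that shape, the name `plan_fixes`, and the output format are all hypothetical and only illustrate the flow.

```python
from typing import Dict, List

SEVERITY_ORDER = {"CRITICAL": 0, "MAJOR": 1, "MINOR": 2}
FIXABLE = ("CRITICAL", "MAJOR")


def plan_fixes(issues: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Turn CRITICAL and MAJOR issues into fix tasks, most severe first."""
    fixable = [
        issue
        for issue in issues
        if issue.get("severity") in FIXABLE and issue.get("recommendation")
    ]
    # sorted() is stable, so issues of equal severity keep their report order
    fixable = sorted(fixable, key=lambda issue: SEVERITY_ORDER[issue["severity"]])
    return [
        {
            "title": f"[{issue['severity']}] {issue['title']}",
            "instruction": issue["recommendation"],  # what to fix
            "context": (
                f"Evidence: {issue.get('evidence', '')}\n"  # what went wrong
                f"Impact: {issue.get('impact', '')}"        # why it matters
            ),
        }
        for issue in fixable
    ]
```

Each task in the result maps to one commit or PR; MINOR issues are left out on purpose, matching the recommended flow.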

Repo-aware code fix suggestions

If you pass a repoUrl, the platform automatically triggers code fix generation as soon as the report is ready. It clones your repo, analyzes the code against reported issues, and produces file-level code fix suggestions (with unified diffs) appended to the report as a ## Code Fix Suggestions section.

Two modes (auto-detected)

Mode 1 — Read-only: Grant GitHub user avivahe326 read access to your repo. After the report, the platform clones the repo, analyzes the code against reported issues, and appends code-level diffs to the report.

Mode 2 — Developer access: Grant avivahe326 write access. Same as Mode 1, plus: creates a branch human-test/fixes-<taskId>, applies the diffs, pushes, and opens a PR. The PR URL is returned in the webhook payload as codeFixPrUrl and in the status API.

Example with repoUrl

CODEBLOCK9
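A request body that enables repo-aware fixes might be assembled as in the sketch below. It uses only the documented parameter names (`url`, `repoUrl`, `repoBranch`, `codeFixWebhookUrl`); the helper names and all example values are placeholders invented for illustration. The branch helper simply formats the `human-test/fixes-<taskId>` pattern described for Mode 2.

```python
import json
from typing import Any, Dict, Optional


def repo_fix_request(
    product_url: str,
    repo_url: str,
    repo_branch: Optional[str] = None,
    code_fix_webhook_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a task body that also requests code fix suggestions for repo_url."""
    body: Dict[str, Any] = {"url": product_url, "repoUrl": repo_url}
    if repo_branch is not None:
        body["repoBranch"] = repo_branch  # only meaningful together with repoUrl
    if code_fix_webhook_url is not None:
        body["codeFixWebhookUrl"] = code_fix_webhook_url
    return body


def expected_fix_branch(task_id: str) -> str:
    """Branch name the platform is described as creating in Mode 2."""
    return f"human-test/fixes-{task_id}"


if __name__ == "__main__":
    # Placeholder values only
    body = repo_fix_request(
        "https://your-product.com",
        "https://github.com/user/repo",
        repo_branch="main",
    )
    print(json.dumps(body, indent=2))
```

Omitting `repo_branch` leaves the choice to the platform, which the parameter table says defaults to the repo's default branch.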

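human_test(): Real Human Feedback for AI Products

AI agents cannot judge human perception, emotion, or usability. This skill lets you call real humans to test any product URL and get structured feedback back.

What it does

1. You call human_test() with a product URL or description (the URL is optional, so this also works for mobile apps, desktop software, etc.)
2. AI auto-generates a structured test plan
3. Real human testers claim the task on the web platform
4. Each tester records their screen and microphone (up to 15 min) while completing a guided feedback flow: first impression, task steps, NPS rating
5. AI extracts key frames from each recording and uses vision AI to analyze usability issues, then aggregates all feedback into a structured report ranked by severity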

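Setup

Option A: Hosted (zero setup)

Use the hosted version at https://human-test.work, with no installation needed. Register to get an API key, then skip to Create a test task below using BASE_URL=https://human-test.work.

Option B: Self-hosted (auto-install)

human_test() can run locally. Before creating a task, check if the server is reachable:

```bash
# Assumes BASE_URL is set in the shell, e.g. BASE_URL=http://localhost:3000
curl -s "$BASE_URL/api/config"
```

If the server is not running, install and start it:

```bash
npm i -g humantest-app
cd /tmp && humantest init --non-interactive && cd humantest && humantest start
```

This auto-detects AI API keys from your environment (ANTHROPIC_API_KEY, OPENAI_API_KEY, DEEPSEEK_API_KEY, or GEMINI_API_KEY), creates a local SQLite database, builds the app, and starts it on port 3000.

A default admin user is created automatically, so no registration is needed.

Set BASE_URL: Ask the user once for their preferred base URL. Default: `http://localhost:3000`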

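Quick start

Create a test task

```bash
# Assumes BASE_URL is set in the shell
curl -X POST "$BASE_URL/api/skill/human-test" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-product.com",
    "focus": "Test the onboarding flow",
    "maxTesters": 5,
    "creator": "agent-name"
  }'
```

Response:
```json
{
  "taskId": "cm...",
  "status": "OPEN",
  "testPlan": { "steps": [...], "nps": true, "estimatedMinutes": 10 }
}
```

Check progress and get the report

```bash
# Replace <taskId> with the taskId returned when the task was created
curl "$BASE_URL/api/skill/status/<taskId>"
```

Response (when completed):
```json
{
  "taskId": "cm...",
  "status": "COMPLETED",
  "submittedCount": 5,
  "report": "## Executive Summary\n...",
  "reportStatus": "COMPLETED",
  "codeFixStatus": "COMPLETED",
  "codeFixPrUrl": "https://github.com/user/repo/pull/1"
}
```

Note for agents: If repoUrl was provided, code fix generation starts automatically after the report is ready, so there is no need to trigger it manually. Keep polling until codeFixStatus is COMPLETED or FAILED, or use codeFixWebhookUrl to get notified.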

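Parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| `url` | No | none | Product URL to test (optional; leave empty for mobile apps or non-web products) |
| `title` | No | Auto from hostname | Task title |
| `focus` | No | none | What testers should focus on |
| `maxTesters` | No | 5 | Number of testers (1-50) |
| `estimatedMinutes` | No | 10 | Expected test duration |
| `creator` | No | admin | Name of the agent/user creating the task (auto-creates a user if needed) |
| `webhookUrl` | No | none | HTTPS URL to receive the report on completion |
| `codeFixWebhookUrl` | No | none | HTTPS URL to receive code fix results on completion |
| `repoUrl` | No | none | GitHub repo URL for code-level fix suggestions |
| `repoBranch` | No | repo default | Branch to analyze (only used with `repoUrl`) |
| `locale` | No | en | Report language: `en` (English) or `zh` (Chinese) |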

Async webhooks

There are two separate webhooks for the two stages:

Report webhook (`webhookUrl`)

If you provide a webhookUrl, the platform will POST the report to that URL when it's ready:

```json
{
  "event": "report",
  "taskId": "...",
  "status": "COMPLETED",
  "title": "Test: example.com",
  "targetUrl": "https://example.com",
  "report": "## Executive Summary\n...",
  "completedAt": "2026-03-02T12:00:00Z"
}
```

Code fix webhook (`codeFixWebhookUrl`)

If you provide a codeFixWebhookUrl, the platform will POST the code fix result when done:

```json
{
  "event": "code_fix",
  "taskId": "...",
  "status": "COMPLETED",
  "title": "Test: example.com",
  "targetUrl": "https://example.com",
  "codeFixStatus": "COMPLETED",
  "codeFixPrUrl": "https://github.com/user/repo/pull/1",
  "completedAt": "2026-03-02T12:30:00Z"
}
```

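Report format (structured for AI agents)

The report is returned as a markdown string in the report field. It uses a consistent, machine-parseable structure designed for AI agents to read and act on directly, for example to automatically file issues, create PRs, or prioritize a fix backlog.

Section structure

Every report contains these exact sections in order:

```markdown
## Metadata
| Field | Value |
|-------|-------|
| Product | ... |
| URL | ... |
| Testers | N |
| Avg NPS | X.X/10 |

## Executive Summary
(3-5 sentences, most important findings first)

## Issues

### [CRITICAL] Issue title
- **Evidence:** (specific testers and observations)
- **Impact:** (effect on users)
- **Recommendation:** (actionable fix)

### [MAJOR] Issue title
- **Evidence:** ...
- **Impact:** ...
- **Recommendation:** ...

### [MINOR] Issue title
...

## Positive Highlights
(what works well)

## NPS Analysis
(score distribution, interpretation)

## Recommendations
- **P0 (fix immediately):** ... (references issues)
- **P1 (fix this sprint):** ...
- **P2 (next sprint):** ...
- **P3 (backlog):** ...
```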

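Parsing tips for agents

- Severity levels: [CRITICAL], [MAJOR], [MINOR], always in brackets in issue headers
- Priority tags: P0, P1, P2, P3, in the Recommendations section
- Each issue has 3 fields: Evidence, Impact, Recommendation, always with bolded labels
- Metadata table: always the first section, machine-readable key-value pairs
- NPS scores: appear in Metadata (average) and NPS Analysis (per-tester breakdown)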

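Agent auto-fix workflow

The structured report format is designed for a closed-loop workflow: your agent calls human_test(), receives the report, and automatically fixes the issues found, with no human intervention needed after testing.

Repo-aware code fix suggestions

If you pass a repoUrl, the platform automatically triggers code fix generation once the report is ready. It clones your repo, analyzes the code against the reported issues, and appends file-level code fix suggestions (with unified diffs) to the end of the report as a ## Code Fix Suggestions section.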
