Content Moderation

Moderate user-generated content using Vettly's AI-powered content moderation API. This skill uses the @vettly/mcp MCP server to check text, images, and video against configurable moderation policies with auditable decisions.

Setup

Add the @vettly/mcp MCP server to your configuration:

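    {
      "mcpServers": {
        "vettly": {
          "command": "npx",
          "args": ["-y", "@vettly/mcp"],
          "env": {
            "VETTLY_API_KEY": "your-api-key"
          }
        }
      }
    }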

Get an API key at vettly.dev.
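If the configuration is generated by a script rather than edited by hand, the server entry could be built along these lines. This is a minimal sketch, assuming the server is launched with `npx -y @vettly/mcp` and reads its key from a `VETTLY_API_KEY` environment variable. The helper name `build_vettly_server_config` is hypothetical, and the location of the configuration file depends on your MCP client.

```python
import json
import os


def build_vettly_server_config(api_key: str) -> dict:
    """Return an MCP client config entry that launches @vettly/mcp via npx."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "mcpServers": {
            "vettly": {
                "command": "npx",
                "args": ["-y", "@vettly/mcp"],
                # Assumed variable name for the Vettly API key.
                "env": {"VETTLY_API_KEY": api_key},
            }
        }
    }


if __name__ == "__main__":
    # Falls back to a placeholder so the sketch runs without a real key.
    key = os.environ.get("VETTLY_API_KEY", "your-api-key")
    print(json.dumps(build_vettly_server_config(key), indent=2))
```

Reading the key from the environment keeps it out of version control; merge the printed entry into whatever file your client reads.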

Available Tools

`moderate_content`

Check text, image, or video content against a Vettly moderation policy. Returns a safety assessment with category scores, the action taken, provider used, latency, and cost.

Parameters:

- content (required) - The content to moderate (a text string, or a URL for images and video)
- policyId (required) - The ID of the policy to moderate against
- contentType (optional, default: text) - Type of content: text, image, or video
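For clients that talk to the server directly, the parameters above map onto the `arguments` object of an MCP `tools/call` request. The sketch below only builds and validates that request; it does not send it. The function name `build_moderate_content_call` and the policy ID `community-forum` are hypothetical, and the checks mirror the parameter list rather than any validation the server is known to perform.

```python
import json

ALLOWED_CONTENT_TYPES = {"text", "image", "video"}


def build_moderate_content_call(content, policy_id, content_type="text", request_id=1):
    """Build an MCP JSON-RPC `tools/call` request for moderate_content."""
    if not content:
        raise ValueError("content is required")
    if not policy_id:
        raise ValueError("policyId is required")
    if content_type not in ALLOWED_CONTENT_TYPES:
        raise ValueError(f"contentType must be one of {sorted(ALLOWED_CONTENT_TYPES)}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "moderate_content",
            "arguments": {
                "content": content,
                "policyId": policy_id,
                "contentType": content_type,
            },
        },
    }


if __name__ == "__main__":
    # "community-forum" is a made-up policy ID used only for illustration.
    request = build_moderate_content_call("Great post, thanks!", "community-forum")
    print(json.dumps(request, indent=2))
```

Validating on the client side catches a misspelled content type before a request is spent on it.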

`validate_policy`

Validate a Vettly policy YAML without saving it. Returns validation results with any syntax or configuration errors. Use this to test policy changes before deploying them.

Parameters:

- yamlContent (required) - The YAML policy content to validate

`list_policies`

List all moderation policies available in your Vettly account. Takes no parameters. Use this to discover available policy IDs before moderating content.

`get_usage_stats`

Get usage statistics for your Vettly account including request counts, costs, and moderation outcomes.

Parameters:

- days (optional, default: 30) - Number of days to include in statistics (1-365)

`get_recent_decisions`

Get recent moderation decisions with optional filtering by outcome, content type, or policy.

Parameters:

- limit (optional, default: 10) - Number of decisions to return (1-50)
- flagged (optional) - Return only flagged content (true) or only safe content (false)
- policyId (optional) - Filter by a specific policy ID
- contentType (optional) - Filter by content type: text, image, or video
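Because every filter is optional, a caller typically sends only the ones it has set. The sketch below assembles the arguments for `get_recent_decisions` under that assumption and enforces the documented 1-50 range for `limit`. The helper name `build_recent_decisions_args` is hypothetical, and whether the server rejects or clamps out-of-range values is not stated in this document.

```python
ALLOWED_CONTENT_TYPES = {"text", "image", "video"}


def build_recent_decisions_args(limit=10, flagged=None, policy_id=None, content_type=None):
    """Assemble get_recent_decisions arguments, omitting filters that are unset."""
    if not 1 <= limit <= 50:
        raise ValueError("limit must be between 1 and 50")
    if content_type is not None and content_type not in ALLOWED_CONTENT_TYPES:
        raise ValueError("contentType must be text, image, or video")

    args = {"limit": limit}
    # `flagged` is tri-state: True, False, or None for "no filter".
    if flagged is not None:
        args["flagged"] = flagged
    if policy_id is not None:
        args["policyId"] = policy_id
    if content_type is not None:
        args["contentType"] = content_type
    return args


if __name__ == "__main__":
    print(build_recent_decisions_args(limit=25, flagged=True, content_type="image"))
```

Testing `flagged is not None` instead of truthiness matters here, since `flagged=False` is a meaningful filter for safe content.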

When to Use

- Moderate user-generated content (comments, posts, uploads) before publishing
- Test and validate moderation policy YAML configs during development
- Audit recent moderation decisions to review flagged content
- Monitor moderation costs and usage across your account
- Compare moderation results across different policies

Examples

Moderate a user comment

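    Moderate this user comment against my community forum policy:
    "I hate this product, it's the worst thing I've ever used, the developers should be ashamed"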

Call list_policies to find available policies, then moderate_content with the appropriate policy ID and return the safety assessment.
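The two-step flow described above can be sketched as a small helper. The response shapes are not documented here, so this assumes `list_policies` yields a list of objects with `id` and `name` fields. `call_tool` stands in for whatever MCP client you use, and `fake_call_tool` is a stub that lets the sketch run offline.

```python
from typing import Any, Callable, Dict

# Hypothetical transport: any callable that sends a tool name plus arguments
# to the MCP server and returns the decoded result.
ToolCaller = Callable[[str, Dict[str, Any]], Any]


def moderate_with_named_policy(call_tool: ToolCaller, policy_name: str, content: str) -> Any:
    """Look up a policy by name, then moderate `content` against it."""
    # Assumed shape: list_policies yields a list of {"id": ..., "name": ...} dicts.
    policies = call_tool("list_policies", {})
    for policy in policies:
        if policy.get("name") == policy_name:
            return call_tool(
                "moderate_content",
                {"content": content, "policyId": policy["id"], "contentType": "text"},
            )
    raise LookupError(f"no policy named {policy_name!r}")


def fake_call_tool(name: str, arguments: Dict[str, Any]) -> Any:
    """Stand-in for a real MCP client so the sketch runs offline."""
    if name == "list_policies":
        return [{"id": "pol_123", "name": "community-forum"}]
    if name == "moderate_content":
        # Echo the request back instead of returning a real assessment.
        return {"tool": name, "arguments": arguments}
    raise ValueError(f"unknown tool: {name}")


if __name__ == "__main__":
    print(moderate_with_named_policy(fake_call_tool, "community-forum", "I hate this product"))
```

Raising when no policy matches keeps the caller from silently moderating against the wrong policy.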

Validate a policy before deploying

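    Validate this moderation policy YAML:

    categories:
      - name: toxicity
        threshold: 0.8
        action: flag
      - name: spam
        threshold: 0.6
        action: block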

Call validate_policy and report any syntax or configuration errors.

Review recent flagged content

    Show me all content that was flagged in the last week

Call get_recent_decisions with flagged: true to retrieve recent moderation decisions that were flagged.

Tips

- Always call list_policies first if you don't know which policy ID to use
- Use validate_policy to test policy changes before deploying to production
- Use get_usage_stats to monitor costs and catch unexpected spikes
- Filter get_recent_decisions by contentType or policyId to narrow results
- For image and video moderation, pass the content URL rather than raw data
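Since images and video are passed as URLs, a caller may want to pick `contentType` automatically. One possible heuristic, sketched below, guesses from the URL's file extension using Python's standard `mimetypes` module and falls back to `text`. The function `infer_content_type` is hypothetical and not part of the Vettly tools; URLs without a recognizable extension would still need an explicit `contentType`.

```python
import mimetypes
from urllib.parse import urlparse


def infer_content_type(content: str) -> str:
    """Guess a contentType value: image or video for media URLs, otherwise text."""
    parsed = urlparse(content)
    # Anything that is not an http(s) URL is treated as plain text.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return "text"
    # Guess from the path only, so query strings do not hide the extension.
    mime, _ = mimetypes.guess_type(parsed.path)
    if mime and mime.startswith("image/"):
        return "image"
    if mime and mime.startswith("video/"):
        return "video"
    return "text"


if __name__ == "__main__":
    for sample in (
        "https://example.com/uploads/cat.png",
        "https://example.com/clips/intro.mp4",
        "I hate this product",
    ):
        print(sample, "->", infer_content_type(sample))
```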

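Content Moderation

Moderate user-generated content with Vettly's AI-powered content moderation API. This skill uses the @vettly/mcp MCP server to check text, images, and video against configurable moderation policies and produces auditable decision records.

Setup

Add the @vettly/mcp MCP server to your configuration: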

    {
      "mcpServers": {
        "vettly": {
          "command": "npx",
          "args": ["-y", "@vettly/mcp"],
          "env": {
            "VETTLY_API_KEY": "your-api-key"
          }
        }
      }
    }

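Get an API key at vettly.dev.

Available Tools

`moderate_content`

Check text, image, or video content against a Vettly moderation policy. Returns a safety assessment that includes category scores, the action taken, the provider used, latency, and cost.

Parameters:

- content (required) - The content to moderate (a text string, or a URL for images and video)
- policyId (required) - The ID of the policy to moderate against
- contentType (optional, default: text) - Type of content: text, image, or video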

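`validate_policy`

Validate a Vettly policy YAML without saving it. Returns validation results listing any syntax or configuration errors. Use this tool to test policy changes before deploying them.

Parameters:

- yamlContent (required) - The YAML policy content to validate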

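`list_policies`

List all moderation policies available in your Vettly account. Takes no parameters. Use this tool to discover available policy IDs before moderating content.

`get_usage_stats`

Get usage statistics for your Vettly account, including request counts, costs, and moderation outcomes.

Parameters:

- days (optional, default: 30) - Number of days to include in the statistics (1-365)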

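`get_recent_decisions`

Get recent moderation decisions, optionally filtered by outcome, content type, or policy.

Parameters:

- limit (optional, default: 10) - Number of decisions to return (1-50)
- flagged (optional) - Return only flagged content (true) or only safe content (false)
- policyId (optional) - Filter by a specific policy ID
- contentType (optional) - Filter by content type: text, image, or video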

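When to Use

- Moderate user-generated content (comments, posts, uploads) before publishing
- Test and validate moderation policy YAML configs during development
- Audit recent moderation decisions to review flagged content
- Monitor moderation costs and usage across your account
- Compare moderation results across different policies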

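Examples

Moderate a user comment

    Moderate this user comment against my community forum policy:
    "I hate this product, it's the worst thing I've ever used, the developers should be ashamed"

Call list_policies to find available policies, then call moderate_content with the appropriate policy ID and return the safety assessment.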

Validate a policy before deploying

    Validate this moderation policy YAML:

    categories:
      - name: toxicity
        threshold: 0.8
        action: flag
      - name: spam
        threshold: 0.6
        action: block

Call validate_policy and report any syntax or configuration errors.

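Review recent flagged content

    Show me all content that was flagged in the last week

Call get_recent_decisions with flagged: true to retrieve recent moderation decisions that were flagged.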

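Tips

- Always call list_policies first if you are not sure which policy ID to use
- Use validate_policy to test policy changes before deploying to production
- Use get_usage_stats to monitor costs and catch unexpected spikes
- Filter get_recent_decisions by contentType or policyId to narrow results
- For image and video moderation, pass the content URL rather than raw data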
