Moss Agent Skills
Capabilities
Moss is the real-time semantic search runtime for conversational AI. It delivers sub-10ms lookups and instant index updates that run in the browser, on-device, or in the cloud - wherever your agent lives. Agents can create indexes, embed documents, perform semantic/hybrid searches, and manage document lifecycles without managing infrastructure. The platform handles embedding generation, index persistence, and optional cloud sync - allowing agents to focus on retrieval logic rather than infrastructure.
Skills
Index Management
- - Create Index: Build a new semantic index with documents and embedding model selection
- Load Index: Load an existing index from persistent storage for querying
- Get Index: Retrieve metadata about a specific index (document count, model, etc.)
- List Indexes: Enumerate all indexes under a project
- Delete Index: Remove an index and all associated data
Document Operations
- - Add Documents: Insert or upsert documents into an existing index with optional metadata
- Get Documents: Retrieve stored documents by ID or fetch all documents
- Delete Documents: Remove specific documents from an index by their IDs
Search & Retrieval
- - Semantic Search: Query using natural language with vector similarity matching
- Keyword Search: Use BM25-based keyword matching for exact term lookups
- Hybrid Search: Blend semantic and keyword search with configurable alpha weighting (Python SDK)
- Metadata Filtering: Constrain results by document metadata (category, language, tags)
- Top-K Results: Return configurable number of best-matching documents with scores
Embedding Models
- - moss-minilm: Fast, lightweight model optimized for edge/offline use (default)
- moss-mediumlm: Higher accuracy model with reasonable performance for precision-critical use cases
SDK Methods
| JavaScript | Python | Description |
|---|
| INLINECODE0 | INLINECODE1 | Create index with documents |
| INLINECODE2 |
load_index() | Load index from storage |
|
getIndex() |
get_index() | Get index metadata |
|
listIndexes() |
list_indexes() | List all indexes |
|
deleteIndex() |
delete_index() | Delete an index |
|
addDocs() |
add_docs() | Add/upsert documents |
|
getDocs() |
get_docs() | Retrieve documents |
|
deleteDocs() |
delete_docs() | Remove documents |
|
query() |
query() | Semantic / hybrid search |
API Actions
All REST API operations go through POST /v1/manage (base URL: https://service.usemoss.dev/v1) with an action field:
| Action | Purpose | Extra required fields |
|---|
| INLINECODE21 | Get a presigned URL to upload index data | INLINECODE22 , modelId, docCount, INLINECODE25 |
| INLINECODE26 |
Trigger an index build after uploading data |
jobId |
|
getJobStatus | Check the status of an async build job |
jobId |
|
getIndex | Fetch metadata for a single index |
indexName |
|
listIndexes | Enumerate every index under the project | — |
|
deleteIndex | Remove an index record and assets |
indexName |
|
getIndexUrl | Get download URLs for a built index |
indexName |
|
addDocs | Upsert documents into an existing index |
indexName,
docs |
|
deleteDocs | Remove documents by ID |
indexName,
docIds |
|
getDocs | Retrieve stored documents (without embeddings) |
indexName |
Workflows
Basic Semantic Search Workflow
- 1. Initialize MossClient with project credentials
- Call
createIndex() with documents and model options ({ modelId: 'moss-minilm' } in JS; "moss-minilm" string in Python) - Call
loadIndex() to prepare index for queries - Call
query() with search text and topK (JS) or QueryOptions(top_k=...) (Python) - Process returned documents with scores
Hybrid Search Workflow (Python)
Hybrid blending via alpha is available in the Python SDK via QueryOptions:
- 1. Create and load index as above
- Call
query() with a QueryOptions object specifying INLINECODE56 - INLINECODE57 = pure semantic,
alpha=0.0 = pure keyword, alpha=0.6 = 60/40 blend - Default is semantic-heavy for conversational use cases
Document Update Workflow
- 1. Initialize client and ensure index exists
- Call
addDocs() with new documents (upserts by default — existing IDs are updated) - Call
deleteDocs() to remove outdated documents by ID
Voice Agent Context Injection Workflow
This is an opt-in integration pattern for voice agent pipelines — it is not automatic behavior of this skill.
- 1. Initialize MossClient and load index at agent startup
- In your application code, call
query() on each user message to retrieve relevant context - Inject search results into the LLM context before generating a response
- Respond with knowledge-grounded answer (no tool-calling latency)
Offline-First Search Workflow
- 1. Create index with documents using local embedding model
- Load index from local storage
- Query runs entirely on-device with sub-10ms latency
- Optionally sync to cloud for backup and sharing
Integration
Voice Agent Frameworks
- - LiveKit: Context injection into voice agent pipeline with
inferedge-moss SDK - Pipecat: Pipeline processor via
pipecat-moss package that auto-injects retrieval results
Context
Authentication
SDK requires project credentials:
- -
MOSS_PROJECT_ID: Project identifier from Moss Portal - INLINECODE66 : Project access key from Moss Portal
``bash theme={null}
export MOSS_PROJECT_ID=your_project_id
export MOSS_PROJECT_KEY=your_project_key
CODEBLOCK0 bash theme={null}
curl -X POST "https://service.usemoss.dev/v1/manage" \
-H "Content-Type: application/json" \
-H "x-service-version: v1" \
-H "x-project-key: moss_access_key_xxxxx" \
-d '{"action": "listIndexes", "projectId": "project_123"}'
CODEBLOCK1 typescript theme={null}
interface DocumentInfo {
id: string; // Required: unique identifier
text: string; // Required: content to embed and search
metadata?: object; // Optional: key-value pairs for filtering
}
CODEBLOCK2 typescript theme={null}
// JavaScript
import { MossClient, DocumentInfo } from '@inferedge/moss'
const client = new MossClient(process.env.MOSS_PROJECT_ID!, process.env.MOSS_PROJECT_KEY!)
await client.createIndex('faqs', docs, { modelId: 'moss-minilm' })
await client.loadIndex('faqs')
const results = await client.query('faqs', 'search text', { topK: 5 })
CODEBLOCK3 python theme={null}
# Python
import os
from inferedge_moss import MossClient, QueryOptions
client = MossClient(os.getenv('MOSS_PROJECT_ID'), os.getenv('MOSS_PROJECT_KEY'))
await client.create_index('faqs', docs, 'moss-minilm')
await client.load_index('faqs')
results = await client.query('faqs', 'search text', QueryOptions(top_k=5, alpha=0.6))
``
For additional documentation and navigation, see: https://docs.moss.dev/llms.txt
Moss 代理技能
能力概述
Moss 是面向对话式 AI 的实时语义搜索运行时。它提供亚 10 毫秒的查询速度和即时索引更新,可在浏览器、设备端或云端运行——无论您的代理部署在何处。代理可以创建索引、嵌入文档、执行语义/混合搜索以及管理文档生命周期,而无需管理基础设施。该平台负责嵌入生成、索引持久化和可选的云同步——让代理专注于检索逻辑而非基础设施。
技能
索引管理
- - 创建索引:使用文档和嵌入模型选择构建新的语义索引
- 加载索引:从持久化存储加载现有索引以供查询
- 获取索引:检索特定索引的元数据(文档数量、模型等)
- 列出索引:枚举项目下的所有索引
- 删除索引:移除索引及其所有关联数据
文档操作
- - 添加文档:向现有索引插入或更新文档,可附带元数据
- 获取文档:按 ID 检索存储的文档或获取所有文档
- 删除文档:按 ID 从索引中移除特定文档
搜索与检索
- - 语义搜索:使用自然语言进行向量相似度匹配查询
- 关键词搜索:使用基于 BM25 的关键词匹配进行精确词条查找
- 混合搜索:结合语义搜索和关键词搜索,支持可配置的 alpha 权重(Python SDK)
- 元数据过滤:按文档元数据(类别、语言、标签)约束结果
- Top-K 结果:返回可配置数量的最佳匹配文档及其评分
嵌入模型
- - moss-minilm:针对边缘/离线使用优化的快速轻量级模型(默认)
- moss-mediumlm:更高精度的模型,适用于对精度要求较高的场景,性能表现合理
SDK 方法
| JavaScript | Python | 描述 |
|---|
| createIndex() | createindex() | 使用文档创建索引 |
| loadIndex() |
loadindex() | 从存储加载索引 |
| getIndex() | get_index() | 获取索引元数据 |
| listIndexes() | list_indexes() | 列出所有索引 |
| deleteIndex() | delete_index() | 删除索引 |
| addDocs() | add_docs() | 添加/更新文档 |
| getDocs() | get_docs() | 检索文档 |
| deleteDocs() | delete_docs() | 移除文档 |
| query() | query() | 语义 / 混合搜索 |
API 操作
所有 REST API 操作均通过 POST /v1/manage(基础 URL:https://service.usemoss.dev/v1)进行,并包含 action 字段:
| 操作 | 目的 | 额外必填字段 |
|---|
| initUpload | 获取用于上传索引数据的预签名 URL | indexName, modelId, docCount, dimension |
| startBuild |
上传数据后触发索引构建 | jobId |
| getJobStatus | 检查异步构建作业的状态 | jobId |
| getIndex | 获取单个索引的元数据 | indexName |
| listIndexes | 枚举项目下的所有索引 | — |
| deleteIndex | 移除索引记录和资源 | indexName |
| getIndexUrl | 获取已构建索引的下载 URL | indexName |
| addDocs | 向现有索引更新文档 | indexName, docs |
| deleteDocs | 按 ID 移除文档 | indexName, docIds |
| getDocs | 检索存储的文档(不含嵌入) | indexName |
工作流程
基本语义搜索工作流程
- 1. 使用项目凭据初始化 MossClient
- 使用文档和模型选项调用 createIndex()(JS 中为 { modelId: moss-minilm };Python 中为 moss-minilm 字符串)
- 调用 loadIndex() 准备索引以供查询
- 使用搜索文本和 topK(JS)或 QueryOptions(top_k=...)(Python)调用 query()
- 处理返回的带评分文档
混合搜索工作流程(Python)
通过 alpha 进行的混合融合在 Python SDK 中通过 QueryOptions 提供:
- 1. 按上述方法创建并加载索引
- 使用指定了 alpha 的 QueryOptions 对象调用 query()
- alpha=1.0 = 纯语义,alpha=0.0 = 纯关键词,alpha=0.6 = 60/40 混合
- 对于对话场景,默认为语义优先
文档更新工作流程
- 1. 初始化客户端并确保索引存在
- 使用新文档调用 addDocs()(默认执行更新——现有 ID 会被更新)
- 调用 deleteDocs() 按 ID 移除过时文档
语音代理上下文注入工作流程
这是语音代理管道的可选集成模式——并非此技能的自动行为。
- 1. 在代理启动时初始化 MossClient 并加载索引
- 在您的应用程序代码中,对每条用户消息调用 query() 以检索相关上下文
- 在生成响应之前将搜索结果注入 LLM 上下文
- 基于知识的回答进行响应(无工具调用延迟)
离线优先搜索工作流程
- 1. 使用本地嵌入模型创建包含文档的索引
- 从本地存储加载索引
- 查询完全在设备端运行,延迟低于 10 毫秒
- 可选择同步到云端进行备份和共享
集成
语音代理框架
- - LiveKit:使用 inferedge-moss SDK 将上下文注入语音代理管道
- Pipecat:通过 pipecat-moss 包实现的管道处理器,可自动注入检索结果
上下文
身份验证
SDK 需要项目凭据:
- - MOSSPROJECTID:来自 Moss Portal 的项目标识符
- MOSSPROJECTKEY:来自 Moss Portal 的项目访问密钥
bash theme={null}
export MOSSPROJECTID=yourprojectid
export MOSSPROJECTKEY=yourprojectkey
REST API 在每个请求中都需要以下内容:
- - x-project-key 标头:项目访问密钥
- x-service-version: v1 标头:API 版本
- JSON 正文中的 projectId 字段
bash theme={null}
curl -X POST https://service.usemoss.dev/v1/manage \
-H Content-Type: application/json \
-H x-service-version: v1 \
-H x-project-key: mossaccesskey_xxxxx \
-d {action: listIndexes, projectId: project_123}
包安装
| 语言 | 包 | 安装命令 |
|---|
| JavaScript/TypeScript | @inferedge/moss | npm install @inferedge/moss |
| Python |
inferedge-moss | pip install inferedge-moss |
| Pipecat 集成 | pipecat-moss | pip install pipecat-moss |
文档模式
typescript theme={null}
interface DocumentInfo {
id: string; // 必需:唯一标识符
text: string; // 必需:要嵌入和搜索的内容
metadata?: object; // 可选:用于过滤的键值对
}
查询参数
| 参数 | SDK | 类型 | 默认值 | 描述 |
|---|
| indexName | JS + Python | string | — | 目标索引名称(必需) |
| query |
JS + Python | string | — | 自然语言搜索文本(必需) |
| topK | JS | number | 5 | 返回的最大结果数 |
| top_k | Python | int | 5 | 返回的最大结果数 |
| alpha | 仅 Python | float | ~0.8 | 混合权重:0.0=关键词,1.0=语义 |
| filters | JS