Intern PubChem Name Conversion
Convert one molecular representation into all three fields:
- `smiles`
- `iupac`
- `formula`
When to use
Use this skill when the user asks to:
- convert IUPAC <-> SMILES
- fetch molecular formula from IUPAC/SMILES
- validate molecule identity against PubChem
Do not use this skill for:
- reaction mechanism explanation
- quantum chemistry simulation
- docking or property prediction beyond PubChem identifiers
Input contract
Expect one input value and one type:
- `input_type`: `iupac` or `smiles`
- `input_value`: raw string
If the user gives only one string without type:
- treat strings containing several SMILES-specific symbols (`=`, `#`, `[`, `]`, `@`) as `smiles`
- otherwise treat the string as an `iupac`/name query
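The type-guessing rule above can be sketched as a small helper. This is only an illustration: the function name `guess_input_type` and the `min_markers` threshold of 2 are assumptions, since the rule says "many" without giving a number.

```python
# Characters the rule above lists as hints that a string is SMILES.
SMILES_MARKERS = "=#[]@"


def guess_input_type(value: str, min_markers: int = 2) -> str:
    """Guess "smiles" or "iupac" for an untyped input string.

    min_markers is an assumed threshold, not a value given by this skill.
    """
    # Count every occurrence of every marker character in the string.
    hits = sum(value.count(ch) for ch in SMILES_MARKERS)
    return "smiles" if hits >= min_markers else "iupac"


print(guess_input_type("C[C@H](N)C(=O)O"))  # smiles
print(guess_input_type("ethanol"))          # iupac
```

The heuristic is deliberately coarse. A marker-free SMILES such as `CCO` is sent to the name query, and a name containing brackets can be misread as SMILES, so an explicit `input_type` is preferable whenever the user supplies one.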
Required behavior
Always query PubChem first. Do not answer from memory when tools are available.
1) URL-encode the full input string:
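    ENCODED=$(python3 -c 'import urllib.parse,sys; print(urllib.parse.quote(sys.argv[1], safe=""))' "$INPUT_VALUE")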
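For callers already working in Python, the same encoding can be done in-process. The sketch below shows why `safe=""` matters: by default `urllib.parse.quote` leaves `/` unescaped, and `/` appears in SMILES with double-bond stereo notation. The helper name `encode_for_path` is hypothetical.

```python
from urllib.parse import quote


def encode_for_path(value: str) -> str:
    """Percent-encode a string for use as a single URL path segment."""
    # safe="" also escapes "/", which quote() would otherwise leave as-is.
    return quote(value, safe="")


print(quote("F/C=C/F"))                        # F/C%3DC/F  (slashes survive)
print(encode_for_path("F/C=C/F"))              # F%2FC%3DC%2FF
print(encode_for_path("2-methylpropan-1-ol"))  # 2-methylpropan-1-ol
```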
2) Build the primary endpoint:
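- If `input_type == iupac`:
  - `https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/{ENCODED}/property/SMILES,IUPACName,MolecularFormula/JSON`
- If `input_type == smiles`:
  - `https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/smiles/{ENCODED}/property/SMILES,IUPACName,MolecularFormula/JSON`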
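As a minimal sketch, steps 1 and 2 can be combined into one function that encodes the value and picks the endpoint. The name `primary_url` and the constants `BASE` and `PROPS` are illustrative; only the URL layout comes from this skill.

```python
from urllib.parse import quote

BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound"
PROPS = "SMILES,IUPACName,MolecularFormula"


def primary_url(input_type: str, input_value: str) -> str:
    """Build the primary property endpoint for an iupac or smiles input."""
    if input_type not in ("iupac", "smiles"):
        raise ValueError(f"unsupported input_type: {input_type!r}")
    # IUPAC/name queries use the "name" namespace; SMILES uses "smiles".
    namespace = "name" if input_type == "iupac" else "smiles"
    encoded = quote(input_value, safe="")
    return f"{BASE}/{namespace}/{encoded}/property/{PROPS}/JSON"


print(primary_url("iupac", "ethanol"))
```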
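3) If the smiles primary endpoint returns a non-200 status, retry once with:
- `https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/fastidentity/smiles/{ENCODED}/property/SMILES,IUPACName,MolecularFormula/JSON`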
4) If the response is still non-200, fall back to a CID lookup:
- iupac: `.../compound/name/{ENCODED}/cids/JSON`
- smiles: `.../compound/smiles/{ENCODED}/cids/JSON`
- Then fetch properties by CID:
  - `.../compound/cid/{CID}/property/SMILES,IUPACName,MolecularFormula/JSON`
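The ordering of steps 2 to 4 can be sketched as below. Several things are assumptions rather than part of this skill: the helper names `resolve` and `http_fetch`, the injectable `fetch` callable (used so the chain can be exercised without network access), reading `...` as the same base URL as in step 2, and the `IdentifierList.CID` shape of the CID response.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Optional, Tuple
from urllib.parse import quote

BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound"
PROPS = "property/SMILES,IUPACName,MolecularFormula/JSON"

# Hypothetical fetcher type: URL in, (HTTP status, parsed JSON or None) out.
Fetch = Callable[[str], Tuple[int, Optional[dict]]]


def http_fetch(url: str) -> Tuple[int, Optional[dict]]:
    """Assumed default fetcher; network errors other than HTTP errors propagate."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status, json.load(resp)
    except urllib.error.HTTPError as err:
        return err.code, None


def resolve(input_type: str, input_value: str, fetch: Fetch = http_fetch) -> Optional[dict]:
    """Return the first 200 property payload, or None if every attempt fails."""
    encoded = quote(input_value, safe="")
    namespace = "name" if input_type == "iupac" else "smiles"

    # Step 2: primary endpoint. Step 3: one fastidentity retry, SMILES only.
    attempts = [f"{BASE}/{namespace}/{encoded}/{PROPS}"]
    if input_type == "smiles":
        attempts.append(f"{BASE}/fastidentity/smiles/{encoded}/{PROPS}")
    for url in attempts:
        status, body = fetch(url)
        if status == 200:
            return body

    # Step 4: resolve to a CID, then fetch properties for the first CID.
    status, body = fetch(f"{BASE}/{namespace}/{encoded}/cids/JSON")
    if status != 200 or not body:
        return None
    cids = body.get("IdentifierList", {}).get("CID", [])  # assumed response shape
    if not cids:
        return None
    status, body = fetch(f"{BASE}/cid/{cids[0]}/{PROPS}")
    return body if status == 200 else None
```

Passing a stub `fetch` that returns canned `(status, body)` pairs makes it easy to check that the retry and fallback happen in the intended order.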
5) Parse PropertyTable.Properties[0] and map:
- `smiles` <- `SMILES` (fallback `ConnectivitySMILES`)
- `iupac` <- `IUPACName`
- `formula` <- `MolecularFormula`
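A possible shape for the mapping step is shown below. The function name `map_properties` and the sample payload are illustrative only; the payload is hand-written to exercise the `ConnectivitySMILES` fallback and is not a captured PubChem response.

```python
def map_properties(payload: dict) -> dict:
    """Map the first PubChem property record onto the three output fields."""
    props = payload["PropertyTable"]["Properties"][0]
    return {
        # Prefer SMILES; fall back to ConnectivitySMILES when it is absent or empty.
        "smiles": props.get("SMILES") or props.get("ConnectivitySMILES", ""),
        "iupac": props.get("IUPACName", ""),
        "formula": props.get("MolecularFormula", ""),
    }


# Hand-written sample payload (an assumption for illustration).
sample = {
    "PropertyTable": {
        "Properties": [
            {
                "CID": 702,
                "MolecularFormula": "C2H6O",
                "ConnectivitySMILES": "CCO",
                "IUPACName": "ethanol",
            }
        ]
    }
}
print(map_properties(sample))
```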
Output format
Return JSON only (no markdown fences, no extra prose):
    {
      "smiles": "...",
      "iupac": "...",
      "formula": "..."
    }
If all attempts fail, still return the same schema with empty strings:
    {
      "smiles": "",
      "iupac": "",
      "formula": ""
    }
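One way to guarantee that the success and failure cases share a schema is to serialize through a single function. This is a sketch; the name `to_output` and the choice to drop unknown keys are assumptions.

```python
import json
from typing import Optional

# The three keys of the output schema, in output order.
EMPTY = {"smiles": "", "iupac": "", "formula": ""}


def to_output(record: Optional[dict] = None) -> str:
    """Serialize a record to the three-field schema; missing values become ""."""
    record = record or {}
    # Keep only the schema keys, in schema order, and coerce None to "".
    result = {key: str(record.get(key) or "") for key in EMPTY}
    return json.dumps(result, ensure_ascii=False)


print(to_output({"smiles": "CCO", "iupac": "ethanol", "formula": "C2H6O"}))
print(to_output(None))  # {"smiles": "", "iupac": "", "formula": ""}
```

Because values are passed through unchanged apart from the `None`-to-empty-string coercion, this stays compatible with keeping PubChem values verbatim.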
Quality rules
- Keep PubChem values verbatim; do not rewrite or normalize names.
- If multiple records are returned, use the first record consistently.
- Do not silently swap stereochemistry markers.