Nutrigenomics — Personalised Nutrition from Genetic Data

Skill ID: nutrigenomics
Version: 0.3.1
Status: Beta
Author: David de Lorenzo
Requires: Python 3.11+, pandas, numpy, matplotlib, seaborn, reportlab (optional)

What This Skill Does

The Nutrigenomics skill generates a personalised nutrition report from consumer
genetic data (23andMe or AncestryDNA raw files, or VCF). It interrogates a curated
set of nutritionally relevant SNPs drawn from the GWAS Catalog, ClinVar, and
peer-reviewed nutrigenomics literature, then translates genotype calls into
actionable dietary and supplementation guidance. All computation runs locally.

Key outputs

- Markdown nutrition report with risk scores and per-SNP genotype calls
- Radar chart of nutrient risk profile
- Gene × nutrient heatmap
- Reproducibility bundle (README_reproducibility.txt, environment.yml, checksums.txt, provenance.json)

Trigger Phrases

The Bio Orchestrator should route to this skill when the user says anything like:

- "personalised nutrition", "nutrigenomics", "diet genetics"
- "what should I eat based on my DNA"
- "nutrient metabolism", "vitamin absorption genetics"
- "MTHFR", "APOE", "FTO", "BCMO1", "VDR", "FADS1/2"
- "folate", "omega-3", "vitamin D", "caffeine metabolism", "lactose", "gluten"
- Input files: .txt or .csv (23andMe), .csv (AncestryDNA), .vcf

Curated SNP Panel

Macronutrient Metabolism

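Gene	SNP	Nutrient Impact	Evidence
FTO	rs9939609	Energy balance, fat mass, carb sensitivity	Strong (GWAS)
PPARG	rs1801282	Fat metabolism, insulin sensitivity	Moderate
APOA5	rs662799	Triglyceride response to dietary fat	Strong
TCF7L2	rs7903146	Carbohydrate metabolism, type 2 diabetes risk	Strong
ADRB2	rs1042713	Fat oxidation, exercise × diet interaction	Moderate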

Micronutrient Metabolism

Gene	SNP	Nutrient	Effect of risk allele
MTHFR	rs1801133	Folate / B12	↓ 5-MTHF conversion (~70%)
MTHFR

Omega-3 / Fatty Acid Metabolism

Gene	SNP	Nutrient	Effect
FADS1	rs174546	LC-PUFA synthesis	↑/↓ EPA/DHA from ALA
FADS2

Caffeine & Alcohol

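Gene	SNP	Compound	Effect
CYP1A2	rs762551	Caffeine	Slow/Fast metaboliser
AHR	rs4410790	Caffeine	Modulates CYP1A2 induction
ADH1B	rs1229984	Alcohol	Acetaldehyde accumulation risk
ALDH2	rs671	Alcohol	Asian flush / toxicity risk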

Food Sensitivities

Gene	SNP	Sensitivity	Effect
MCM6	rs4988235	Lactose intolerance	Non-persistence of lactase
HLA-DQ2	Proxy SNPs	Coeliac / gluten	HLA-DQA1/DQB1 risk haplotypes

Antioxidant & Detoxification

Gene	SNP	Pathway	Effect
SOD2	rs4880	Manganese SOD	↓ mitochondrial antioxidant
GPX1

Algorithm

1. Input Parsing (`parse_input.py`)

Accepts:

- 23andMe .txt or .csv (tab-separated: rsid, chromosome, position, genotype)
- AncestryDNA .csv
- Standard VCF (extracts GT field)

Auto-detects format from header lines. Normalises alleles to forward strand using
a hard-coded reference table (avoids requiring external databases).
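A minimal sketch of how header-based detection and strand normalisation could look. It is not the skill's actual `parse_input.py`: the function names are hypothetical, and it assumes that vendor files announce themselves in their leading `#` comment lines and that VCF files start with a `##fileformat=VCF...` line.

```python
from typing import Iterable

# Complement map applied when a call is reported on the reverse strand.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def detect_format(header_lines: Iterable[str]) -> str:
    """Guess the input format from the leading lines of a file (hypothetical helper)."""
    for line in header_lines:
        text = line.strip().lower()
        if text.startswith("##fileformat=vcf"):
            return "vcf"
        # Assumption: vendor raw files name the vendor in a '#' comment line.
        if text.startswith("#") and "23andme" in text:
            return "23andme"
        if text.startswith("#") and "ancestrydna" in text:
            return "ancestry"
    return "unknown"


def to_forward_strand(genotype: str, is_reverse: bool) -> str:
    """Complement each allele if the reference table marks the SNP as reverse-strand."""
    return genotype.translate(COMPLEMENT) if is_reverse else genotype
```

In this sketch the `is_reverse` flag would come from the hard-coded reference table; characters other than A, C, G and T pass through unchanged.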

2. Genotype Extraction (`extract_genotypes.py`)

For each SNP in the panel:

1. Look up the rsid in the parsed data
2. Return the genotype string (e.g. "AT", "TT", "AA")
3. Flag as "NOT_TESTED" if absent (common because of chip-to-chip variation)
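The lookup above can be sketched as a dictionary access with a fallback. This assumes the parsed data is held as a plain rsid-to-genotype mapping, which the source does not specify; the function name and the genotype values in the usage example are illustrative only.

```python
def extract_genotypes(parsed: dict[str, str], panel_rsids: list[str]) -> dict[str, str]:
    """Return one genotype per panel SNP, or "NOT_TESTED" when the rsid is absent."""
    return {rsid: parsed.get(rsid, "NOT_TESTED") for rsid in panel_rsids}


# Illustrative data: two panel SNPs are on the chip, one is not.
parsed = {"rs9939609": "AT", "rs1801133": "TT", "rs0000000": "GG"}
panel = ["rs9939609", "rs1801133", "rs762551"]
calls = extract_genotypes(parsed, panel)
```

Only rsids listed in the panel appear in the result, which matches the document's statement that the report does not reproduce the full raw genome.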

3. Risk Scoring (`score_variants.py`)

Each SNP is scored on a 0 / 0.5 / 1.0 scale:

- 0.0: homozygous reference (lowest risk)
- 0.5: heterozygous
- 1.0: homozygous risk allele

Composite Nutrient Risk Scores (0–10) are computed per nutrient domain by
summing weighted SNP scores. Weights are derived from reported effect sizes
(beta coefficients or OR) in the primary literature.

Risk categories:

- 0–3: Low risk. Standard dietary advice applies.
- 3–6: Moderate risk. Dietary optimisation recommended.
- 6–10: Elevated risk. Consider testing and targeted supplementation.
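The listed ranges share their boundary values (3 and 6), and the source does not say which category a boundary score belongs to. The sketch below assumes the lower bound of each band is inclusive; the function name is hypothetical.

```python
def risk_category(score: float) -> str:
    """Map a 0-10 composite score to a category (lower bounds assumed inclusive)."""
    if not 0 <= score <= 10:
        raise ValueError(f"score out of range: {score}")
    if score < 3:
        return "Low"
    if score < 6:
        return "Moderate"
    return "Elevated"
```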

Important caveat: These are polygenic risk indicators based on common
variants. They are not diagnostic. Rare pathogenic variants (e.g. MTHFR
compound heterozygosity with high homocysteine) require clinical confirmation.

4. Report Generation (`generate_report.py`)

Outputs a structured Markdown report with:

- Executive summary (top 3 personalised findings)
- Per-nutrient sections: genotype table → interpretation → recommendation
- Radar chart (matplotlib) of nutrient risk scores
- Gene × nutrient heatmap (seaborn)
- Supplement interactions table
- Disclaimer section
- Reproducibility block
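As an illustration of the per-nutrient sections, the sketch below renders a genotype table as Markdown. The heading level, column layout and function name are assumptions and may differ from what `generate_report.py` produces; the genotype in the usage example is illustrative.

```python
def render_nutrient_section(
    nutrient: str, score: float, rows: list[tuple[str, str, str]]
) -> str:
    """Render one report section: heading, composite score, genotype table."""
    lines = [
        f"## {nutrient}",
        "",
        f"Composite risk score: {score:.1f} / 10",
        "",
        "| Gene | SNP | Genotype |",
        "| --- | --- | --- |",
    ]
    # One table row per panel SNP in this nutrient domain.
    lines += [f"| {gene} | {rsid} | {genotype} |" for gene, rsid, genotype in rows]
    return "\n".join(lines) + "\n"


section = render_nutrient_section("Folate", 5.0, [("MTHFR", "rs1801133", "CT")])
```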

5. Reproducibility Bundle (`repro_bundle.py`)

Exports to the output directory (not committed to the repo):

- README_reproducibility.txt: step-by-step instructions to reproduce the analysis manually
- environment.yml: pinned conda environment
- checksums.txt: SHA-256 checksums of the SNP panel and output report (input file intentionally excluded to avoid persisting a fingerprint of genetic data)
- provenance.json: timestamp, version, and format arguments (input filename intentionally omitted)

Note: No executable scripts are generated. The reproducibility bundle contains
only text files for documentation and integrity verification.
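A rough sketch of how the checksum and provenance files might be written with the standard library. The function names, the JSON key names and the `sha256sum`-style line layout are assumptions; only the file names and the rule that the input file is never hashed or named come from the description above.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_bundle(out_dir: Path, panel: Path, report: Path, version: str, fmt: str) -> None:
    """Write checksums.txt and provenance.json; the genetic input file is never touched."""
    lines = [f"{sha256_of(p)}  {p.name}" for p in (panel, report)]
    (out_dir / "checksums.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")
    provenance = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "version": version,
        "format": fmt,  # no input filename is recorded
    }
    (out_dir / "provenance.json").write_text(
        json.dumps(provenance, indent=2), encoding="utf-8"
    )
```

Because `write_bundle` never receives the input path in this sketch, it cannot leak a hash or filename of the genetic data by accident.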

Execution

To run the analysis on a user-provided genetic file, execute this command directly:

CODEBLOCK0

To run a demo without real genetic data (synthetic patient file included with the skill):

CODEBLOCK1

INLINECODE28 is replaced by OpenClaw at runtime with the absolute path to this skill's folder. Do not substitute it manually. Output is written to a timestamped directory (nutrigenomics_output_YYYYMMDD_HHMMSS/) in the current working directory and persists until manually deleted.

Supported --format values: auto (default), 23andme, ancestry, vcf.
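For illustration, the directory naming and the `--format` check could be expressed as below. The helper names are hypothetical, and using local time for the timestamp is an assumption; the name pattern and the four format values are taken from the text above.

```python
from datetime import datetime

SUPPORTED_FORMATS = ("auto", "23andme", "ancestry", "vcf")


def output_dir_name(now: datetime) -> str:
    """Build the nutrigenomics_output_YYYYMMDD_HHMMSS directory name."""
    return now.strftime("nutrigenomics_output_%Y%m%d_%H%M%S")


def validate_format(value: str) -> str:
    """Reject any --format value outside the supported set."""
    if value not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {value!r}")
    return value


name = output_dir_name(datetime(2025, 1, 2, 3, 4, 5))
```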

Usage

CODEBLOCK2

File Structure

CODEBLOCK3

Note: Runtime output directories and randomly generated patient files are
excluded from version control. Only the pre-rendered demo
report in examples/output/ is committed.

Privacy

All computation runs locally — no genetic data is ever transmitted to external
servers or third-party services.

What the report contains: The Markdown report includes per-SNP genotype calls
(e.g. AT, TT) for each of the 58 panel SNPs analysed. This is intentional:
knowing your specific genotype at each nutrition-related locus is what makes the
report actionable. Full raw genome data from the input file is not reproduced in
the report; only the 58 panel SNPs are included.

File persistence: Output files (report, figures, reproducibility bundle) are
written to a timestamped nutrigenomics_output_YYYYMMDD_HHMMSS/ directory under
the working directory and persist on disk until manually deleted. The input
file is read-only and is never copied into the output directory.

If you are running this skill on behalf of others or in a shared environment,
delete the output directory once the user has downloaded their results.

Limitations & Disclaimer

1. Not a medical device. This skill provides educational, research-oriented nutrigenomics analysis. It does not constitute medical advice.
2. Common variants only. The panel covers SNPs with MAF > 1% in at least one major population. Rare pathogenic variants are out of scope.
3. Population context. Effect sizes are predominantly derived from European GWAS cohorts. Risk estimates may not generalise equally across all ancestries.
4. Gene–environment interaction. Genetic risk scores interact with baseline diet, lifestyle, microbiome, and epigenetic state. A "high risk" score does not mean a nutrient deficiency is present; it means the individual may benefit from monitoring.
5. Simpson's Paradox note. Population-level associations used to derive weights may not reflect individual trajectories (see Corpas 2025, *Nutrigenomics and the Ecological Fallacy*).

Roadmap

- [ ] v0.2: Microbiome × genotype interaction module (16S rRNA input)
- [ ] v0.3: Longitudinal tracking (compare reports across time)
- [ ] v0.4: HLA typing for immune-mediated food reactions (coeliac, gluten sensitivity)
- [ ] v1.0: Multi-omics integration (metabolomics + genomics + dietary recall)

References

This skill's SNP panel and methodology are informed by peer-reviewed nutrigenomics research. For verification and additional details, consult:

- PubMed MEDLINE: https://pubmed.ncbi.nlm.nih.gov/
- GWAS Catalog: https://www.ebi.ac.uk/gwas/ (published genome-wide association studies)
- ClinVar: https://www.ncbi.nlm.nih.gov/clinvar/ (variant interpretations)

Users are encouraged to verify specific claims through these authoritative sources and with qualified healthcare providers.

Contributing

The SNP panel (data/snp_panel.json) is maintained by the skill author.
To suggest additions or corrections, contact David de Lorenzo directly via
GitHub (@drdaviddelorenzo) or open an issue on GitHub.
