Enabling Down Syndrome Research through a Knowledge Graph-Driven Analytical Framework

Krishnamurthy, Madan; Saha, Surya; Lo, Pierrette; Whetzel, Patricia L.; Issabekova, Tursynay; Vargas, Jamed Ferreris; DiGiovanna, Jack; Haendel, Melissa A

arXiv:2509.01565 (q-bio)

[Submitted on 1 Sep 2025]

Authors: Madan Krishnamurthy, Surya Saha, Pierrette Lo, Patricia L. Whetzel, Tursynay Issabekova, Jamed Ferreris Vargas, Jack DiGiovanna, Melissa A Haendel

Abstract: Trisomy 21 results in Down syndrome, a multifaceted genetic disorder with diverse clinical phenotypes, including heart defects, immune dysfunction, neurodevelopmental differences, and early-onset dementia risk. Heterogeneity and fragmented data across studies challenge comprehensive research and translational discovery. The NIH INCLUDE (INvestigation of Co-occurring conditions across the Lifespan to Understand Down syndromE) initiative has assembled harmonized participant-level datasets, yet realizing their potential requires integrative analytical frameworks. We developed a knowledge graph-driven platform transforming nine INCLUDE studies, comprising 7,148 participants, 456 conditions, 501 phenotypes, and over 37,000 biospecimens, into a unified semantic infrastructure. Cross-resource enrichment with Monarch Initiative data expands coverage to 4,281 genes and 7,077 variants. The resulting knowledge graph contains over 1.6 million semantic associations, enabling AI-ready analysis with graph embeddings and path-based reasoning for hypothesis generation. Researchers can query the graph via SPARQL or natural language interfaces. This framework converts static data repositories into dynamic discovery environments, supporting cross-study pattern recognition, predictive modeling, and systematic exploration of genotype-phenotype relationships in Down syndrome.
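As a rough illustration of the path-based reasoning mentioned in the abstract, the following minimal sketch indexes a handful of subject-predicate-object triples and finds the shortest chain of associations between two entities with a breadth-first search. Every identifier, predicate name, and triple here is hypothetical and invented for illustration; none is taken from the INCLUDE or Monarch data, and this is not the authors' implementation.

```python
from collections import defaultdict, deque

# Hypothetical triples; identifiers and predicates are illustrative only,
# not taken from the INCLUDE or Monarch data.
TRIPLES = [
    ("participant:P1", "has_condition", "condition:C1"),
    ("participant:P2", "has_condition", "condition:C1"),
    ("condition:C1", "has_phenotype", "phenotype:PH1"),
    ("gene:G1", "associated_with", "phenotype:PH1"),
    ("gene:G1", "has_variant", "variant:V1"),
]


def build_adjacency(triples):
    """Index triples as a map: node -> list of (predicate, neighbor)."""
    adjacency = defaultdict(list)
    for subject, predicate, obj in triples:
        adjacency[subject].append((predicate, obj))
        # Store an inverse edge so paths can be followed in either direction.
        adjacency[obj].append((f"inverse_{predicate}", subject))
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first search for the shortest association chain.

    Returns a list alternating node, predicate, node, ..., or None
    if the two entities are not connected.
    """
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for predicate, neighbor in adjacency[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [predicate, neighbor])
    return None


if __name__ == "__main__":
    graph = build_adjacency(TRIPLES)
    print(shortest_path(graph, "participant:P1", "gene:G1"))
```

In this toy graph, the participant reaches the gene through a shared condition and phenotype, which is the kind of multi-hop link a researcher might then inspect as a candidate hypothesis. A production system would more likely express such traversals as SPARQL property paths against a triple store.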

Subjects:	Quantitative Methods (q-bio.QM); Artificial Intelligence (cs.AI); Databases (cs.DB); Machine Learning (cs.LG)
Cite as:	arXiv:2509.01565 [q-bio.QM]
	(or arXiv:2509.01565v1 [q-bio.QM] for this version)
	https://doi.org/10.48550/arXiv.2509.01565

Submission history

From: Madan Krishnamurthy
[v1] Mon, 1 Sep 2025 15:50:38 UTC (2,275 KB)
