Gene Hunting with Knockoffs for Hidden Markov Models

Sesia, Matteo; Sabatti, Chiara; Candès, Emmanuel J.

doi:10.1093/biomet/asy033

arXiv:1706.04677 (stat)

[Submitted on 14 June 2017]

Title: Gene Hunting with Knockoffs for Hidden Markov Models

Authors: Matteo Sesia, Chiara Sabatti, Emmanuel J. Candès

Abstract: Modern scientific studies often require the identification of a subset of relevant explanatory variables, in an attempt to understand an interesting phenomenon. Several statistical methods have been developed to automate this task, but only recently has the framework of model-free knockoffs proposed a general solution that can perform variable selection under rigorous type-I error control, without relying on strong modeling assumptions. In this paper, we extend the methodology of model-free knockoffs to a rich family of problems where the distribution of the covariates can be described by a hidden Markov model (HMM). We develop an exact and efficient algorithm to sample knockoff copies of an HMM. We then argue that, combined with the knockoffs selective framework, they provide a natural and powerful tool for performing principled inference in genome-wide association studies with guaranteed FDR control. Finally, we apply our methodology to several datasets aimed at studying Crohn's disease and several continuous phenotypes, e.g., cholesterol levels.
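As a rough illustration of the selection step that the abstract refers to, the sketch below applies the standard knockoff+ threshold from the knockoff filter literature to a vector of feature statistics. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `knockoff_plus_select`, the input `W` (one statistic per variable, where large positive values favor the original variable over its knockoff) and the target level `q` are all hypothetical names chosen here, and the abstract does not say how the statistics are computed.

```python
def knockoff_plus_select(W, q=0.1):
    """Return the indices selected by the knockoff+ threshold at target FDR level q.

    W is an assumed list of per-variable statistics: W[j] > 0 suggests the original
    variable j looks more important than its knockoff copy, W[j] < 0 the opposite.
    """
    # Candidate thresholds are the distinct nonzero magnitudes, in ascending order.
    candidates = sorted({abs(w) for w in W if w != 0})
    for t in candidates:
        n_neg = sum(1 for w in W if w <= -t)  # knockoffs that "win" at level t
        n_pos = sum(1 for w in W if w >= t)   # originals that "win" at level t
        # Estimated false discovery proportion; the "+1" is the knockoff+ correction.
        if (1 + n_neg) / max(1, n_pos) <= q:
            return [j for j, w in enumerate(W) if w >= t]
    # No threshold meets the target level, so nothing is selected.
    return []


# Hypothetical statistics for ten variables.
W = [5, 4, 3, 2.5, -1, 2, 1.5, -0.5, 0.2, 3.5]
selected = knockoff_plus_select(W, q=0.2)
```

The loop returns at the smallest threshold whose estimated false discovery proportion is at most `q`, so the selection is as large as the target level allows.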

Comments:	35 pages, 13 figures, 9 tables
Subjects:	Methodology (stat.ME); Statistics Theory (math.ST); Applications (stat.AP)
Cite as:	arXiv:1706.04677 [stat.ME]
	(or arXiv:1706.04677v1 [stat.ME] for this version)
	https://doi.org/10.48550/arXiv.1706.04677
Journal reference:	Biometrika, Volume 106, Issue 1, 1 March 2019, Pages 1-18
Related DOI:	https://doi.org/10.1093/biomet/asy033

Submission history

From: Emmanuel Candes
[v1] Wed, 14 Jun 2017 21:42:12 UTC (320 KB)
