A Model for Paired-Multinomial Data and Its Application to Analysis of Data on a Taxonomic Tree

Shi, Pixu; Li, Hongzhe

arXiv:1702.04808 (stat)

[Submitted on 15 Feb 2017]

Title: A Model for Paired-Multinomial Data and Its Application to Analysis of Data on a Taxonomic Tree

Authors: Pixu Shi, Hongzhe Li

Abstract: In human microbiome studies, sequencing read data are often summarized as counts of bacterial taxa at various taxonomic levels specified by a taxonomic tree. This paper considers the problem of analyzing two repeated measurements of microbiome data from the same subjects. Such data are often collected to assess the change in microbial composition after a certain treatment, or the difference in microbial composition across body sites. Existing models for such count data are limited in modeling the covariance structure of the counts and in handling paired multinomial count data. A new probability distribution is proposed for paired-multinomial count data, which allows a flexible covariance structure and can be used to model repeatedly measured multivariate count data. Based on this distribution, a test statistic is developed for testing the difference in compositions based on paired multinomial count data. The proposed test can be applied to count data observed on a taxonomic tree in order to test for differences in microbiome composition and to identify the subtrees with different subcompositions. Simulation results indicate that the proposed test has correct type I error rates and increased power compared with some commonly used methods. An analysis of an upper respiratory tract microbiome data set is used to illustrate the proposed methods.

Subjects:	Applications (stat.AP)
Cite as:	arXiv:1702.04808 [stat.AP]
	(or arXiv:1702.04808v1 [stat.AP] for this version)
	https://doi.org/10.48550/arXiv.1702.04808

Submission history

From: Pixu Shi
[v1] Wed, 15 Feb 2017 22:50:27 UTC (665 KB)
