MATES: Multi-view Aggregated Two-Sample Test

Cai, Zexi; Fei, Wenbo; Zhou, Doudou

arXiv:2412.16684 (stat)

[Submitted on 21 Dec 2024]

Authors: Zexi Cai, Wenbo Fei, Doudou Zhou

Abstract: The two-sample test is a fundamental problem in statistics with a wide range of applications. In the realm of high-dimensional data, nonparametric methods have gained prominence due to their flexibility and minimal distributional assumptions. However, many existing methods tend to be more effective when the two distributions differ primarily in their first and/or second moments. In many real-world scenarios, distributional differences may arise in higher-order moments, rendering traditional methods less powerful. To address this limitation, we propose a novel framework to aggregate information from multiple moments to build a test statistic. Each moment is regarded as one view of the data and contributes to the detection of some specific type of discrepancy, thus allowing the test statistic to capture more complex distributional differences. The novel multi-view aggregated two-sample test (MATES) leverages a graph-based approach, where the test statistic is constructed from the weighted similarity graphs of the pooled sample. Under mild conditions on the multi-view weighted similarity graphs, we establish theoretical properties of MATES, including a distribution-free limiting distribution under the null hypothesis, which enables straightforward type-I error control. Extensive simulation studies demonstrate that MATES effectively distinguishes subtle differences between distributions. We further validate the method on the S&P100 data, showcasing its power in detecting complex distributional variations.
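To make the general idea of a multi-view, graph-based two-sample test concrete, the following is a minimal sketch and not the MATES statistic itself, which the abstract does not specify. It rests on several assumptions of its own: each "view" is taken to be an elementwise integer power of the standardized pooled data, each view's similarity graph is an unweighted k-nearest-neighbour graph, the views are aggregated by simply summing within-sample edge counts, and the p-value comes from label permutation rather than from the distribution-free limiting distribution the paper establishes. All function names and parameters (`moment_views`, `knn_graph`, `multiview_knn_test`, `orders`, `k`, `n_perm`) are hypothetical.

```python
import random
import statistics


def moment_views(points, orders=(1, 2, 3)):
    """Hypothetical views: elementwise integer powers of the standardized pooled data."""
    cols = list(zip(*points))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    z = [[(v - m) / s for v, m, s in zip(p, means, sds)] for p in points]
    return [[[v ** r for v in row] for row in z] for r in orders]


def knn_graph(points, k):
    """Directed k-nearest-neighbour graph under squared Euclidean distance."""
    graph = []
    for i, p in enumerate(points):
        dists = [
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        ]
        dists.sort()
        graph.append([j for _, j in dists[:k]])
    return graph


def within_sample_edges(graphs, labels):
    """Count k-NN edges, summed over all views, whose endpoints share a label."""
    return sum(
        labels[i] == labels[j]
        for graph in graphs
        for i, nbrs in enumerate(graph)
        for j in nbrs
    )


def multiview_knn_test(x, y, orders=(1, 2, 3), k=5, n_perm=500, seed=0):
    """Permutation p-value for the aggregated within-sample edge count.

    A large count means neighbours tend to come from the same sample,
    which is evidence against the null hypothesis of equal distributions.
    """
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    labels = [0] * len(x) + [1] * len(y)
    # The graphs depend only on the pooled sample, so they are built once.
    graphs = [knn_graph(view, k) for view in moment_views(pooled, orders)]
    observed = within_sample_edges(graphs, labels)
    exceed = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        if within_sample_edges(graphs, shuffled) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (n_perm + 1)
```

Calling `multiview_knn_test(x, y)` with two lists of equal-length numeric rows returns the observed edge count and a permutation p-value. Because the graphs are built from the pooled sample only, permuting the labels does not require recomputing any distances; the sketch uses only the Python standard library and is intended for small samples.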

Subjects:	Methodology (stat.ME); Statistics Theory (math.ST)
Cite as:	arXiv:2412.16684 [stat.ME]
 	(or arXiv:2412.16684v1 [stat.ME] for this version)
 	https://doi.org/10.48550/arXiv.2412.16684

Submission history

From: Zexi Cai
[v1] Sat, 21 Dec 2024 16:19:06 UTC (111 KB)
