Parametric MMD Estimation with Missing Values: Robustness to Missingness and Data Model Misspecification

Chérief-Abdellatif, Badr-Eddine; Näf, Jeffrey

arXiv:2503.00448 (stat)

[Submitted on 1 March 2025]

Title: Parametric MMD Estimation with Missing Values: Robustness to Missingness and Data Model Misspecification

Authors: Badr-Eddine Chérief-Abdellatif, Jeffrey Näf

Abstract: In the missing data literature, the Maximum Likelihood Estimator (MLE) is celebrated for its ignorability property under missing at random (MAR) data. However, its sensitivity to misspecification of the (complete) data model, even under MAR, remains a significant limitation. This issue is further exacerbated by the fact that the MAR assumption may not always be realistic, introducing an additional source of potential misspecification through the missingness mechanism. To address this, we propose a novel M-estimation procedure based on the Maximum Mean Discrepancy (MMD), which is provably robust to both model misspecification and deviations from the assumed missingness mechanism. Our approach offers strong theoretical guarantees and improved reliability in complex settings. We establish the consistency and asymptotic normality of the estimator under missingness completely at random (MCAR), provide an efficient stochastic gradient descent algorithm, and derive error bounds that explicitly separate the contributions of model misspecification and missingness bias. Furthermore, we analyze missing not at random (MNAR) scenarios where our estimator maintains controlled error, including a Huber setting where both the missingness mechanism and the data model are contaminated. Our contributions refine the understanding of the limitations of the MLE and provide a robust and principled alternative for handling missing data.
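
For readers unfamiliar with the MMD, the following is a minimal sketch of the general idea of minimum-MMD estimation, not the authors' procedure (the abstract describes a stochastic gradient descent algorithm whose details are not given on this page). Everything here is an assumption made for illustration: a one-dimensional Gaussian location model N(theta, 1), a Gaussian kernel, a complete-case treatment that simply drops missing entries (which presumes MCAR), a grid search in place of gradient descent, and the hypothetical function names `gaussian_kernel`, `mmd_squared` and `fit_location_min_mmd`.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel k(x, y) = exp(-(x - y)^2 / (2 h^2))."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two 1-D samples."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))
    return mean_kernel(xs, xs) - 2.0 * mean_kernel(xs, ys) + mean_kernel(ys, ys)


def fit_location_min_mmd(values, grid, bandwidth=1.0):
    """Minimum-MMD fit of theta in the assumed model N(theta, 1).

    Missing entries (None) are dropped, i.e. a complete-case fit that
    presumes MCAR. For this location model the only theta-dependent term of
    MMD^2(P_theta, P_n) is -2 * mean_i E[k(X, y_i)], and E[k(X, y)] is
    proportional to exp(-(y - theta)^2 / (2 (h^2 + 1))), so minimising the
    MMD over theta amounts to maximising the score below.
    """
    observed = [v for v in values if v is not None]
    scale = 2.0 * (bandwidth ** 2 + 1.0)

    def score(theta):
        return sum(math.exp(-((y - theta) ** 2) / scale) for y in observed)

    return max(grid, key=score)


# Toy data: five points centred at 0, two missing entries, one gross outlier.
data = [-1.0, -0.5, 0.0, 0.5, 1.0, None, None, 50.0]
grid = [-5.0 + 0.1 * i for i in range(601)]  # candidate theta values in [-5, 55]

theta_hat = fit_location_min_mmd(data, grid)
observed = [v for v in data if v is not None]
sample_mean = sum(observed) / len(observed)  # the Gaussian MLE of the location
print(theta_hat, sample_mean)
```

On this toy data the minimum-MMD fit stays near the bulk of the observations, while the sample mean is pulled toward the outlier; this is the kind of robustness to data-model contamination that the abstract refers to. The sketch does not address robustness to the missingness mechanism.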

Subjects:	Methodology (stat.ME)
Cite as:	arXiv:2503.00448 [stat.ME]
	(or arXiv:2503.00448v1 [stat.ME] for this version)
	https://doi.org/10.48550/arXiv.2503.00448

Submission history

From: Jeffrey Näf
[v1] Sat, 1 Mar 2025 10:59:17 UTC (802 KB)
