Valid Inference for Machine Learning Model Parameters

Dey, Neil; Williams, Jonathan P.

Statistics > Machine Learning

arXiv:2302.10840v2 (stat)

[Submitted on 21 Feb 2023 (v1), last revised 9 May 2024 (this version, v2)]

Title: Valid Inference for Machine Learning Model Parameters

Authors: Neil Dey, Jonathan P. Williams

Abstract: The parameters of a machine learning model are typically learned by minimizing a loss function on a set of training data. However, this can come with the risk of overtraining; in order for the model to generalize well, it is of great importance that we are able to find the optimal parameter for the model on the entire population -- not only on the given training sample. In this paper, we construct valid confidence sets for this optimal parameter of a machine learning model, which can be generated using only the training data without any knowledge of the population. We then show that studying the distribution of this confidence set allows us to assign a notion of confidence to arbitrary regions of the parameter space, and we demonstrate that this distribution can be well-approximated using bootstrapping techniques.
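As a rough illustration of the bootstrapping idea mentioned in the abstract, the sketch below resamples a training set, refits an empirical risk minimizer on each resample, and reads off a percentile interval for the parameter. This is a generic percentile bootstrap and not the confidence-set construction of the paper; the model (a one-parameter linear fit under squared loss), the synthetic data, and every function name are assumptions made purely for illustration.

```python
import random


def fit_slope(xs, ys):
    """Empirical risk minimizer of squared loss for the model y = theta * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def bootstrap_slopes(xs, ys, n_boot=2000, seed=0):
    """Refit the slope on n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        slopes.append(fit_slope([xs[i] for i in idx], [ys[i] for i in idx]))
    return sorted(slopes)


def percentile_interval(sorted_vals, level=0.95):
    """Plain percentile interval taken from sorted bootstrap replicates."""
    alpha = (1.0 - level) / 2.0
    last = len(sorted_vals) - 1
    return sorted_vals[int(alpha * last)], sorted_vals[int((1.0 - alpha) * last)]


# Synthetic training sample (assumed): true slope 2.0, Gaussian noise.
data_rng = random.Random(1)
xs = [data_rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [2.0 * x + data_rng.gauss(0.0, 0.5) for x in xs]

theta_hat = fit_slope(xs, ys)
slopes = bootstrap_slopes(xs, ys)
lo, hi = percentile_interval(slopes)
print(f"estimate={theta_hat:.3f}, 95% percentile interval=({lo:.3f}, {hi:.3f})")
```

The interval here describes the sampling variability of the fitted parameter only; it carries none of the validity guarantees that the paper establishes for its confidence sets.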

Comments:	35 pages, 6 figures
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST)
Cite as:	arXiv:2302.10840 [stat.ML]
	(or arXiv:2302.10840v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2302.10840

Submission history

From: Neil Dey
[v1] Tue, 21 Feb 2023 17:46:08 UTC (119 KB)
[v2] Thu, 9 May 2024 20:30:32 UTC (128 KB)
