Overfitting has a limitation: a model-independent generalization error bound based on Rényi entropy

Suzuki, Atsushi

arXiv:2506.00182 (stat)

[Submitted on 30 May 2025]

Authors: Atsushi Suzuki

Abstract: Will further scaling up of machine learning models continue to bring success? A significant challenge in answering this question lies in understanding generalization error, which captures the impact of overfitting. The generalization error behavior of increasingly large-scale machine learning models remains an open area of investigation, because conventional analyses often tie error bounds to model complexity and therefore fail to fully explain the success of extremely large architectures. This research introduces a novel perspective by establishing a model-independent upper bound on the generalization error that applies to algorithms whose outputs are determined solely by the data's histogram, such as empirical risk minimization or gradient-based methods. Crucially, this bound is shown to depend only on the Rényi entropy of the data-generating distribution, suggesting that a small generalization error can be maintained even with arbitrarily large models, provided the data quantity is sufficient relative to this entropy. This framework offers a direct explanation for the phenomenon in which generalization performance degrades significantly upon injecting random noise into data: the degradation is attributed to the consequent increase in the data distribution's Rényi entropy. Furthermore, we adapt the no-free-lunch theorem to be data-distribution-dependent, demonstrating that an amount of data corresponding to the Rényi entropy is indeed essential for successful learning, thereby highlighting the tightness of our proposed generalization bound.
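The link the abstract draws between noise injection and Rényi entropy can be illustrated numerically. The following is a minimal sketch, not the paper's construction: it assumes a finite discrete data distribution, models "random noise" as mixing with the uniform distribution, and treats the entropy order `alpha` as a free parameter, since the abstract does not state which order the bound uses. The function names `renyi_entropy` and `mix_with_uniform` are hypothetical.

```python
import math


def renyi_entropy(p, alpha):
    """Rényi entropy (in nats) of a discrete distribution p.

    alpha is the order (alpha >= 0). alpha == 1 is handled as the
    Shannon limit and alpha == inf as the min-entropy limit.
    """
    if any(q < 0 for q in p) or abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("p must be a probability vector")
    support = [q for q in p if q > 0]
    if alpha == 1:
        return -sum(q * math.log(q) for q in support)
    if math.isinf(alpha):
        return -math.log(max(support))
    return math.log(sum(q ** alpha for q in support)) / (1.0 - alpha)


def mix_with_uniform(p, eps):
    """Assumed noise model: with probability eps, replace the sample
    by a uniformly random element of the same finite domain."""
    n = len(p)
    return [(1.0 - eps) * q + eps / n for q in p]


if __name__ == "__main__":
    clean = [0.7, 0.2, 0.1, 0.0]  # illustrative distribution, not from the paper
    for eps in (0.0, 0.2, 0.5, 1.0):
        noisy = mix_with_uniform(clean, eps)
        print(eps, renyi_entropy(noisy, alpha=0.5))
```

Under this assumed noise model the entropy cannot decrease, because mixing with the uniform distribution produces a distribution majorized by the original and Rényi entropy is Schur-concave. In the abstract's terms, more data would then be needed to keep the generalization error small.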

Subjects: Machine Learning (stat.ML); Information Theory (cs.IT); Machine Learning (cs.LG); Statistics Theory (math.ST)
Cite as: arXiv:2506.00182 [stat.ML]
(or arXiv:2506.00182v1 [stat.ML] for this version)
https://doi.org/10.48550/arXiv.2506.00182

Submission history

From: Atsushi Suzuki
[v1] Fri, 30 May 2025 19:41:37 UTC (348 KB)
