Statistics > Machine Learning

arXiv:2506.00182 (stat)
[Submitted on 30 May 2025]

Title: Overfitting has a limitation: a model-independent generalization error bound based on Rényi entropy

Authors: Atsushi Suzuki
Abstract: Will further scaling up of machine learning models continue to bring success? A significant challenge in answering this question lies in understanding generalization error, i.e., the impact of overfitting. The generalization behavior of increasingly large-scale machine learning models remains a significant area of investigation, as conventional analyses often link error bounds to model complexity and therefore fail to fully explain the success of extremely large architectures. This research introduces a novel perspective by establishing a model-independent upper bound on generalization error applicable to algorithms whose outputs are determined solely by the data's histogram, such as empirical risk minimization or gradient-based methods. Crucially, this bound is shown to depend only on the Rényi entropy of the data-generating distribution, suggesting that a small generalization error can be maintained even with arbitrarily large models, provided the data quantity is sufficient relative to this entropy. This framework offers a direct explanation for the phenomenon where generalization performance degrades significantly upon injecting random noise into the data: the degradation is attributed to the consequent increase in the data distribution's Rényi entropy. Furthermore, we adapt the no-free-lunch theorem to be data-distribution-dependent, demonstrating that an amount of data corresponding to the Rényi entropy is indeed essential for successful learning, thereby highlighting the tightness of our proposed generalization bound.
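The quantity the bound hinges on is the Rényi entropy of the data-generating distribution, H_α(P) = (1/(1−α)) log Σ_i p_i^α, which is standard; the snippet below is a minimal numerical sketch (not from the paper) illustrating the noise-injection claim in the abstract: mixing uniform noise into a distribution spreads probability mass and raises its order-2 Rényi entropy. The specific distributions are invented for illustration.

```python
import numpy as np

def renyi_entropy(p, alpha=2.0):
    """Rényi entropy of order alpha (in nats), for alpha != 1."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum()  # normalize to a probability vector
    return np.log(np.sum(p ** alpha)) / (1.0 - alpha)

# A distribution concentrated on a few outcomes (illustrative).
clean = np.array([0.7, 0.2, 0.1])

# Inject random noise: mix with a uniform distribution over 10 outcomes.
noisy = 0.5 * np.pad(clean, (0, 7)) + 0.5 * np.full(10, 0.1)

# Noise injection increases the Rényi entropy, so by the paper's bound
# more data would be needed to keep generalization error small.
assert renyi_entropy(noisy) > renyi_entropy(clean)
```

For a uniform distribution over k outcomes, H_α collapses to log k for every order α, so the entropy directly measures the effective support size the data must cover.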
Subjects: Machine Learning (stat.ML); Information Theory (cs.IT); Machine Learning (cs.LG); Statistics Theory (math.ST)
Cite as: arXiv:2506.00182 [stat.ML]
  (or arXiv:2506.00182v1 [stat.ML] for this version)
  https://doi.org/10.48550/arXiv.2506.00182
arXiv-issued DOI via DataCite

Submission history

From: Atsushi Suzuki
[v1] Fri, 30 May 2025 19:41:37 UTC (348 KB)
