
Condensed Matter > Disordered Systems and Neural Networks

arXiv:2510.18435 (cond-mat)
[Submitted on 21 Oct 2025]

Title: Overparametrization bends the landscape: BBP transitions at initialization in simple Neural Networks


Authors: Brandon Livio Annesi, Dario Bocchi, Chiara Cammarota
Abstract: High-dimensional non-convex loss landscapes play a central role in the theory of machine learning. Gaining insight into how these landscapes interact with gradient-based optimization methods, even in relatively simple models, can shed light on this enigmatic feature of neural networks. In this work, we focus on a prototypical simple learning problem that generalizes the Phase Retrieval inference problem by allowing the exploration of overparametrized settings. Using techniques from field theory, we analyze the spectrum of the Hessian at initialization and identify a Baik-Ben Arous-Péché (BBP) transition in the amount of data that separates regimes where the initialization is informative or uninformative about a planted signal in a teacher-student setup. Crucially, we demonstrate how overparametrization can bend the loss landscape, shifting the transition point, even reaching the information-theoretic weak-recovery threshold in the large-overparametrization limit, while also altering its qualitative nature. We distinguish between continuous and discontinuous BBP transitions, support our analytical predictions with simulations, and examine how they compare to the finite-N behavior. In the case of discontinuous BBP transitions, strong finite-N corrections allow the retrieval of information at a signal-to-noise ratio (SNR) smaller than the predicted BBP transition. In these cases, we provide estimates for a new, lower SNR threshold that marks the point at which the initialization becomes entirely uninformative.
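The BBP scenario the abstract describes can be probed with a small numerical experiment. The sketch below is not the authors' code: it uses the plain (non-overparametrized) phase-retrieval square loss in a teacher-student setup, computes the empirical Hessian at a random initialization, and checks whether an eigenvalue detaches from the bulk with an eigenvector correlated with the planted signal. The dimension N, the sample ratio alpha, and the normalization conventions are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

N = 400                 # input dimension (illustrative choice)
alpha = 6.0             # sample ratio M / N (illustrative choice)
M = int(alpha * N)

# Teacher-student setup: planted signal w_star with |w*|^2 = N, Gaussian data,
# noiseless labels y_mu = (x_mu . w* / sqrt(N))^2 (phase-retrieval teacher).
w_star = rng.standard_normal(N)
w_star *= np.sqrt(N) / np.linalg.norm(w_star)
X = rng.standard_normal((M, N))
y = (X @ w_star / np.sqrt(N)) ** 2

# Random (uninformative) initialization with the same norm convention.
w0 = rng.standard_normal(N)
w0 *= np.sqrt(N) / np.linalg.norm(w0)

# For the square loss l = ((h^2 - y)^2) / 4 with preactivation h = x . w / sqrt(N),
# the per-sample Hessian is [3 h^2 - y] x x^T / N.  We rescale by N so the
# bulk of the spectrum stays O(1).
h = X @ w0 / np.sqrt(N)
H = (X.T * (3.0 * h**2 - y)) @ X / M      # = N * empirical Hessian

# A BBP transition shows up as an eigenvalue detaching from the bulk, with an
# eigenvector whose overlap with w* exceeds the ~N^(-1/2) random baseline.
evals, evecs = np.linalg.eigh(H)
v_out = evecs[:, 0]                       # candidate outlier: smallest eigenvalue
overlap = abs(v_out @ w_star) / np.sqrt(N)

print(f"spectrum edges: min {evals[0]:.3f}, max {evals[-1]:.3f}")
print(f"overlap |<v_out, w*>| = {overlap:.3f}  (random baseline ~ {N**-0.5:.3f})")

Sweeping alpha (or adding label noise to lower the SNR) and tracking when this overlap detaches from the random baseline gives a rough finite-N picture of the transition; per the abstract, the paper's overparametrized generalization of this model is what bends the landscape and shifts that transition point.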
Comments: 22 pages, 7 figures
Subjects: Disordered Systems and Neural Networks (cond-mat.dis-nn); Statistical Mechanics (cond-mat.stat-mech); Spectral Theory (math.SP); Machine Learning (stat.ML)
Cite as: arXiv:2510.18435 [cond-mat.dis-nn]
  (or arXiv:2510.18435v1 [cond-mat.dis-nn] for this version)
  https://doi.org/10.48550/arXiv.2510.18435
arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Dario Bocchi
[v1] Tue, 21 Oct 2025 09:08:58 UTC (2,920 KB)
