Optimization for Neural Operators can Benefit from Width

Cisneros-Velarde, Pedro; Shrimali, Bhavesh; Banerjee, Arindam

arXiv:2502.00705 (cs)

[Submitted on 2 Feb 2025]

Abstract: Neural operators that directly learn mappings between function spaces, such as Deep Operator Networks (DONs) and Fourier Neural Operators (FNOs), have received considerable attention. Despite the universal approximation guarantees for DONs and FNOs, there is currently no optimization convergence guarantee for learning such networks using gradient descent (GD). In this paper, we address this open problem by presenting a unified framework for optimization based on GD and applying it to establish convergence guarantees for both DONs and FNOs. In particular, we show that the losses associated with both of these neural operators satisfy two conditions, restricted strong convexity (RSC) and smoothness, that guarantee a decrease in their loss values under GD. Remarkably, these two conditions are satisfied for each neural operator for different reasons, which stem from the architectural differences between the respective models. One takeaway that emerges from the theory is that wider networks should lead to better optimization convergence for both DONs and FNOs. We present empirical results on canonical operator learning problems to support our theoretical results.
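
The abstract states that RSC and smoothness together guarantee a decrease in the loss under GD. As a rough illustration of why two such conditions suffice, here is a textbook-style sketch, not the paper's actual statement. The symbols $\theta_t$, $\eta$, $\alpha$, $\beta$, the set $Q$, and the step-size choice $\eta = 1/\beta$ are all assumptions introduced for this sketch; the paper's precise definitions, restricted sets, and constants may differ.

```latex
% Assumed setup (hypothetical notation, not taken from the paper):
%   GD update:       \theta_{t+1} = \theta_t - \eta \nabla L(\theta_t), with \eta = 1/\beta
%   Smoothness:      L(\theta') \le L(\theta_t) + \langle \theta' - \theta_t, \nabla L(\theta_t) \rangle
%                               + \tfrac{\beta}{2} \|\theta' - \theta_t\|_2^2
%   RSC on a set Q:  L(\theta') \ge L(\theta_t) + \langle \theta' - \theta_t, \nabla L(\theta_t) \rangle
%                               + \tfrac{\alpha}{2} \|\theta' - \theta_t\|_2^2
%                    for all \theta' \in Q, with 0 < \alpha \le \beta
\begin{align*}
% Step 1: plug the GD update into the smoothness bound.
L(\theta_{t+1})
  &\le L(\theta_t) - \frac{1}{\beta}\,\|\nabla L(\theta_t)\|_2^2
       + \frac{1}{2\beta}\,\|\nabla L(\theta_t)\|_2^2
   = L(\theta_t) - \frac{1}{2\beta}\,\|\nabla L(\theta_t)\|_2^2 . \\
% Step 2: lower-bound the right-hand side of RSC by its unconstrained minimum over \theta'.
L(\theta')
  &\ge L(\theta_t) - \frac{1}{2\alpha}\,\|\nabla L(\theta_t)\|_2^2
  \quad \text{for all } \theta' \in Q
  \;\Longrightarrow\;
  \|\nabla L(\theta_t)\|_2^2 \ge 2\alpha \bigl( L(\theta_t) - L_Q^{*} \bigr),
  \quad L_Q^{*} := \inf_{\theta \in Q} L(\theta) . \\
% Step 3: combine Steps 1 and 2.
L(\theta_{t+1}) - L_Q^{*}
  &\le \Bigl( 1 - \frac{\alpha}{\beta} \Bigr) \bigl( L(\theta_t) - L_Q^{*} \bigr) .
\end{align*}
```

Under these assumed conditions, each GD step shrinks the gap to the restricted optimum $L_Q^{*}$ by a factor of at most $1 - \alpha/\beta$. Read this way, the abstract's claim about width would correspond to wider networks making conditions of this kind easier to satisfy, though the exact dependence on width is established in the paper itself and is not reproduced here.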

Subjects:	Machine Learning (cs.LG); Optimization and Control (math.OC)
Cite as:	arXiv:2502.00705 [cs.LG]
	(or arXiv:2502.00705v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2502.00705

Submission history

From: Pedro Cisneros-Velarde
[v1] Sun, 2 Feb 2025 07:33:00 UTC (12,712 KB)
