Numerical Analysis

Showing new listings for Monday, 21 July 2025

Total of 25 entries

New submissions (showing 10 of 10 entries)

[1] arXiv:2507.13480 [Chinese pdf, pdf, html, other]
Title: Multiresolution local smoothness detection in non-uniformly sampled multivariate signals
Sara Avesani, Gianluca Giacchi, Michael Multerer
Subjects: Numerical Analysis (math.NA); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Inspired by edge detection based on the decay behavior of wavelet coefficients, we introduce a (near) linear-time algorithm for detecting the local regularity in non-uniformly sampled multivariate signals. Our approach quantifies regularity within the framework of microlocal spaces introduced by Jaffard. The central tool in our analysis is the fast samplet transform, a distributional wavelet transform tailored to scattered data. We establish a connection between the decay of samplet coefficients and the pointwise regularity of multivariate signals. As a byproduct, we derive decay estimates for functions belonging to classical Hölder spaces and Sobolev-Slobodeckij spaces. While traditional wavelets are effective for regularity detection in low-dimensional structured data, samplets demonstrate robust performance even for higher dimensional and scattered data. To illustrate our theoretical findings, we present extensive numerical studies detecting local regularity of one-, two- and three-dimensional signals, ranging from non-uniformly sampled time series over image segmentation to edge detection in point clouds.
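
The classical idea that the abstract builds on, locating a loss of regularity through unusually large wavelet coefficients, can be sketched in a few lines. The snippet below is only an illustration under simplifying assumptions: it uses the plain Haar wavelet on uniformly spaced samples, not the samplet transform of the paper, and the test signal, the jump position 0.3, and the helper names `haar_details` and `locate_largest_detail` are all hypothetical.

```python
import math


def haar_details(samples):
    """Finest-level Haar detail coefficients of an even-length sequence."""
    return [
        (samples[2 * k] - samples[2 * k + 1]) / math.sqrt(2)
        for k in range(len(samples) // 2)
    ]


def locate_largest_detail(samples):
    """Return (pair index, coefficient) of the largest-magnitude detail."""
    details = haar_details(samples)
    k = max(range(len(details)), key=lambda i: abs(details[i]))
    return k, details[k]


# Smooth sine wave plus a unit jump at x = 0.3, sampled uniformly on [0, 1).
n = 256
signal = [
    math.sin(2 * math.pi * i / n) + (1.0 if i / n >= 0.3 else 0.0)
    for i in range(n)
]

k, coeff = locate_largest_detail(signal)
print(f"largest detail at pair {k}, covering x in [{2 * k / n:.4f}, {(2 * k + 1) / n:.4f}]")
print(f"coefficient magnitude: {abs(coeff):.3f}")
```

On the smooth part of the signal the detail coefficients scale with the sample spacing, while the coefficient straddling the jump stays of order one, which is what makes it stand out. A jump that falls between two Haar pairs would only show up on coarser levels, so a practical detector inspects several levels.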

[2] arXiv:2507.13516 [Chinese pdf, pdf, html, other]
Title: A priori error analysis of the proximal Galerkin method
Brendan Keith, Rami Masri, Marius Zeinhofer
Subjects: Numerical Analysis (math.NA)

The proximal Galerkin (PG) method is a finite element method for solving variational problems with inequality constraints. It has several advantages, including constraint-preserving approximations and mesh independence. This paper presents the first abstract a priori error analysis of PG methods, providing a general framework to establish convergence and error estimates. As applications of the framework, we demonstrate optimal convergence rates for both the obstacle and Signorini problems using various finite element subspaces.

[3] arXiv:2507.13589 [Chinese pdf, pdf, html, other]
Title: Quantifying Ocular Surface Changes with Contact Lens Wear
Lucia Carichino, Kara L. Maki, David S. Ross, Riley K. Supple, Evan Rysdam
Comments: 35 pages and 14 figures, submitted
Subjects: Numerical Analysis (math.NA); Biological Physics (physics.bio-ph)

Over 140 million people worldwide and over 45 million people in the United States wear contact lenses; an estimated 12%-27.4% of contact lens users stop wearing them due to discomfort. Contact lens mechanical interactions with the ocular surface have been found to affect the ocular surface. The mechanical interactions between the contact lens and the eye are difficult to measure and calculate in the clinical setting, and the research in this field is limited. This paper presents the first mathematical model that couples the interaction between the contact lens and the open eye, where the contact lens configuration, the contact lens suction pressure, and the deformed ocular shape are all emergent properties of the model. The non-linear coupling between the contact lens and the eye is achieved by assuming that the suction pressure under the lens is applied directly to the ocular surface, neglecting the post-lens tear film layer. The contact lens dynamics are modeled using a previously published model. We consider a homogeneous and a heterogeneous linear elastic eye model, different ocular shapes, different lens shapes and lens thickness profiles, and extract lens deformation, lens suction pressure profiles, and ocular deformations and stresses for all the scenarios considered. The model predicts higher ocular deformations and stresses at the center of the eye and in the limbal/scleral region. Accounting for heterogeneous material parameters of the eye increases such deformations and stresses. The ocular displacements and stresses increase non-linearly as we increase the stiffness of the contact lens. Inserting a steeper contact lens on the eye results in a reduction of the ocular displacement at the center of the eye and a larger displacement at the edge of the contact lens. The model predictions are compared to experimental data and previously developed mathematical models.

[4] arXiv:2507.13640 [Chinese pdf, pdf, other]
Title: Interpolation in Polynomial Spaces of p-Degree
Phil-Alexander Hofmann, Damar Wicaksono, Michael Hecht
Subjects: Numerical Analysis (math.NA)

We recently introduced the Fast Newton Transform (FNT), an algorithm for performing multivariate Newton interpolation in downward closed polynomial spaces of spatial dimension $m$. In this work, we analyze the FNT in the context of a specific family of downward closed sets $A_{m,n,p}$, defined as all multi-indices with $\ell^p$ norm less than $n$ with $p \in [0,\infty]$. These sets induce the downward closed polynomial space $\Pi_{m,n,p}$, within which the FNT algorithm achieves a time complexity of $\mathcal{O}(|A_{m,n,p}|mn)$. We show that this setting, compared to tensor product spaces, yields an improvement in complexity by a factor $\rho_{m,n,p}$, which decays super exponentially with increasing spatial dimension when $m \lesssim n^p$. Additionally, we demonstrate the construction of the hierarchical scheme employed by the FNT and showcase its performance to compute activity scores in sensitivity analysis.
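
To make the index sets $A_{m,n,p}$ concrete, the following sketch enumerates them by brute force and compares their size with the full tensor-product grid. It is not the FNT itself. The inclusive bound $\|\alpha\|_p \le n$ is an assumption (the abstract only says "less than $n$"), only $p > 0$ and $p = \infty$ are handled, and the function names `lp_multi_indices` and `is_downward_closed` are hypothetical.

```python
from itertools import product


def lp_multi_indices(m, n, p):
    """All multi-indices alpha in N^m with ||alpha||_p <= n (assumed inclusive)."""
    if p != float("inf") and p <= 0:
        raise ValueError("this sketch only handles p > 0 or p = inf")
    indices = []
    for alpha in product(range(n + 1), repeat=m):
        if p == float("inf"):
            inside = max(alpha) <= n  # always true on this grid: the tensor case
        else:
            # Compare p-th powers to avoid taking roots; tiny slack for float p.
            inside = sum(a ** p for a in alpha) <= n ** p * (1 + 1e-12)
        if inside:
            indices.append(alpha)
    return indices


def is_downward_closed(indices):
    """Check that lowering any positive entry by one stays inside the set."""
    index_set = set(indices)
    for alpha in index_set:
        for i, a in enumerate(alpha):
            if a > 0:
                lowered = alpha[:i] + (a - 1,) + alpha[i + 1:]
                if lowered not in index_set:
                    return False
    return True


m, n = 4, 6
tensor_size = (n + 1) ** m
for p in (1, 2, float("inf")):
    size = len(lp_multi_indices(m, n, p))
    print(f"p={p}: |A| = {size}, fraction of tensor grid = {size / tensor_size:.3f}")
```

The brute-force loop visits all $(n+1)^m$ grid points, so it is only meant for small $m$ and $n$; its purpose is to show how quickly the $\ell^p$ sets shrink relative to the tensor grid as $p$ decreases.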

[5] arXiv:2507.13644 [Chinese pdf, pdf, html, other]
Title: Multiphysics embedding localized orthogonal decomposition for thermomechanical coupling problems
Yuzhou Nan, Yajun Wang, Changqing Ye, Xiaofei Guan
Subjects: Numerical Analysis (math.NA)

Multiscale modeling and analysis of multiphysics coupling processes in highly heterogeneous media present significant challenges. In this paper, we propose a novel multiphysics embedding localized orthogonal decomposition (ME-LOD) method for solving thermomechanical coupling problems, which also provides a systematic approach to address intricate coupling effects in multiphysical systems. Unlike the standard localized orthogonal decomposition (LOD) method that constructs separate multiscale spaces for each physical field, the proposed method features a unified construction for both displacement and temperature. Compared to the standard LOD method, our approach achieves operator stability reconstruction through orthogonalization while preserving computational efficiency. Several numerical experiments demonstrate that the ME-LOD method outperforms the traditional LOD method in accuracy, particularly in cases with significant contrasts in material properties.

[6] arXiv:2507.13731 [Chinese pdf, pdf, html, other]
Title: Pass-efficient Randomized Algorithms for Low-rank Approximation of Quaternion Matrices
Salman Ahmadi-Asl, Malihe Nobakht Kooshkghazi, Valentin Leplat
Subjects: Numerical Analysis (math.NA)

Randomized algorithms for low-rank approximation of quaternion matrices have gained increasing attention in recent years. However, existing methods overlook pass efficiency (the ability to limit the number of passes over the input matrix), which is critical in modern computing environments dominated by communication costs. We address this gap by proposing a suite of pass-efficient randomized algorithms that let users directly trade pass budget for approximation accuracy. Our contributions include: (i) a family of arbitrary-pass randomized algorithms for low-rank approximation of quaternion matrices that operate under a user-specified number of matrix views, and (ii) a pass-efficient extension of block Krylov subspace methods that accelerates convergence for matrices with slowly decaying spectra. Furthermore, we establish spectral norm error bounds showing that the expected approximation error decays exponentially with the number of passes. Finally, we validate our framework through extensive numerical experiments and demonstrate its practical relevance across multiple applications, including quaternionic data compression, matrix completion, image super-resolution, and deep learning.

[7] arXiv:2507.13836 [Chinese pdf, pdf, html, other]
Title: Newton's method for nonlinear mappings into vector bundles Part II: Application to variational problems
Laura Weigl, Ronny Bergmann, Anton Schiela
Subjects: Numerical Analysis (math.NA); Differential Geometry (math.DG)

We consider the solution of variational equations on manifolds by Newton's method. These problems can be expressed as root finding problems for mappings from infinite dimensional manifolds into dual vector bundles. We derive the differential geometric tools needed for the realization of Newton's method, equipped with an affine covariant damping strategy. We apply Newton's method to a couple of variational problems and show numerical results.

[8] arXiv:2507.13855 [Chinese pdf, pdf, html, other]
Title: A stochastic column-block gradient descent method for solving nonlinear systems of equations
Naiyu Jiang, Wendi Bao, Lili Xing, Weiguo Li
Subjects: Numerical Analysis (math.NA)

In this paper, we propose a new stochastic column-block gradient descent method for solving nonlinear systems of equations. It uses a descent direction and an approximately optimal step size obtained by solving an optimization problem. We provide a thorough convergence analysis and derive an upper bound for the convergence rate of the new method. Numerical experiments demonstrate that the proposed method outperforms existing ones.

[9] arXiv:2507.13902 [Chinese pdf, pdf, html, other]
Title: Deep Micro Solvers for Rough-Wall Stokes Flow in a Heterogeneous Multiscale Method
Emanuel Ström, Anna-Karin Tornberg, Ozan Öktem
Subjects: Numerical Analysis (math.NA); Machine Learning (stat.ML)

We propose a learned precomputation for the heterogeneous multiscale method (HMM) for rough-wall Stokes flow. A Fourier neural operator is used to approximate local averages over microscopic subsets of the flow, which allows computing an effective slip length of the fluid away from the roughness. The network is designed to map from the local wall geometry to the Riesz representors for the corresponding local flow averages. With such a parameterisation, the network only depends on the local wall geometry and as such can be trained independently of boundary conditions. We perform a detailed theoretical analysis of the statistical error propagation, and prove that under suitable regularity and scaling assumptions, a bounded training loss leads to a bounded error in the resulting macroscopic flow. We then demonstrate on a family of test problems that the learned precomputation performs stably with respect to the scale of the roughness. The accuracy in the HMM solution for the macroscopic flow is comparable to when the local (micro) problems are solved using a classical approach, while the computational cost of solving the micro problems is significantly reduced.

[10] arXiv:2507.13955 [Chinese pdf, pdf, other]
Title: Convergence rates of curved boundary element methods for the 3D Laplace and Helmholtz equations
Luiz Maltez Faria, Pierre Marchand, Hadrien Montanelli
Subjects: Numerical Analysis (math.NA)

We establish improved convergence rates for curved boundary element methods applied to the three-dimensional (3D) Laplace and Helmholtz equations with smooth geometry and data. Our analysis relies on a precise analysis of the consistency errors introduced by the perturbed bilinear and sesquilinear forms. We illustrate our results with numerical experiments in 3D based on basis functions and curved triangular elements up to order four.

Cross submissions (showing 5 of 5 entries)

[11] arXiv:2507.10739 (cross-list from quant-ph) [Chinese pdf, pdf, html, other]
Title: Quantum Wave Atom Transforms
Marianna Podzorova, Yi-Kai Liu
Comments: 45 pages, 12 figures
Subjects: Quantum Physics (quant-ph)

This paper constructs the first quantum algorithm for wavelet packet transforms with a tree structure, sometimes called wave atom transforms. Classically, wave atoms are used to construct sparse representations of differential operators, which enable fast numerical algorithms for partial differential equations. Compared to previous work, our quantum algorithm can implement a larger class of wavelet and wave atom transforms, by using an efficient representation for a larger class of possible tree structures. Our quantum implementation has $O(\mathrm{poly}(n))$ gate complexity for the transform of dimension $2^n$, while classical implementations have $O(n 2^n)$ floating point operations. The result can be used to improve existing quantum algorithms for solving hyperbolic partial differential equations.

[12] arXiv:2507.13459 (cross-list from cs.CE) [Chinese pdf, pdf, html, other]
Title: Graph Neural Network Surrogates for Contacting Deformable Bodies with Necessary and Sufficient Contact Detection
Vijay K. Dubey (1), Collin E. Haese (1), Osman Gültekin (1), David Dalton (2), Manuel K. Rausch (1), Jan N. Fuhg (1) ((1) The University of Texas at Austin, (2) University of Glasgow)
Subjects: Computational Engineering, Finance, and Science (cs.CE); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Numerical Analysis (math.NA)

Surrogate models for the rapid inference of nonlinear boundary value problems in mechanics are helpful in a broad range of engineering applications. However, effective surrogate modeling of applications involving the contact of deformable bodies, especially in the context of varying geometries, is still an open issue. In particular, existing methods are confined to rigid body contact or, at best, contact between rigid and soft objects with well-defined contact planes. Furthermore, they employ contact or collision detection filters that serve as a rapid test but use only the necessary and not sufficient conditions for detection. In this work, we present a graph neural network architecture that utilizes continuous collision detection and, for the first time, incorporates sufficient conditions designed for contact between soft deformable bodies. We test its performance on two benchmarks, including a problem in soft tissue mechanics of predicting the closed state of a bioprosthetic aortic valve. We find a regularizing effect on adding additional contact terms to the loss function, leading to better generalization of the network. These benefits hold for simple contact at similar planes and element normal angles, and complex contact at differing planes and element normal angles. We also demonstrate that the framework can handle varying reference geometries. However, such benefits come with high computational costs during training, resulting in a trade-off that may not always be favorable. We quantify the training cost and the resulting inference speedups on various hardware architectures. Importantly, our graph neural network implementation results in up to a thousand-fold speedup for our benchmark problems at inference.

[13] arXiv:2507.13475 (cross-list from math.OC) [Chinese pdf, pdf, html, other]
Title: Expansive Natural Neural Gradient Flows for Energy Minimization
Wolfgang Dahmen, Wuchen Li, Yuankai Teng, Zhu Wang
Comments: 40 pages, 19 figures
Subjects: Optimization and Control (math.OC); Numerical Analysis (math.NA)

This paper develops expansive gradient dynamics in deep neural network-induced mapping spaces. Specifically, we generate tools and concepts for minimizing a class of energy functionals in an abstract Hilbert space setting covering a wide scope of applications such as PDEs-based inverse problems and supervised learning. The approach hinges on a Hilbert space metric in the full diffeomorphism mapping space, which could be viewed as a generalized Wasserstein-2 metric. We then study a projection gradient descent method within deep neural network parameterized sets. More importantly, we develop an adaptation and expanding strategy to step-by-step enlarge the deep neural network structures. In particular, the expansion mechanism aims to enhance the alignment of the neural manifold induced natural gradient direction as well as possible with the ideal Hilbert space gradient descent direction leveraging the fact that we can evaluate projections of the Hilbert space gradient. We demonstrate the efficacy of the proposed strategy for several simple model problems for energies arising in the context of supervised learning, model reduction, or inverse problems. In particular, we highlight the importance of assembling the neural flow matrix based on the inner product for the ambient Hilbert space. The actual algorithms are the simplest specifications of a broader spectrum based on a correspondingly wider discussion, postponing a detailed analysis to forthcoming work.

[14] arXiv:2507.13492 (cross-list from physics.comp-ph) [Chinese pdf, pdf, html, other]
Title: On the time integration for phase field modeling of grain growth in additive manufacturing
Chaoqian Yuan, Chinnapat Panwisawas, Ye Lu
Subjects: Computational Physics (physics.comp-ph); Numerical Analysis (math.NA)

Phase field simulations play a key role in the understanding of microstructure evolution in additive manufacturing. However, they have been found to be extremely computationally expensive. One of the reasons is the small time step required to resolve the complex microstructure evolution during the rapid solidification process. This paper investigates the possibility of using a class of stabilized time integration algorithms to accelerate such phase field simulations by increasing the time steps. The specific time integration formulation and theoretical analysis on energy stability were developed, based on a phase field model dedicated to simulating rapid solidification in additive manufacturing. The numerical results confirmed that the proposed method can ensure the numerical stability and a decreasing energy requirement for the phase field simulations with time steps at least two orders of magnitude larger than those of conventional explicit methods. 2D and 3D phase field simulations have been conducted with relevant physical and kinetic parameters for 316L stainless steels. This work provides a numerical framework for efficient phase field simulations and opens numerous opportunities for large-scale phase field modeling.

[15] arXiv:2507.13804 (cross-list from math.OC) [Chinese pdf, pdf, html, other]
Title: Gradient descent avoids strict saddles with a simple line-search method too
Andreea-Alexandra Muşat, Nicolas Boumal
Comments: 38 pages
Subjects: Optimization and Control (math.OC); Dynamical Systems (math.DS); Numerical Analysis (math.NA)

It is known that gradient descent (GD) on a $C^2$ cost function generically avoids strict saddle points when using a small, constant step size. However, no such guarantee existed for GD with a line-search method. We provide one for a modified version of the standard Armijo backtracking method with generic, arbitrarily large initial step size. In contrast to previous works, our analysis does not require a globally Lipschitz gradient. We extend this to the Riemannian setting (RGD), assuming the retraction is real analytic (though the cost function still only needs to be $C^2$). In closing, we also improve guarantees for RGD with a constant step size in some scenarios.
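
For readers unfamiliar with the line search mentioned above, here is a minimal sketch of gradient descent with standard Armijo backtracking. It shows the textbook rule, not the modified version analyzed in the paper, and the parameter names (`alpha0`, `c`, `tau`) and the test function are illustrative assumptions. The cost $f(x, y) = x^2 + y^4/4 - y^2$ has a strict saddle at the origin and minimizers at $(0, \pm\sqrt{2})$.

```python
import math


def armijo_gradient_descent(f, grad, x0, alpha0=1.0, c=1e-4, tau=0.5,
                            tol=1e-8, max_iter=10_000):
    """Gradient descent where each step size is found by Armijo backtracking."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        g_norm_sq = sum(gi * gi for gi in g)
        if math.sqrt(g_norm_sq) < tol:
            break
        fx = f(x)
        alpha = alpha0  # every iteration restarts from the same initial step
        while True:
            trial = [xi - alpha * gi for xi, gi in zip(x, g)]
            # Sufficient decrease: f(trial) <= f(x) - c * alpha * ||grad||^2.
            if f(trial) <= fx - c * alpha * g_norm_sq:
                break
            alpha *= tau  # shrink the step and try again
        x = trial
    return x


def cost(v):
    x, y = v
    return x * x + y ** 4 / 4 - y * y


def cost_grad(v):
    x, y = v
    return [2 * x, y ** 3 - 2 * y]


# Start close to the saddle at the origin, but not exactly on its stable manifold.
solution = armijo_gradient_descent(cost, cost_grad, [1.0, 0.1])
print("approximate minimizer:", solution)
```

Starting on the line $y = 0$ instead would keep every iterate on that line and drive the method into the saddle, which is why avoidance results of this kind are stated for generic initial points.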

Replacement submissions (showing 10 of 10 entries)

[16] arXiv:1910.09297 (replaced) [Chinese pdf, pdf, html, other]
Title: Two efficient block preconditioners for the mass-conserved Ohta-Kawasaki equation
Juan Zhang, Shifeng Li, Kai Jiang
Comments: 28 pages, 9 figures
Subjects: Numerical Analysis (math.NA)

In this paper, we propose two efficient block preconditioners to solve the mass-conserved Ohta-Kawasaki equation with finite element discretization. We also study the spectral distribution of these two preconditioners, i.e., the Schur complement preconditioner and the modified Hermitian and skew-Hermitian splitting (MHSS) preconditioner. In addition, Newton's method and the Picard method are used to handle the implicit nonlinear term. We rigorously analyze the convergence of Newton's method. Finally, we offer numerical examples to support the theoretical analysis and indicate the efficiency of the proposed preconditioners for the mass-conserved Ohta-Kawasaki equation.

[17] arXiv:2310.15457 (replaced) [Chinese pdf, pdf, html, other]
Title: An Iteratively Decoupled Algorithm for Multiple-Network Poroelastic Model with Applications in Brain Edema Simulations
Mingchao Cai, Meng Lei, Jingzhi Li, Jiaao Sun, Feng Wang
Comments: To be submitted; replaces the old version
Subjects: Numerical Analysis (math.NA)

In this work, we present an iteratively decoupled algorithm for solving the quasi-static multiple-network poroelastic model. Our approach employs a total-pressure-based formulation with solid displacement, total pressure, and network pressures as primary unknowns. This reformulation decomposes the original problem into a generalized Stokes problem and a parabolic problem, offering key advantages such as reduced elastic locking effects and simplified discretization. The algorithm guarantees unconditional convergence to the solution of the fully coupled system. Numerical experiments demonstrate the accuracy, efficiency, and robustness of the method with respect to physical parameters and discretization. We further apply the algorithm to simulate the brain edema process, showcasing its practical utility in biomechanical modeling.

[18] arXiv:2501.02183 (replaced) [Chinese pdf, pdf, html, other]
Title: Data-Driven Reduced-Order Models for Port-Hamiltonian Systems with Operator Inference
Yuwei Geng, Lili Ju, Boris Kramer, Zhu Wang
Comments: 28 pages, 13 figures
Subjects: Numerical Analysis (math.NA)

Hamiltonian operator inference has been developed in [Sharma, H., Wang, Z., Kramer, B., Physica D: Nonlinear Phenomena, 431, p.133122, 2022] to learn structure-preserving reduced-order models (ROMs) for Hamiltonian systems. The method constructs a low-dimensional model using only data and knowledge of the functional form of the Hamiltonian. The resulting ROMs preserve the intrinsic structure of the system, ensuring that the mechanical and physical properties of the system are maintained. In this work, we extend this approach to port-Hamiltonian systems, which generalize Hamiltonian systems by including energy dissipation, external input, and output. Based on snapshots of the system's state and output, together with the information about the functional form of the Hamiltonian, reduced operators are inferred through optimization and are then used to construct data-driven ROMs. To further alleviate the complexity of evaluating nonlinear terms in the ROMs, a hyper-reduction method via discrete empirical interpolation is applied. Accordingly, we derive error estimates for the ROM approximations of the state and output. Finally, we demonstrate the structure preservation, as well as the accuracy of the proposed port-Hamiltonian operator inference framework, through numerical experiments on a linear mass-spring-damper problem and a nonlinear Toda lattice problem.

[19] arXiv:2502.04589 (replaced) [Chinese pdf, pdf, html, other]
Title: PASE: A Massively Parallel Augmented Subspace Eigensolver for Large Scale Eigenvalue Problems
Yangfei Liao, Haochen Liu, Hehu Xie, Zijing Wang
Comments: 25 pages, 6 figures
Subjects: Numerical Analysis (math.NA)

In this paper, we present a novel parallel augmented subspace method and build a package, the Parallel Augmented Subspace Eigensolver (PASE), for solving large scale eigenvalue problems arising from massively parallel finite element discretization. Based on the augmented subspace, solving high dimensional eigenvalue problems can be transformed into solving the corresponding linear equations and low dimensional eigenvalue problems on the augmented subspace. Thus the complexity of solving the eigenvalue problems by the augmented subspace method will be comparable to that of solving linear equations of the same dimension. In order to improve the scalability and efficiency, we also present some implementation techniques for the parallel augmented subspace method. Based on the parallel augmented subspace method and these implementation techniques, the package PASE is built for solving large scale eigenvalue problems. Some numerical examples are provided to validate the efficiency and scalability of the proposed numerical methods.

[20] arXiv:2502.06158 (replaced) [Chinese pdf, pdf, html, other]
Title: Efficient numerical method for the Schrödinger equation with high-contrast potentials
Xingguang Jin, Liu Liu, Xiang Zhong, Eric T. Chung
Subjects: Numerical Analysis (math.NA)

In this paper, we study the Schrödinger equation in the semiclassical regime and with multiscale potential function. We develop the so-called constraint energy minimization generalized multiscale finite element method (CEM-GMsFEM), in the framework of Crank-Nicolson (CN) discretization in time. The localized multiscale basis functions are constructed by addressing the spectral problem and a constrained energy minimization problem related to the Hamiltonian norm. A first-order convergence in the energy norm and second-order convergence in the $L^2$ norm for our numerical scheme are shown, with a relation between oversampling number in the CEM-GMsFEM method, spatial mesh size and the semiclassical parameter provided. Furthermore, we demonstrate the convergence of the proposed Crank-Nicolson CEM-GMsFEM scheme. The convergence requires $H/\sqrt{\Lambda}=O(\varepsilon^{\frac{5}{4}})$, $\Delta t=O(\varepsilon^{\frac{5}{4}})$ if $\varepsilon\leq \delta$; while if $\delta<\varepsilon$, the convergence requires $H/\sqrt{\Lambda}=O(\varepsilon^{\frac{1}{4}}\delta)$, $\Delta t=O(\frac{\delta^2}{\varepsilon^{3/4}})$ (where $H$ represents the maximum diameter of coarse elements, $\Lambda$ is the minimal eigenvalue associated with the eigenvector not included in the auxiliary space, $\Delta t$ is the time step, $0 < \varepsilon\ll 1$ is the Planck constant and $\delta$ describes the multiscale structure of the potential). Several numerical examples, including 1D and 2D in space, with high-contrast potential are conducted to demonstrate the efficiency and accuracy of our proposed scheme.
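
The Crank-Nicolson time discretization named in the abstract can be illustrated on a one-dimensional model problem. The sketch below assumes the scaling $i\varepsilon\,\partial_t u = -\tfrac{\varepsilon^2}{2}\,\partial_{xx} u + V u$ with homogeneous Dirichlet boundary conditions, and it replaces the CEM-GMsFEM spatial discretization of the paper with plain second-order finite differences. The potential, initial wave packet, and all parameter values are illustrative assumptions.

```python
import cmath
import math


def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[i] multiplies x[i-1], upper[i] multiplies x[i+1]."""
    n = len(diag)
    c = [0j] * n
    d = [0j] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0j] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def crank_nicolson_step(u, potential, eps, h, dt):
    """One step of (I + i*r*H) u_new = (I - i*r*H) u_old with r = dt / (2*eps)."""
    n = len(u)
    r = dt / (2.0 * eps)
    off = -eps ** 2 / (2.0 * h ** 2)  # off-diagonal entry of the discrete Hamiltonian H
    h_diag = [eps ** 2 / h ** 2 + v for v in potential]
    rhs = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0j
        right = u[i + 1] if i < n - 1 else 0j
        rhs.append((1 - 1j * r * h_diag[i]) * u[i] - 1j * r * off * (left + right))
    a_off = [1j * r * off] * n
    a_diag = [1 + 1j * r * hd for hd in h_diag]
    return thomas_solve(a_off, a_diag, a_off, rhs)


def l2_norm(u, h):
    return math.sqrt(h * sum(abs(z) ** 2 for z in u))


eps, delta = 0.1, 0.05   # semiclassical parameter and potential scale (assumed values)
n = 200                  # number of interior grid points on (0, 1)
h = 1.0 / (n + 1)
xs = [(i + 1) * h for i in range(n)]
potential = [1.0 + math.cos(2 * math.pi * x / delta) for x in xs]

# Gaussian wave packet with an oscillatory phase, normalized in the discrete L2 norm.
u = [math.exp(-((x - 0.5) ** 2) / (2 * 0.05 ** 2)) * cmath.exp(1j * x / eps) for x in xs]
scale = l2_norm(u, h)
u = [z / scale for z in u]

dt = 0.01
for _ in range(50):
    u = crank_nicolson_step(u, potential, eps, h, dt)

print("discrete L2 norm after 50 steps:", l2_norm(u, h))
```

Because the discrete Hamiltonian is real and symmetric, each Crank-Nicolson step is a unitary map, so the discrete $L^2$ norm is conserved up to rounding. That conservation says nothing about accuracy, which still requires the mesh size and time step to resolve $\varepsilon$ and $\delta$, as the conditions in the abstract indicate.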

[21] arXiv:2502.13445 (replaced) [Chinese pdf, pdf, html, other]
Title: An Efficient Iterative Decoupling Method for Thermo-Poroelasticity Based on a Four-Field Formulation
Mingchao Cai, Jingzhi Li, Ziliang Li, Qiang Liu
Comments: Submitted to a journal; accepted
Subjects: Numerical Analysis (math.NA)

This paper studies the thermo-poroelasticity model. By introducing an intermediate variable, we transform the original three-field model into a four-field model. Building upon this four-field model, we present both a coupled finite element method and a decoupled iterative finite element method. We prove the stability and optimal convergence of the coupled finite element method. Furthermore, we establish the convergence of the decoupled iterative method. The paper focuses primarily on the analysis of the iterative decoupled algorithm and shows that its convergence does not require any additional assumptions on the physical parameters or the stabilization parameters. Numerical results are provided to demonstrate the effectiveness and theoretical validity of these new methods.

[22] arXiv:2402.13670 (replaced) [Chinese pdf, pdf, other]
Title: The Riemannian Convex Bundle Method
Ronny Bergmann, Roland Herzog, Hajg Jasa
Subjects: Optimization and Control (math.OC); Differential Geometry (math.DG); Numerical Analysis (math.NA)

We introduce the convex bundle method to solve convex, non-smooth optimization problems on Riemannian manifolds of bounded sectional curvature. Each step of our method is based on a model that involves the convex hull of previously collected subgradients, parallel transported to the current serious iterate. This approach generalizes the dual form of classical bundle subproblems in Euclidean space. We prove that, under mild conditions, the convex bundle method converges to a minimizer. Several numerical examples implemented using Manopt.jl illustrate the performance of the proposed method and compare it to the subgradient method, the cyclic proximal point algorithm, and the proximal bundle method.

[23] arXiv:2405.12182 (replaced) [Chinese pdf, pdf, html, other]
Title: Nearest Neighbors GParareal: Improving Scalability of Gaussian Processes for Parallel-in-Time Solvers
Guglielmo Gattiglio, Lyudmila Grigoryeva, Massimiliano Tamborrino
Subjects: Computation (stat.CO); Distributed, Parallel, and Cluster Computing (cs.DC); Numerical Analysis (math.NA)

With the advent of supercomputers, multi-processor environments and parallel-in-time (PinT) algorithms offer ways to solve initial value problems for ordinary and partial differential equations (ODEs and PDEs) over long time intervals, a task often infeasible with sequential solvers within realistic time frames. A recent approach, GParareal, combines Gaussian processes with traditional PinT methodology (Parareal) to achieve faster parallel speed-ups. The method is known to outperform Parareal for low-dimensional ODEs and a limited number of computer cores. Here, we present Nearest Neighbors GParareal (nnGParareal), a novel data-enriched PinT integration algorithm. nnGParareal builds upon GParareal by improving its scalability for higher-dimensional systems and larger processor counts. Through data reduction, the model complexity is reduced from cubic to log-linear in the sample size, yielding a fast and automated procedure to integrate initial value problems over long time intervals. First, we provide both an upper bound for the error and theoretical details on the speed-up benefits. Then, we empirically illustrate the superior performance of nnGParareal, compared to GParareal and Parareal, on nine different systems with unique features (e.g., stiff, chaotic, high-dimensional, or challenging-to-learn systems).
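For orientation, the classical Parareal iteration that GParareal and nnGParareal build on can be sketched as follows. This is a minimal illustration of plain Parareal only, not of the Gaussian-process or nearest-neighbor variants described in the abstract; the forward Euler propagators, the scalar test problem, and all function names are assumptions chosen for brevity.

```python
import math


def euler(f, t0, t1, y0, steps):
    """Integrate y' = f(t, y) from t0 to t1 with `steps` forward Euler steps."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y = y + h * f(t, y)
        t += h
    return y


def parareal(f, t0, t1, y0, n_intervals, n_iters, coarse_steps=1, fine_steps=100):
    """Classical Parareal: u_{i+1}^{k+1} = G(u_i^{k+1}) + F(u_i^k) - G(u_i^k)."""
    ts = [t0 + i * (t1 - t0) / n_intervals for i in range(n_intervals + 1)]

    def coarse(i, y):  # cheap propagator G on interval i
        return euler(f, ts[i], ts[i + 1], y, coarse_steps)

    def fine(i, y):  # accurate propagator F on interval i
        return euler(f, ts[i], ts[i + 1], y, fine_steps)

    # Initial guess: one sequential sweep with the coarse propagator.
    u = [y0]
    for i in range(n_intervals):
        u.append(coarse(i, u[i]))

    for _ in range(n_iters):
        # These evaluations are independent across i; in a real setting
        # the fine solves would run in parallel (here they run serially).
        f_old = [fine(i, u[i]) for i in range(n_intervals)]
        g_old = [coarse(i, u[i]) for i in range(n_intervals)]
        # Sequential correction sweep using only the cheap propagator.
        new = [y0]
        for i in range(n_intervals):
            new.append(coarse(i, new[i]) + f_old[i] - g_old[i])
        u = new
    return ts, u


if __name__ == "__main__":
    ts, u = parareal(lambda t, y: -y, 0.0, 2.0, 1.0, n_intervals=8, n_iters=3)
    print(u[-1], math.exp(-2.0))
```

After as many iterations as there are time intervals, the Parareal iterate coincides with the sequential fine solution up to rounding, so any speed-up relies on converging in far fewer iterations.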

[24] arXiv:2409.00901 (replaced) [Chinese pdf, pdf, html, other]
Title: On the optimal approximation of Sobolev and Besov functions using deep ReLU neural networks
Yunfei Yang
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Numerical Analysis (math.NA)

This paper studies how efficiently functions in the Sobolev spaces $\mathcal{W}^{s,q}([0,1]^d)$ and Besov spaces $\mathcal{B}^s_{q,r}([0,1]^d)$ can be approximated by deep ReLU neural networks with width $W$ and depth $L$, when the error is measured in the $L^p([0,1]^d)$ norm. This problem has been studied in several recent works, which obtained the approximation rate $\mathcal{O}((WL)^{-2s/d})$ up to logarithmic factors when $p=q=\infty$, and the rate $\mathcal{O}(L^{-2s/d})$ for networks with fixed width when the Sobolev embedding condition $1/q -1/p<s/d$ holds. We generalize these results by showing that the rate $\mathcal{O}((WL)^{-2s/d})$ indeed holds under the Sobolev embedding condition. It is known that this rate is optimal up to logarithmic factors. The key tool in our proof is a novel encoding of sparse vectors by deep ReLU neural networks with varied width and depth, which may be of independent interest.

[25] arXiv:2412.05144 (replaced) [Chinese pdf, pdf, html, other]
Title: $\epsilon$-rank and the Staircase Phenomenon: New Insights into Neural Network Training Dynamics
Jiang Yang, Yuxiang Zhao, Quanhui Zhu
Subjects: Machine Learning (cs.LG); Numerical Analysis (math.NA)

Understanding the training dynamics of deep neural networks (DNNs), particularly how they evolve low-dimensional features from high-dimensional data, remains a central challenge in deep learning theory. In this work, we introduce the concept of $\epsilon$-rank, a novel metric quantifying the effective features of the neuron functions in the terminal hidden layer. Through extensive experiments across diverse tasks, we observe a universal staircase phenomenon: during training with standard stochastic gradient descent methods, the decline of the loss function is accompanied by an increase in the $\epsilon$-rank and exhibits a staircase pattern. Theoretically, we rigorously prove a negative correlation between the lower bound of the loss and the $\epsilon$-rank, demonstrating that a high $\epsilon$-rank is essential for significant loss reduction. Moreover, numerical evidence shows that within the same deep neural network, the $\epsilon$-rank of each hidden layer is higher than that of the preceding hidden layer. Based on these observations, to eliminate the staircase phenomenon, we propose a novel pre-training strategy on the initial hidden layer that elevates the $\epsilon$-rank of the terminal hidden layer. Numerical experiments validate its effectiveness in reducing training time and improving accuracy across various tasks. The newly introduced $\epsilon$-rank is therefore a computable quantity that serves as an intrinsic effective metric for deep neural networks, providing a novel perspective for understanding the training dynamics of neural networks and offering a theoretical foundation for designing efficient training strategies in practical applications.
