Computer Science > Machine Learning

arXiv:2509.10011 (cs)
[Submitted on 12 Sep 2025 (v1), last revised 15 Sep 2025 (this version, v2)]

Title: Intrinsic Dimension Estimating Autoencoder (IDEA) Using CancelOut Layer and a Projected Loss

Authors: Antoine Oriou, Philipp Krah, Julian Koellermeier
Abstract: This paper introduces the Intrinsic Dimension Estimating Autoencoder (IDEA), which identifies the underlying intrinsic dimension of a wide range of datasets whose samples lie on either linear or nonlinear manifolds. Beyond estimating the intrinsic dimension, IDEA is also able to reconstruct the original dataset after projecting it onto the corresponding latent space, which is structured using re-weighted double CancelOut layers. Our key contribution is the introduction of the projected reconstruction loss term, guiding the training of the model by continuously assessing the reconstruction quality under the removal of an additional latent dimension. We first assess the performance of IDEA on a series of theoretical benchmarks to validate its robustness. These experiments allow us to test its reconstruction ability and compare its performance with state-of-the-art intrinsic dimension estimators. The benchmarks show good accuracy and high versatility of our approach. Subsequently, we apply our model to data generated from the numerical solution of a vertically resolved one-dimensional free-surface flow, following a pointwise discretization of the vertical velocity profile in the horizontal direction, vertical direction, and time. IDEA succeeds in estimating the dataset's intrinsic dimension and then reconstructs the original solution by working directly within the projection space identified by the network.
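To make the abstract's description more concrete, the following PyTorch sketch illustrates the general idea only: an autoencoder whose latent code passes through a sigmoid-gated, CancelOut-style layer, trained with an additional "projected" reconstruction term computed after zeroing out one more latent dimension (here, the one with the currently smallest gate weight). The layer sizes, the single-gate formulation, and the exact form of the projected term are assumptions made for illustration; they are not the paper's re-weighted double CancelOut layers or its precise loss.

import torch
import torch.nn as nn


class CancelOutGate(nn.Module):
    """Element-wise gate with learnable weights (CancelOut-style); illustrative only."""

    def __init__(self, dim: int):
        super().__init__()
        self.weight = nn.Parameter(torch.zeros(dim))  # learnable importance logits

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return z * torch.sigmoid(self.weight)  # soft per-dimension on/off


class GatedAutoencoder(nn.Module):
    def __init__(self, in_dim: int, latent_dim: int, hidden: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, latent_dim)
        )
        self.gate = CancelOutGate(latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(), nn.Linear(hidden, in_dim)
        )

    def forward(self, x: torch.Tensor):
        z = self.gate(self.encoder(x))
        return self.decoder(z), z


def projected_loss(model: GatedAutoencoder, x: torch.Tensor, z: torch.Tensor):
    """Reconstruction error after removing one additional latent dimension:
    here, the dimension whose gate weight is currently the smallest."""
    drop = torch.argmin(torch.sigmoid(model.gate.weight))
    z_proj = z.clone()
    z_proj[:, drop] = 0.0
    return nn.functional.mse_loss(model.decoder(z_proj), x)


# Toy training step on synthetic data (illustration only).
model = GatedAutoencoder(in_dim=10, latent_dim=10)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(256, 10)

x_hat, z = model(x)
loss = nn.functional.mse_loss(x_hat, x) + projected_loss(model, x, z)
opt.zero_grad()
loss.backward()
opt.step()

In such a setup, the intrinsic dimension would be read off from the number of latent dimensions whose gates remain effectively open after training; how the actual IDEA model ranks, re-weights, and prunes dimensions is described in the paper itself.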
Comments: Preprint with 12 pages and 12 figures
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Numerical Analysis (math.NA)
Cite as: arXiv:2509.10011 [cs.LG]
  (or arXiv:2509.10011v2 [cs.LG] for this version)
  https://doi.org/10.48550/arXiv.2509.10011
arXiv-issued DOI via DataCite

Submission history

From: Philipp Krah
[v1] Fri, 12 Sep 2025 07:11:05 UTC (347 KB)
[v2] Mon, 15 Sep 2025 08:02:19 UTC (347 KB)