The Exploratory Study on the Relationship Between the Failure of Distance Metrics in High-Dimensional Space and Emergent Phenomena

Liu, HongZheng; Tian, YiNuo; Wu, Zhiyue

Computer Science > Information Theory

arXiv:2504.08807 (cs)

[Submitted on 9 Apr 2025]


Authors: HongZheng Liu, YiNuo Tian, Zhiyue Wu


Abstract: This paper presents a unified framework, integrating information theory and statistical mechanics, to connect metric failure in high-dimensional data with emergence in complex systems. We propose the "Information Dilution Theorem," demonstrating that as dimensionality ($d$) increases, the mutual information efficiency between geometric metrics (e.g., Euclidean distance) and system states decays approximately as $O(1/d)$. This decay arises from the mismatch between linearly growing system entropy and sublinearly growing metric entropy, explaining the mechanism behind distance concentration. Building on this, we introduce information structural complexity ($C(S)$) based on the mutual information matrix spectrum and interaction encoding capacity ($C'$) derived from information bottleneck theory. The "Emergence Critical Theorem" states that when $C(S)$ exceeds $C'$, new global features inevitably emerge, satisfying a predefined mutual information threshold. This provides an operational criterion for self-organization and phase transitions. We discuss potential applications in physics, biology, and deep learning, suggesting directions such as MI-based manifold learning (UMAP+) and offering a quantitative foundation for analyzing emergence across disciplines.
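
The distance concentration that the abstract refers to can be observed numerically. The sketch below is an illustration only, not the authors' method: it does not compute the paper's mutual information efficiency, $C(S)$, or $C'$. It assumes points drawn uniformly from the unit cube $[0, 1]^d$ and measures the relative contrast $(d_{\max} - d_{\min}) / d_{\min}$ of Euclidean distances from one query point. The function name `relative_contrast` and its parameters are hypothetical, chosen for this example.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Relative contrast (d_max - d_min) / d_min of Euclidean distances.

    Assumption for illustration: the query point and all n_points sample
    points are drawn uniformly from the unit cube [0, 1]^dim.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        distances.append(math.dist(query, point))  # Euclidean distance
    d_min, d_max = min(distances), max(distances)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    # As dim grows, nearest and farthest neighbours become nearly
    # equidistant, so the printed contrast shrinks.
    for dim in (2, 10, 100, 1000):
        print(f"d={dim:5d}  relative contrast={relative_contrast(dim):.3f}")
```

A shrinking contrast means that the nearest and farthest points are almost equally far from the query, which is the sense in which a distance metric stops discriminating between system states in high dimensions.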

Subjects:	Information Theory (cs.IT); Statistical Mechanics (cond-mat.stat-mech); Adaptation and Self-Organizing Systems (nlin.AO)
Cite as:	arXiv:2504.08807 [cs.IT]
	(or arXiv:2504.08807v1 [cs.IT] for this version)
	https://doi.org/10.48550/arXiv.2504.08807

Submission history

From: Zhiyue Wu
[v1] Wed, 9 Apr 2025 02:19:58 UTC (65 KB)
