A non-ergodic framework for understanding emergent capabilities in Large Language Models

Marín, Javier

arXiv:2501.01638 (cs)

[Submitted on 3 Jan 2025 (v1), last revised 28 Feb 2025 (this version, v2)]

Authors: Javier Marín

Abstract: Large language models have emergent capabilities that appear unexpectedly at scale, but we need a theoretical framework to explain why and how they emerge. We prove that language models are non-ergodic systems and provide a mathematical framework based on Stuart Kauffman's theory of the adjacent possible (TAP) to explain capability emergence. Our resource-constrained TAP equation demonstrates how architectural, training, and contextual constraints interact to shape model capabilities through phase transitions in semantic space. We show through experiments with three different language models that capabilities emerge through discrete transitions guided by constraint interactions and path-dependent exploration. This framework provides a theoretical basis for understanding emergence in language models and informs the development of architectures that can steer capability emergence.

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2501.01638 [cs.CL]
	(or arXiv:2501.01638v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2501.01638

Submission history

From: Javier Marin
[v1] Fri, 3 Jan 2025 05:11:41 UTC (1,093 KB)
[v2] Fri, 28 Feb 2025 08:07:50 UTC (1,168 KB)
