Answer Convergence as a Signal for Early Stopping in Reasoning

Liu, Xin; Wang, Lu

arXiv:2506.02536 (cs)

[Submitted on 3 June 2025]

Authors: Xin Liu, Lu Wang

Abstract: Chain-of-thought (CoT) prompting enhances reasoning in large language models (LLMs) but often leads to verbose and redundant outputs, thus increasing inference cost. We hypothesize that many reasoning steps are unnecessary for producing correct answers. To investigate this, we start with a systematic study of the minimum reasoning a model needs to reach a stable decision. We find that on math reasoning tasks such as MATH, models typically converge to their final answers after 60% of the reasoning steps, suggesting substantial redundancy in the remaining content. Based on these insights, we propose three inference-time strategies to improve efficiency: (1) early stopping via answer consistency, (2) boosting the probability of generating end-of-reasoning signals, and (3) a supervised method that learns when to stop based on internal activations. Experiments across five benchmarks and five open-weights LLMs show that our methods significantly reduce token usage with little or no accuracy drop. In particular, on NaturalQuestions, Answer Consistency reduces tokens by over 40% while further improving accuracy. Our work underscores the importance of cost-effective reasoning methods that operate at inference time, offering practical benefits for real-world applications.
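
To illustrate the idea behind strategy (1), here is a minimal sketch of early stopping via answer consistency. It is not the authors' implementation. It assumes the reasoning has already been split into steps, and that a hypothetical callable `probe_answer` returns the model's current answer given a partial chain. The `patience` parameter (how many consecutive identical answers count as converged) is also an assumption made for this illustration.

```python
from typing import Callable, List, Optional


def early_stop_by_answer_consistency(
    steps: List[str],
    probe_answer: Callable[[List[str]], str],
    patience: int = 2,
) -> int:
    """Return how many reasoning steps are consumed before stopping.

    Stops as soon as the probed answer has been identical for `patience`
    consecutive steps; if that never happens, every step is consumed.
    `probe_answer` is a hypothetical hook, not an API from the paper.
    """
    previous: Optional[str] = None
    streak = 0
    for used in range(1, len(steps) + 1):
        answer = probe_answer(steps[:used])
        # Extend the streak if the answer is unchanged, otherwise restart it.
        streak = streak + 1 if answer == previous else 1
        previous = answer
        if streak >= patience:
            return used
    return len(steps)


# Toy usage: the probed answer after each step is looked up from a fixed list.
toy_answers = ["3", "5", "7", "7", "7", "7"]
toy_steps = [f"step {k}" for k in range(1, 7)]
steps_used = early_stop_by_answer_consistency(
    toy_steps, lambda prefix: toy_answers[len(prefix) - 1], patience=2
)
print(f"stopped after {steps_used} of {len(toy_steps)} steps")
```

In this toy run the answer first repeats at the fourth step, so the last two steps are skipped. A larger `patience` trades fewer saved tokens for more confidence that the answer has actually stabilized.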

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2506.02536 [cs.CL]
	(or arXiv:2506.02536v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2506.02536

Submission history

From: Xin Liu
[v1] Tue, 3 Jun 2025 07:20:54 UTC (732 KB)
