
Computer Science > Computation and Language

arXiv:2504.03234 (cs)
[Submitted on 4 Apr 2025 (v1), last revised 21 May 2025 (this version, v2)]

Title: Think When You Need: Self-Adaptive Chain-of-Thought Learning

Authors: Junjie Yang, Ke Lin, Xing Yu
Abstract: Chain of Thought (CoT) reasoning enhances language models' performance but often leads to inefficient "overthinking" on simple problems. We show that existing approaches that directly penalize reasoning length fail to account for varying problem complexity. Our approach instead constructs rewards through length and quality comparisons, guided by theoretical assumptions that jointly enhance solution correctness and conciseness. We further extend our method to fuzzy tasks where ground truth is unavailable. Experiments across multiple reasoning benchmarks demonstrate that our method maintains accuracy while generating significantly more concise explanations, effectively teaching models to "think when needed."
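The reward construction the abstract describes can be sketched in a few lines. This is a hypothetical illustration, not the authors' implementation: the function name `comparison_rewards` and the exact scoring scheme are assumptions. The idea it captures is that correctness dominates, and among correct responses a pairwise length comparison awards a concision bonus, so shorter correct reasoning earns a higher reward.

```python
def comparison_rewards(responses):
    """Assign a reward to each sampled response in a group.

    responses: list of (is_correct: bool, length: int) tuples.
    Incorrect responses get 0. Correct responses get a base reward of 1
    plus a concision bonus: the fraction of other correct responses
    that are at least as long. Shorter correct answers thus rank higher.
    """
    correct_lens = [length for ok, length in responses if ok]
    rewards = []
    for ok, length in responses:
        if not ok:
            rewards.append(0.0)
            continue
        if len(correct_lens) <= 1:
            # Only one correct response: no peers to compare against.
            rewards.append(1.0)
            continue
        # Count correct peers that are at least as long (excluding self).
        longer_or_equal = sum(l >= length for l in correct_lens) - 1
        bonus = longer_or_equal / (len(correct_lens) - 1)
        rewards.append(1.0 + bonus)
    return rewards
```

Under this toy scheme, a short correct answer outranks a long correct one, and both outrank any incorrect answer, which matches the stated goal of penalizing overthinking only when a concise correct alternative exists.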
Comments: Under review
Subjects: Computation and Language (cs.CL) ; Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as: arXiv:2504.03234 [cs.CL]
  (or arXiv:2504.03234v2 [cs.CL] for this version)
  https://doi.org/10.48550/arXiv.2504.03234
arXiv-issued DOI via DataCite

Submission history

From: Junjie Yang [view email]
[v1] Fri, 4 Apr 2025 07:34:01 UTC (2,726 KB)
[v2] Wed, 21 May 2025 15:26:54 UTC (11,895 KB)
