Input Reduction Enhanced LLM-based Program Repair

Yang, Boyang; Ren, Luyao; Yin, Xin; Ren, Jiadong; Tian, Haoye; Jin, Shunfu


arXiv:2507.15251v1 (cs)

[Submitted on 21 Jul 2025]


Authors: Boyang Yang, Luyao Ren, Xin Yin, Jiadong Ren, Haoye Tian, Shunfu Jin

Abstract: Large Language Models (LLMs) have shown great potential in Automated Program Repair (APR). Test inputs, being crucial for reasoning about the root cause of failures, are always included in the prompt for LLM-based APR. Unfortunately, LLMs struggle to retain key information in long prompts. When the test inputs in the prompt are long, this may trigger the "lost-in-the-middle" issue, compromising repair performance. To address this, we propose ReduceFix, an LLM-based APR approach with a built-in component that automatically reduces test inputs while retaining their failure-inducing behavior. ReduceFix prompts an LLM to generate a reducer that minimizes failure-inducing test inputs without human effort, and then feeds the reduced failure-inducing inputs to the LLM to guide patch generation. For targeted evaluation, we constructed LFTBench, the first long-input APR benchmark, with 200 real bugs from 20 programming tasks, each paired with a failure-inducing input; the median input size is 1 MB. On this benchmark, ReduceFix shrinks inputs by 89.1% on average and improves overall pass@10 by up to 53.8% relative to a prompt that includes the original test, and by 17.6% compared with omitting the test entirely. Adding the same reduction step to ChatRepair increases its fix rate by 21.3% without other changes. Ablation studies further highlight the impact of input length and compressed failure information on repair success. These results underscore that automatically reducing failing inputs is a practical and powerful complement to LLM-based APR, significantly improving its scalability and effectiveness.
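To make the idea of "reducing a test input while retaining its failure-inducing behavior" concrete, here is a minimal sketch of a generic, line-based reducer in the spirit of delta debugging. It is not the reducer that ReduceFix generates: the abstract says those reducers are produced by an LLM, and it does not describe their internals. The names `reduce_lines`, `still_fails`, `buggy_sum`, and `reference_sum` are hypothetical and exist only for this illustration.

```python
from typing import Callable, List


def reduce_lines(lines: List[str],
                 still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Greedily delete chunks of lines while the failure persists.

    A simplified delta-debugging-style loop (an assumption for illustration,
    not the paper's algorithm). `still_fails` is the oracle: it returns True
    when the candidate input still triggers the original failure.
    """
    if not still_fails(lines):
        raise ValueError("the original input must trigger the failure")

    current = list(lines)
    chunk = max(len(current) // 2, 1)
    while True:
        i = 0
        shrunk = False
        while i < len(current):
            # Try dropping the chunk that starts at position i.
            candidate = current[:i] + current[i + chunk:]
            if still_fails(candidate):
                current = candidate   # keep the smaller failing input
                shrunk = True
            else:
                i += chunk            # this chunk is needed; move on
        if chunk > 1:
            chunk //= 2               # retry with finer granularity
        elif not shrunk:
            return current            # no single line can be removed


# Hypothetical buggy program: it silently ignores negative numbers.
def buggy_sum(lines: List[str]) -> int:
    return sum(int(x) for x in lines if int(x) > 0)


def reference_sum(lines: List[str]) -> int:
    return sum(int(x) for x in lines)


# Oracle: the input "fails" when the buggy output differs from the reference.
def outputs_differ(lines: List[str]) -> bool:
    return buggy_sum(lines) != reference_sum(lines)


big_input = [str(n) for n in range(1, 1001)]
big_input.insert(700, "-7")           # the single failure-inducing line
reduced = reduce_lines(big_input, outputs_differ)
```

In this toy case the 1001-line input collapses to the single line `-7`, which is the kind of short, focused failing input that could then be placed in a repair prompt instead of the full original.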

Subjects:	Software Engineering (cs.SE)
Cite as:	arXiv:2507.15251 [cs.SE]
	(or arXiv:2507.15251v1 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.2507.15251

Submission history

From: Boyang Yang
[v1] Mon, 21 Jul 2025 05:26:32 UTC (557 KB)
