On the Feasibility of Deduplicating Compiler Bugs with Bisection

Zhou, Xintong; Xu, Zhenyang; Sun, Chengnian


arXiv:2506.23281 (cs)

[Submitted on 29 Jun 2025]


Authors: Xintong Zhou, Zhenyang Xu, Chengnian Sun

Abstract: Random testing has proven to be an effective technique for compiler validation. However, debugging the bugs it finds is challenging because many of the generated test programs are duplicates that expose the same compiler bug. Identifying these duplicates is a practical research problem known as bug deduplication. Prior approaches to compiler bug deduplication rely primarily on program analysis to extract bug-related features, which can incur substantial computational overhead and limit generalizability. This paper investigates the feasibility of using bisection, a standard debugging procedure largely overlooked in prior research on compiler bug deduplication, for this purpose. Our study shows that using bisection to locate failure-inducing commits provides a valuable deduplication criterion, albeit one that requires supplementary techniques for more accurate identification. Building on these results, we introduce BugLens, a novel deduplication method that primarily uses bisection, enhanced by the identification of bug-triggering optimizations to minimize false negatives. Empirical evaluations on four real-world datasets show that BugLens significantly outperforms the state-of-the-art analysis-based methods Tamer and D3, saving an average of 26.98% and 9.64% of human effort, respectively, to identify the same number of distinct bugs. Given its inherent simplicity and generalizability, bisection offers a highly practical solution for compiler bug deduplication in real-world applications.
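
The core idea of using the failure-inducing commit as a deduplication key can be illustrated with a minimal sketch. This is not the BugLens implementation: the function names (`first_bad_commit`, `group_by_culprit`), the `fails_on` predicate, and the toy commit history are all hypothetical. The sketch also assumes that the commit list is ordered oldest to newest, that the newest commit fails for every test program, and that a bug persists once introduced.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence


def first_bad_commit(commits: Sequence[str], fails: Callable[[str], bool]) -> str:
    """Binary-search for the earliest commit at which `fails` returns True.

    Assumption: commits are ordered oldest to newest, the last commit
    fails, and every commit after the first failing one also fails.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] fails
    while lo < hi:
        mid = (lo + hi) // 2
        if fails(commits[mid]):
            hi = mid       # the culprit is at mid or earlier
        else:
            lo = mid + 1   # the culprit is strictly after mid
    return commits[lo]


def group_by_culprit(
    programs: Sequence[str],
    commits: Sequence[str],
    fails_on: Callable[[str, str], bool],
) -> Dict[str, List[str]]:
    """Group test programs by the commit that bisection blames for each."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for program in programs:
        culprit = first_bad_commit(commits, lambda c, p=program: fails_on(c, p))
        groups[culprit].append(program)
    return dict(groups)


# Toy data (hypothetical): index of the commit that introduced each failure.
commits = ["c0", "c1", "c2", "c3", "c4"]
introduced_at = {"t1.c": 2, "t2.c": 2, "t3.c": 4}


def fails_on(commit: str, program: str) -> bool:
    # Stand-in for "build the compiler at `commit` and run it on `program`".
    return commits.index(commit) >= introduced_at[program]


groups = group_by_culprit(list(introduced_at), commits, fails_on)
print(groups)
```

Here `t1.c` and `t2.c` land in the same group because bisection blames the same commit for both, so they would be treated as likely duplicates. As the abstract notes, this criterion alone is not fully accurate, which is why BugLens supplements it with the identification of bug-triggering optimizations.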

Subjects:	Software Engineering (cs.SE); Programming Languages (cs.PL)
Cite as:	arXiv:2506.23281 [cs.SE]
	(or arXiv:2506.23281v1 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.2506.23281

Submission history

From: Xintong Zhou
[v1] Sun, 29 Jun 2025 15:12:57 UTC (415 KB)
