Computer Science > Software Engineering

arXiv:2507.23640v1 (cs)

[Submitted on 31 Jul 2025]

Title: An Empirical Study on the Amount of Changes Required for Merge Request Acceptance

Authors: Samah Kansab, Mohammed Sayagh, Francis Bordeleau, Ali Tizghadam

Abstract: Code review (CR) is essential to software development, helping ensure that new code is properly integrated. However, the CR process often involves significant effort, including code adjustments, responses to reviewers, and continued implementation. While past studies have examined CR delays and iteration counts, few have investigated the effort based on the volume of code changes required, especially in the context of GitLab Merge Requests (MRs), which remains underexplored. In this paper, we define and measure CR effort as the amount of code modified after submission, using a dataset of over 23,600 MRs from four GitLab projects. We find that up to 71% of MRs require adjustments after submission, and 28% of these involve changes to more than 200 lines of code. Surprisingly, this effort is not correlated with review time or the number of participants. To better understand and predict CR effort, we train an interpretable machine learning model using metrics across multiple dimensions: text features, code complexity, developer experience, review history, and branching. Our model achieves strong performance (AUC 0.84-0.88) and reveals that complexity, experience, and text features are key predictors. Historical project characteristics also influence current review effort. Our findings highlight the feasibility of using machine learning to explain and anticipate the effort needed to integrate code changes during review.
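As a rough illustration of the effort definition in the abstract (code modified after submission), the sketch below sums the lines touched by commits made after an MR was opened, then reports the share of MRs that needed adjustment and the share of those above 200 lines. The abstract does not say how modified code is counted, so treating effort as lines added plus lines deleted is an assumption, and the `Commit` and `MergeRequest` records and their field names are hypothetical, not the paper's data schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Commit:
    # Hypothetical record; field names are illustrative only.
    authored_at: datetime
    lines_added: int
    lines_deleted: int


@dataclass
class MergeRequest:
    # Hypothetical record; created_at marks the submission time.
    created_at: datetime
    commits: list[Commit]


def review_effort(mr: MergeRequest) -> int:
    """Lines added plus deleted in commits authored after submission.

    Counting added + deleted lines is an assumption of this sketch.
    """
    return sum(
        c.lines_added + c.lines_deleted
        for c in mr.commits
        if c.authored_at > mr.created_at
    )


def summarize(mrs: list[MergeRequest], threshold: int = 200) -> tuple[float, float]:
    """Return (share of MRs adjusted after submission,
    share of those adjusted MRs whose effort exceeds `threshold` lines)."""
    efforts = [review_effort(mr) for mr in mrs]
    adjusted = [e for e in efforts if e > 0]
    share_adjusted = len(adjusted) / len(efforts) if efforts else 0.0
    share_large = (
        sum(e > threshold for e in adjusted) / len(adjusted) if adjusted else 0.0
    )
    return share_adjusted, share_large
```

Under these assumptions, `summarize` applied to a project's MRs yields the two proportions that the abstract reports as up to 71% and 28%; a different counting rule, such as net changed lines, would give different values.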

Subjects:	Software Engineering (cs.SE)
Cite as:	arXiv:2507.23640 [cs.SE]
	(or arXiv:2507.23640v1 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.2507.23640

Submission history

From: Samah Kansab
[v1] Thu, 31 Jul 2025 15:18:46 UTC (1,062 KB)
