Computer Science > Software Engineering

arXiv:2402.05223v2 (cs)
[Submitted on 7 Feb 2024 (v1), revised 23 Sep 2024 (this version, v2), latest version 28 Sep 2024 (v3)]

Title: Taming Timeout Flakiness: An Empirical Study of SAP HANA

Authors: Alexander Berndt, Sebastian Baltes, Thomas Bach
Abstract: Regression testing aims to prevent code changes from breaking existing features. Flaky tests negatively affect regression testing because they result in test failures that are not necessarily caused by code changes, thus providing an ambiguous signal. Test timeouts are one contributing factor to such flaky test failures. With the goal of reducing test flakiness in SAP HANA, we empirically study the impact of test timeouts on flakiness in system tests. We evaluate different approaches to automatically adjust timeout values, assessing their suitability for reducing execution time costs and improving build turnaround times. We collect metadata on SAP HANA's test executions by repeatedly executing tests on the same code revision over a period of six months. We analyze the test flakiness rate, investigate the evolution of test timeout values, and evaluate different approaches for optimizing timeout values. The test flakiness rate ranges from 49% to 70%, depending on the number of repeated test executions. Test timeouts account for 70% of flaky test failures. Developers typically react to flaky timeouts by manually increasing timeout values or splitting long-running tests. However, manually adjusting timeout values is a tedious task. Our approach for timeout optimization reduces timeout-related flaky failures by 80% and reduces the overall median timeout value by 25%, i.e., blocked tests are identified faster. Test timeouts are a major contributing factor to flakiness in system tests. It is challenging for developers to effectively mitigate this problem manually. Our technique for optimizing timeout values reduces flaky failures while minimizing test costs. Practitioners working on large-scale industrial software systems can use our findings to increase the effectiveness of their system tests while reducing the burden on developers to manually maintain appropriate timeout values.
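
To make the idea of automatically adjusting timeout values concrete, the following is a minimal sketch of one generic heuristic: derive a timeout from a quantile of historical run durations and add a safety margin. This is an illustration under stated assumptions, not the technique evaluated in the paper; the function names, the `quantile` and `margin` parameters, and the sample durations are all hypothetical.

```python
import math


def suggest_timeout(durations_s, quantile=0.99, margin=1.5):
    """Suggest a timeout in seconds from historical run durations.

    Assumption: a nearest-rank quantile scaled by a safety margin.
    This is a generic heuristic, not the approach from the paper.
    """
    if not durations_s:
        raise ValueError("need at least one observed duration")
    ordered = sorted(durations_s)
    # Nearest-rank quantile: the smallest observed value such that at
    # least `quantile` of all runs finished at or below it.
    rank = max(1, math.ceil(quantile * len(ordered)))
    return ordered[rank - 1] * margin


def count_timeouts(durations_s, timeout_s):
    """Count the runs that would have exceeded the given timeout."""
    return sum(1 for d in durations_s if d > timeout_s)


# Hypothetical durations (seconds) of one test repeated on the same revision.
history = [10, 12, 11, 30, 13]
fixed = 20
tuned = suggest_timeout(history)
print(count_timeouts(history, fixed), count_timeouts(history, tuned))
```

A lower `quantile` or `margin` detects blocked tests sooner but risks more timeout-related flaky failures. That is the trade-off the abstract describes between reducing flaky failures and reducing the median timeout value.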
Comments: 12 pages, 9 figures, 3 tables, Proceedings of the 46th International Conference on Software Engineering: Software Engineering in Practice (ICSE SEIP 2024)
Subjects: Software Engineering (cs.SE)
Cite as: arXiv:2402.05223 [cs.SE]
  (or arXiv:2402.05223v2 [cs.SE] for this version)
  https://doi.org/10.48550/arXiv.2402.05223
arXiv-issued DOI via DataCite

Submission history

From: Sebastian Baltes
[v1] Wed, 7 Feb 2024 20:01:41 UTC (1,795 KB)
[v2] Mon, 23 Sep 2024 12:37:07 UTC (669 KB)
[v3] Sat, 28 Sep 2024 11:12:40 UTC (669 KB)