Trae Agent: An LLM-based Agent for Software Engineering with Test-time Scaling

Trae Research Team; Gao, Pengfei; Tian, Zhao; Meng, Xiangxin; Wang, Xinchen; Hu, Ruida; Xiao, Yuanan; Liu, Yizhou; Zhang, Zhao; Chen, Junjie; Gao, Cuiyun; Lin, Yun; Xiong, Yingfei; Peng, Chao; Liu, Xia

arXiv:2507.23370 (cs)

[Submitted on 31 July 2025]

Abstract: Software issue resolution is a critical challenge in software engineering and has garnered increasing attention in recent years. With the rapid advancement of large language models (LLMs), substantial progress has been made in addressing real-world software engineering tasks. Recent studies have introduced ensemble reasoning techniques to enhance the performance of LLM-based issue resolution. However, existing prompting-based methods still face limitations in effectively exploring large ensemble spaces and lack the capacity for repository-level understanding, both of which constrain their overall effectiveness. In this paper, we propose Trae Agent, the first agent-based ensemble reasoning approach for repository-level issue resolution. Trae Agent formulates our goal as an optimal solution search problem and addresses two key challenges, i.e., large ensemble spaces and repository-level understanding, through modular agents for generation, pruning, and selection. We conduct extensive experiments using three leading LLMs on the widely adopted SWE-bench benchmark, comparing Trae Agent against four state-of-the-art ensemble reasoning techniques. Experimental results demonstrate that Trae Agent consistently achieves superior performance, with an average improvement of 10.22% over all baselines in terms of Pass@1. Trae Agent has achieved first place on the SWE-bench Verified leaderboard, with a notable Pass@1 score of 75.20%. We are pleased to release Trae Agent as an open-source project to support the research community, with all resources available at https://github.com/bytedance/trae-agent.
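
The generation, pruning, and selection stages named in the abstract can be pictured with a toy pipeline. The sketch below is only an illustration under stated assumptions, not the Trae Agent implementation: the function names (`generate`, `prune`, `select`, `resolve`), the whitespace-based duplicate check, and the scoring callback are all hypothetical stand-ins for the LLM-driven agents the paper describes.

```python
from typing import Callable, List, Optional


def normalize(patch: str) -> str:
    """Collapse whitespace so trivially different patches compare equal (assumed heuristic)."""
    return " ".join(patch.split())


def generate(sample: Callable[[int], str], n: int) -> List[str]:
    """Generation: draw n candidate patches from a hypothetical sampler."""
    return [sample(i) for i in range(n)]


def prune(candidates: List[str], passes_check: Callable[[str], bool]) -> List[str]:
    """Pruning: drop duplicates and candidates that fail a caller-supplied check."""
    seen = set()
    kept = []
    for patch in candidates:
        key = normalize(patch)
        if key in seen or not passes_check(patch):
            continue
        seen.add(key)
        kept.append(patch)
    return kept


def select(candidates: List[str], score: Callable[[str], float]) -> Optional[str]:
    """Selection: return the highest-scoring survivor, or None if nothing survived."""
    return max(candidates, key=score, default=None)


def resolve(sample, passes_check, score, n: int = 4) -> Optional[str]:
    """Chain the three stages into a single search over the candidate set."""
    return select(prune(generate(sample, n), passes_check), score)
```

In this toy form, pruning shrinks the candidate set before the (presumably more expensive) selection step runs, which is one plausible reading of how the stages divide the search; the paper itself should be consulted for the actual design.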

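Comments:	Pengfei Gao and Zhao Tian contributed equally to this technical report
Subjects:	Software Engineering (cs.SE); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2507.23370 [cs.SE]
	(or arXiv:2507.23370v1 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.2507.23370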

Submission history

From: Chao Peng
[v1] Thu, 31 Jul 2025 09:37:22 UTC (5,385 KB)
