CodeAgents: A Token-Efficient Framework for Codified Multi-Agent Reasoning in LLMs

Yang, Bruce; He, Xinfeng; Gao, Huan; Cao, Yifan; Li, Xiaofan; Hsu, David

Computer Science > Artificial Intelligence

arXiv:2507.03254 (cs)

[Submitted on 4 Jul 2025]

Title: CodeAgents: A Token-Efficient Framework for Codified Multi-Agent Reasoning in LLMs

Authors: Bruce Yang, Xinfeng He, Huan Gao, Yifan Cao, Xiaofan Li, David Hsu

Abstract: Effective prompt design is essential for improving the planning capabilities of large language model (LLM)-driven agents. However, existing structured prompting strategies are typically limited to single-agent, plan-only settings, and often evaluate performance solely based on task accuracy - overlooking critical factors such as token efficiency, modularity, and scalability in multi-agent environments. To address these limitations, we introduce CodeAgents, a prompting framework that codifies multi-agent reasoning and enables structured, token-efficient planning in multi-agent systems. In CodeAgents, all components of agent interaction - Task, Plan, Feedback, system roles, and external tool invocations - are codified into modular pseudocode enriched with control structures (e.g., loops, conditionals), boolean logic, and typed variables. This design transforms loosely connected agent plans into cohesive, interpretable, and verifiable multi-agent reasoning programs. We evaluate the proposed framework across three diverse benchmarks - GAIA, HotpotQA, and VirtualHome - using a range of representative LLMs. Results show consistent improvements in planning performance, with absolute gains of 3-36 percentage points over natural language prompting baselines. On VirtualHome, our method achieves a new state-of-the-art success rate of 56%. In addition, our approach reduces input and output token usage by 55-87% and 41-70%, respectively, underscoring the importance of token-aware evaluation metrics in the development of scalable multi-agent LLM systems. The code and resources are available at: https://anonymous.4open.science/r/CodifyingAgent-5A86

Subjects:	Artificial Intelligence (cs.AI)
Cite as:	arXiv:2507.03254 [cs.AI]
	(or arXiv:2507.03254v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2507.03254

Submission history

From: Xinfeng He
[v1] Fri, 4 Jul 2025 02:20:19 UTC (645 KB)
