Towards a Playground to Democratize Experimentation and Benchmarking of AI Agents for Network Troubleshooting

Wang, Zhihao; Cornacchia, Alessandro; Galante, Franco; Centofanti, Carlo; Sacco, Alessio; Jiang, Dingde

arXiv:2507.01997 (cs)

[Submitted on 1 Jul 2025]

Authors: Zhihao Wang, Alessandro Cornacchia, Franco Galante, Carlo Centofanti, Alessio Sacco, Dingde Jiang

Abstract: Recent research has demonstrated the effectiveness of Artificial Intelligence (AI), and more specifically Large Language Models (LLMs), in supporting network configuration synthesis and automating network diagnosis tasks, among others. In this preliminary work, we restrict our focus to the application of AI agents to network troubleshooting and elaborate on the need for a standardized, reproducible, and open benchmarking platform on which AI agents can be built and evaluated with low operational effort.

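Comments:	Accepted at the ACM SIGCOMM 1st Workshop on Next-Generation Network Observability (NGNO)
Subjects:	Networking and Internet Architecture (cs.NI); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)
Cite as:	arXiv:2507.01997 [cs.NI]
	(or arXiv:2507.01997v1 [cs.NI] for this version)
	https://doi.org/10.48550/arXiv.2507.01997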

Submission history

From: Alessandro Cornacchia
[v1] Tue, 1 Jul 2025 08:46:37 UTC (581 KB)
