Computer Science > Information Retrieval

arXiv:2509.18661 (cs)
[Submitted on 23 Sep 2025]

Title: Agentic AutoSurvey: Let LLMs Survey LLMs

Authors: Yixin Liu, Yonghui Wu, Denghui Zhang, Lichao Sun
Abstract: The exponential growth of scientific literature poses unprecedented challenges for researchers attempting to synthesize knowledge across rapidly evolving fields. We present Agentic AutoSurvey, a multi-agent framework for automated survey generation that addresses fundamental limitations in existing approaches. Our system employs four specialized agents (Paper Search Specialist, Topic Mining & Clustering, Academic Survey Writer, and Quality Evaluator) working in concert to generate comprehensive literature surveys with superior synthesis quality. Through experiments on six representative LLM research topics from COLM 2024 categories, we demonstrate that our multi-agent approach achieves significant improvements over existing baselines, scoring 8.18/10 compared to AutoSurvey's 4.77/10. The multi-agent architecture processes 75–443 papers per topic (847 total across six topics) while targeting high citation coverage (often ≥80% on 75–100-paper sets; lower on very large sets such as RLHF) through specialized agent orchestration. Our 12-dimension evaluation captures organization, synthesis integration, and critical analysis beyond basic metrics. These findings demonstrate that multi-agent architectures represent a meaningful advancement for automated literature survey generation in rapidly evolving scientific domains.
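The abstract describes a pipeline of four specialized agents handing work to one another: search, cluster, write, evaluate. The sketch below is a minimal illustration of that orchestration pattern only; the class name, method signatures, and placeholder outputs are assumptions, not the paper's actual implementation (which presumably wraps LLM calls at each stage).

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a four-agent survey pipeline. Agent roles follow the
# abstract; every interface and return value here is illustrative, not from the paper.

@dataclass
class SurveyPipeline:
    log: list = field(default_factory=list)  # records which agent ran, in order

    def search(self, topic):
        # Paper Search Specialist: retrieve candidate papers for the topic.
        self.log.append("search")
        return [f"{topic}-paper-{i}" for i in range(3)]  # placeholder corpus

    def cluster(self, papers):
        # Topic Mining & Clustering: group the retrieved papers into themes.
        self.log.append("cluster")
        return {"theme-0": papers}  # placeholder single-theme clustering

    def write(self, clusters):
        # Academic Survey Writer: draft one survey section per theme.
        self.log.append("write")
        return {t: f"Section on {t} ({len(ps)} papers)" for t, ps in clusters.items()}

    def evaluate(self, draft):
        # Quality Evaluator: score the draft (placeholder fixed score).
        self.log.append("evaluate")
        return {"sections": draft, "score": 8.0}

    def run(self, topic):
        # Orchestrate the four agents in sequence on one topic.
        return self.evaluate(self.write(self.cluster(self.search(topic))))

result = SurveyPipeline().run("RLHF")
```

The point of the sketch is the fixed hand-off order; in a real system each method would be an LLM-backed agent and the evaluator's score could gate a revision loop.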
Comments: 29 pages, 7 figures
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Human-Computer Interaction (cs.HC)
Cite as: arXiv:2509.18661 [cs.IR]
  (or arXiv:2509.18661v1 [cs.IR] for this version)
  https://doi.org/10.48550/arXiv.2509.18661
arXiv-issued DOI via DataCite

Submission history

From: Yixin Liu
[v1] Tue, 23 Sep 2025 05:28:43 UTC (636 KB)