
Computer Science > Computation and Language

arXiv:2309.15217 (cs)
[Submitted on 26 Sep 2023 (v1), last revised 28 Apr 2025 (this version, v2)]

Title: Ragas: Automated Evaluation of Retrieval Augmented Generation

Authors: Shahul Es, Jithin James, Luis Espinosa-Anke, Steven Schockaert
Abstract: We introduce Ragas (Retrieval Augmented Generation Assessment), a framework for reference-free evaluation of Retrieval Augmented Generation (RAG) pipelines. RAG systems are composed of a retrieval module and an LLM-based generation module, and provide LLMs with knowledge from a reference textual database, which enables them to act as a natural language layer between a user and textual databases, reducing the risk of hallucinations. Evaluating RAG architectures is, however, challenging because there are several dimensions to consider: the ability of the retrieval system to identify relevant and focused context passages, the ability of the LLM to exploit such passages in a faithful way, and the quality of the generation itself. With Ragas, we put forward a suite of metrics which can be used to evaluate these different dimensions without having to rely on ground-truth human annotations. We posit that such a framework can crucially contribute to faster evaluation cycles of RAG architectures, which is especially important given the fast adoption of LLMs.
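
To make these dimensions concrete, here is a minimal usage sketch based on the open-source ragas Python package that accompanies the paper. The metric names and the evaluate() signature shown are assumptions tied to early releases of the library (they have changed across versions), and a judge LLM (e.g., an OpenAI API key) is assumed to be configured in the environment.

    from datasets import Dataset
    from ragas import evaluate
    # Metric names as exposed in early ragas releases (version-dependent).
    from ragas.metrics import answer_relevancy, context_relevancy, faithfulness

    # One RAG interaction: the user's question, the retrieved context passages,
    # and the generated answer. No ground-truth reference answer is required.
    data = {
        "question": ["Who introduced the Ragas framework?"],
        "contexts": [[
            "Ragas was introduced by Shahul Es, Jithin James, "
            "Luis Espinosa-Anke and Steven Schockaert."
        ]],
        "answer": ["Ragas was introduced by Es, James, Espinosa-Anke and Schockaert."],
    }

    # Each metric prompts a judge LLM to score one dimension in [0, 1]:
    # faithfulness (is the answer grounded in the contexts?), answer_relevancy
    # (does the answer address the question?), and context_relevancy (are the
    # retrieved passages focused on the question?).
    scores = evaluate(
        Dataset.from_dict(data),
        metrics=[faithfulness, answer_relevancy, context_relevancy],
    )
    print(scores)  # e.g. {'faithfulness': 1.0, 'answer_relevancy': 0.97, ...}

Each score is reference-free: the judge LLM compares only the question, the retrieved contexts, and the answer, so no human-annotated gold answer is needed.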
Comments: Reference-free (not tied to having ground truth available) evaluation framework for retrieval-augmented generation
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2309.15217 [cs.CL]
  (or arXiv:2309.15217v2 [cs.CL] for this version)
  https://doi.org/10.48550/arXiv.2309.15217
arXiv-issued DOI via DataCite

Submission history

From: Luis Espinosa-Anke
[v1] Tue, 26 Sep 2023 19:23:54 UTC (7,261 KB)
[v2] Mon, 28 Apr 2025 05:09:12 UTC (7,261 KB)