
Computer Science > Computation and Language

arXiv:2309.15217 (cs)
[Submitted on 26 Sep 2023 (v1), last revised 28 Apr 2025 (this version, v2)]

Title: Ragas: Automated Evaluation of Retrieval Augmented Generation

Authors: Shahul Es, Jithin James, Luis Espinosa-Anke, Steven Schockaert
Abstract: We introduce Ragas (Retrieval Augmented Generation Assessment), a framework for reference-free evaluation of Retrieval Augmented Generation (RAG) pipelines. RAG systems are composed of a retrieval module and an LLM-based generation module, and provide LLMs with knowledge from a reference textual database, which enables them to act as a natural language layer between a user and textual databases, reducing the risk of hallucinations. Evaluating RAG architectures is, however, challenging because there are several dimensions to consider: the ability of the retrieval system to identify relevant and focused context passages, the ability of the LLM to exploit such passages in a faithful way, and the quality of the generation itself. With Ragas, we put forward a suite of metrics which can be used to evaluate these different dimensions without having to rely on ground-truth human annotations. We posit that such a framework can crucially contribute to faster evaluation cycles of RAG architectures, which is especially important given the fast adoption of LLMs.
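
To make these dimensions concrete, here is a minimal usage sketch based on the open-source ragas Python package that accompanies the paper. The metric names and the evaluate() signature shown are assumptions tied to early releases of the library (they have changed across versions), and a judge LLM (e.g., an OpenAI API key) is assumed to be configured in the environment.

    from datasets import Dataset
    from ragas import evaluate
    # Metric names as exposed in early ragas releases (version-dependent).
    from ragas.metrics import answer_relevancy, context_relevancy, faithfulness

    # One RAG interaction: the user's question, the retrieved context passages,
    # and the generated answer. No ground-truth reference answer is required.
    data = {
        "question": ["Who introduced the Ragas framework?"],
        "contexts": [[
            "Ragas was introduced by Shahul Es, Jithin James, "
            "Luis Espinosa-Anke and Steven Schockaert."
        ]],
        "answer": ["Ragas was introduced by Es, James, Espinosa-Anke and Schockaert."],
    }

    # Each metric prompts a judge LLM to score one dimension in [0, 1]:
    # faithfulness (is the answer grounded in the contexts?), answer_relevancy
    # (does the answer address the question?), and context_relevancy (are the
    # retrieved passages focused on the question?).
    scores = evaluate(
        Dataset.from_dict(data),
        metrics=[faithfulness, answer_relevancy, context_relevancy],
    )
    print(scores)  # e.g. {'faithfulness': 1.0, 'answer_relevancy': 0.97, ...}

Each score is reference-free: the judge LLM compares only the question, the retrieved contexts, and the answer, so no human-annotated gold answer is needed.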
Comments: Reference-free (not tied to having ground truth available) evaluation framework for retrieval-augmented generation
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2309.15217 [cs.CL]
  (or arXiv:2309.15217v2 [cs.CL] for this version)
  https://doi.org/10.48550/arXiv.2309.15217
arXiv-issued DOI via DataCite

Submission history

From: Luis Espinosa-Anke
[v1] Tue, 26 Sep 2023 19:23:54 UTC (7,261 KB)
[v2] Mon, 28 Apr 2025 05:09:12 UTC (7,261 KB)