KG-EDAS: A Meta-Metric Framework for Evaluating Knowledge Graph Completion Models

Gul, Haji; Naim, Abul Ghani; Bhat, Ajaz Ahmad

arXiv:2508.15357 (cs)

[Submitted on 21 Aug 2025]

Title: KG-EDAS: A Meta-Metric Framework for Evaluating Knowledge Graph Completion Models

Authors: Haji Gul, Abul Ghani Naim, Ajaz Ahmad Bhat

Abstract: Knowledge Graphs (KGs) enable applications in various domains such as semantic search, recommendation systems, and natural language processing. KGs are often incomplete, missing entities and relations, an issue addressed by Knowledge Graph Completion (KGC) methods that predict missing elements. Evaluation metrics such as Mean Reciprocal Rank (MRR), Mean Rank (MR), and Hit@k are commonly used to assess the performance of such KGC models. A major challenge in evaluating KGC models, however, lies in comparing their performance across multiple datasets and metrics. A model may outperform others on one dataset but underperform on another, making it difficult to determine overall superiority. Moreover, even within a single dataset, different metrics such as MRR and Hit@1 can yield conflicting rankings, where one model excels in MRR while another performs better in Hit@1, further complicating model selection for downstream tasks. These inconsistencies hinder holistic comparisons and highlight the need for a unified meta-metric that integrates performance across all metrics and datasets to enable a more reliable and interpretable evaluation framework. To address this need, we propose KG Evaluation based on Distance from Average Solution (EDAS), a robust and interpretable meta-metric that synthesizes model performance across multiple datasets and diverse evaluation criteria into a single normalized score ($M_i \in [0,1]$). Unlike traditional metrics that focus on isolated aspects of performance, EDAS offers a global perspective that supports more informed model selection and promotes fairness in cross-dataset evaluation. Experimental results on benchmark datasets such as FB15k-237 and WN18RR demonstrate that EDAS effectively integrates multi-metric, multi-dataset performance into a unified ranking, offering a consistent, robust, and generalizable framework for evaluating KGC models.
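The abstract does not spell out how the score $M_i$ is computed. As a rough illustration only, the sketch below follows the generic EDAS procedure from multi-criteria decision-making (average solution, positive and negative distances from it, weighted sums, normalization, final average); the paper's exact formulation may differ. The function name `edas_scores`, the equal default weights, and all model names and metric values are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Optional, Sequence


def edas_scores(
    scores: Dict[str, Sequence[float]],
    higher_is_better: Sequence[bool],
    weights: Optional[Sequence[float]] = None,
) -> Dict[str, float]:
    """Generic EDAS appraisal score in [0, 1] for each model.

    scores maps a model name to one value per criterion (e.g. a metric on
    a dataset). Assumes every criterion has a strictly positive average.
    """
    names: List[str] = list(scores)
    m = len(higher_is_better)
    if weights is None:
        weights = [1.0 / m] * m  # assumption: equal weights

    # Step 1: average solution for each criterion.
    avg = [sum(scores[n][j] for n in names) / len(names) for j in range(m)]

    # Step 2: weighted positive (sp) and negative (sn) distances from average.
    sp: Dict[str, float] = {}
    sn: Dict[str, float] = {}
    for n in names:
        pos = neg = 0.0
        for j in range(m):
            diff = scores[n][j] - avg[j]
            if not higher_is_better[j]:
                diff = -diff  # for cost-type criteria (e.g. MR), lower is better
            pos += weights[j] * max(0.0, diff) / avg[j]
            neg += weights[j] * max(0.0, -diff) / avg[j]
        sp[n], sn[n] = pos, neg

    # Step 3: normalize both sums and average them into one score.
    max_sp = max(sp.values())
    max_sn = max(sn.values())
    result: Dict[str, float] = {}
    for n in names:
        nsp = sp[n] / max_sp if max_sp > 0 else 0.0
        nsn = 1.0 - sn[n] / max_sn if max_sn > 0 else 1.0
        result[n] = 0.5 * (nsp + nsn)
    return result


# Hypothetical values for three made-up models; criteria are [MRR, Hit@1, MR].
example = {
    "model_a": [0.35, 0.25, 200.0],
    "model_b": [0.33, 0.27, 150.0],
    "model_c": [0.30, 0.20, 300.0],
}
result = edas_scores(example, higher_is_better=[True, True, False])
ranking = sorted(result, key=result.get, reverse=True)
print(ranking)
```

In this made-up example, `model_a` leads on MRR while `model_b` leads on Hit@1 and MR, which is the kind of conflict the abstract describes; the single score resolves it into one ordering. To cover several datasets, one option under the same assumptions is to treat each (dataset, metric) pair as its own criterion column.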

Subjects:	Computation and Language (cs.CL); Performance (cs.PF)
Cite as:	arXiv:2508.15357 [cs.CL]
	(or arXiv:2508.15357v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2508.15357

Submission history

From: Haji Gul
[v1] Thu, 21 Aug 2025 08:37:35 UTC (74 KB)
