Learning when to rank: Estimation of partial rankings from sparse, noisy comparisons

Morel-Balbi, Sebastian; Kirkley, Alec

arXiv:2501.02505 (physics)

[Submitted on 5 Jan 2025 (v1), last revised 18 Apr 2025 (this version, v2)]

Title: Learning when to rank: Estimation of partial rankings from sparse, noisy comparisons

Authors: Sebastian Morel-Balbi, Alec Kirkley

Abstract: A common task arising in various domains is that of ranking items based on the outcomes of pairwise comparisons, from ranking players and teams in sports to ranking products or brands in marketing studies and recommendation systems. Statistical inference-based methods such as the Bradley-Terry model, which extract rankings based on an underlying generative model of the comparison outcomes, have emerged as flexible and powerful tools to tackle the task of ranking in empirical data. In situations with limited and/or noisy comparisons, it is often challenging to confidently distinguish the performance of different items based on the evidence available in the data. However, existing inference-based ranking methods overwhelmingly choose to assign each item to a unique rank or score, suggesting a meaningful distinction when there is none. Here, we address this problem by developing a principled Bayesian methodology for learning partial rankings -- rankings with ties -- that distinguishes among the ranks of different items only when there is sufficient evidence available in the data. Our framework is adaptable to any statistical ranking method in which the outcomes of pairwise observations depend on the ranks or scores of the items being compared. We develop a fast agglomerative algorithm to perform Maximum A Posteriori (MAP) inference of partial rankings under our framework and examine the performance of our method on a variety of real and synthetic network datasets, finding that it frequently gives a more parsimonious summary of the data than traditional ranking, particularly when observations are sparse.
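As a rough illustration of the setting the abstract describes, the sketch below evaluates the standard Bradley-Terry likelihood for a small set of comparisons under two score assignments: one that gives every item a distinct score, and one in which two items share a score (a tie, i.e. a partial ranking). It is not the authors' Bayesian method or their agglomerative algorithm; the function names, the toy win counts, and the score values are all assumptions made up for this example.

```python
import math


def bt_win_prob(score_i, score_j):
    """Bradley-Terry probability that item i beats item j.

    Scores are real-valued; the probability depends only on their difference.
    """
    return 1.0 / (1.0 + math.exp(-(score_i - score_j)))


def bt_log_likelihood(scores, wins):
    """Log-likelihood of the observed outcomes under given scores.

    scores: dict mapping item -> score
    wins:   dict mapping (winner, loser) -> number of such outcomes
    """
    return sum(
        count * math.log(bt_win_prob(scores[w], scores[l]))
        for (w, l), count in wins.items()
    )


# Hypothetical toy data: A and B split their games, both tend to beat C.
wins = {("A", "B"): 1, ("B", "A"): 1, ("A", "C"): 3, ("B", "C"): 2}

# A full ranking gives each item its own score (values are illustrative only).
full_ranking = {"A": 1.0, "B": 0.9, "C": -1.0}

# A partial ranking ties A and B by giving them a shared score.
partial_ranking = {"A": 0.95, "B": 0.95, "C": -1.0}

ll_full = bt_log_likelihood(full_ranking, wins)
ll_partial = bt_log_likelihood(partial_ranking, wins)
print(f"full ranking log-likelihood:    {ll_full:.4f}")
print(f"partial ranking log-likelihood: {ll_partial:.4f}")
```

With sparse data like this, the two log-likelihoods come out very close, which is the kind of situation where distinguishing A from B is hard to justify. Deciding when to merge ranks in a principled way requires a prior over partial rankings, which this sketch deliberately omits.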

Comments:	20 pages, 8 figures, 1 table
Subjects:	Physics and Society (physics.soc-ph); Social and Information Networks (cs.SI); Machine Learning (stat.ML)
Cite as:	arXiv:2501.02505 [physics.soc-ph]
	(or arXiv:2501.02505v2 [physics.soc-ph] for this version)
	https://doi.org/10.48550/arXiv.2501.02505

Submission history

From: Sebastian Morel-Balbi
[v1] Sun, 5 Jan 2025 11:04:30 UTC (10,875 KB)
[v2] Fri, 18 Apr 2025 09:26:21 UTC (11,352 KB)
