SciNUP: Natural Language User Interest Profiles for Scientific Literature Recommendation

Arustashvili, Mariam; Balog, Krisztian

arXiv:2510.21352 (cs)

[Submitted on 24 Oct 2025]

Title: SciNUP: Natural Language User Interest Profiles for Scientific Literature Recommendation

Authors: Mariam Arustashvili, Krisztian Balog

Abstract: The use of natural language (NL) user profiles in recommender systems offers greater transparency and user control compared to traditional representations. However, there is a scarcity of large-scale, publicly available test collections for evaluating NL profile-based recommendation. To address this gap, we introduce SciNUP, a novel synthetic dataset for scholarly recommendation that leverages authors' publication histories to generate NL profiles and corresponding ground truth items. We use this dataset to compare baseline methods, ranging from sparse and dense retrieval approaches to state-of-the-art LLM-based rerankers. Our results show that while baseline methods achieve comparable performance, they often retrieve different items, indicating complementary behaviors. At the same time, considerable headroom for improvement remains, highlighting the need for effective NL-based recommendation approaches. The SciNUP dataset thus serves as a valuable resource for fostering future research and development in this area.

Subjects:	Information Retrieval (cs.IR)
Cite as:	arXiv:2510.21352 [cs.IR]
	(or arXiv:2510.21352v1 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2510.21352

Submission history

From: Mariam Arustashvili
[v1] Fri, 24 Oct 2025 11:28:08 UTC (197 KB)
