Reinforcement Learning for Hanabi

Cohen, Nina; France, Kordel K.

arXiv:2506.00458 (cs)

[Submitted on 31 May 2025]

Title: Reinforcement Learning for Hanabi

Authors: Nina Cohen, Kordel K. France

Abstract: Hanabi has become a popular game for reinforcement learning (RL) research because it is one of the few cooperative card games in which players have incomplete knowledge of the environment, which poses a challenge for an RL agent. We explored different tabular and deep reinforcement learning algorithms to see which performed best both against an agent of the same type and against other types of agents. We established that certain agents played their highest-scoring games against specific agents, while others achieved higher average scores by adapting to the opposing agent's behavior. We attempted to quantify the conditions under which each algorithm provides the best advantage and identified the most interesting interactions between agents of different types. In the end, we found that temporal difference (TD) algorithms had better overall performance and balancing of play types compared to tabular agents. Specifically, tabular Expected SARSA and deep Q-Learning agents showed the best performance.
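For readers unfamiliar with Expected SARSA, one of the two best-performing methods named in the abstract, the following is a minimal sketch of a single tabular update under an epsilon-greedy policy. It is not the authors' implementation: the function names, the list-of-lists Q-table, the integer state indices, and the default hyperparameters are all assumptions made for illustration, and the abstract does not say how Hanabi observations are encoded into states.

```python
def epsilon_greedy_probs(q_values, epsilon):
    """Return the action probabilities of an epsilon-greedy policy."""
    n = len(q_values)
    greedy = max(range(n), key=lambda a: q_values[a])
    probs = [epsilon / n] * n          # exploration mass spread uniformly
    probs[greedy] += 1.0 - epsilon     # remaining mass on the greedy action
    return probs


def expected_sarsa_update(Q, state, action, reward, next_state, done,
                          alpha=0.1, gamma=0.99, epsilon=0.1):
    """Apply one Expected SARSA update to Q in place; return the TD error.

    Q is a hypothetical table indexed as Q[state][action].
    """
    if done:
        # No bootstrapping from a terminal state.
        target = reward
    else:
        # Expectation of Q over the policy's action distribution in next_state.
        probs = epsilon_greedy_probs(Q[next_state], epsilon)
        expected_q = sum(p * q for p, q in zip(probs, Q[next_state]))
        target = reward + gamma * expected_q
    td_error = target - Q[state][action]
    Q[state][action] += alpha * td_error
    return td_error
```

The only difference from Q-learning in this sketch is the bootstrap term: Q-learning would use the maximum of `Q[next_state]`, whereas Expected SARSA averages over the policy's action probabilities.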

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Multiagent Systems (cs.MA)
Cite as:	arXiv:2506.00458 [cs.LG]
	(or arXiv:2506.00458v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2506.00458

Submission history

From: Kordel France
[v1] Sat, 31 May 2025 08:24:16 UTC (1,000 KB)
