Computer Science > Multiagent Systems

arXiv:2504.12714 (cs)
[Submitted on 17 Apr 2025 (v1), last revised 20 Apr 2025 (this version, v2)]

Title: Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination


Authors: Kunal Jha, Wilka Carvalho, Yancheng Liang, Simon S. Du, Max Kleiman-Weiner, Natasha Jaques
Abstract: Zero-shot coordination (ZSC), the ability to adapt to a new partner in a cooperative task, is a critical component of human-compatible AI. While prior work has focused on training agents to cooperate on a single task, these specialized models do not generalize to new tasks, even if they are highly similar. Here, we study how reinforcement learning on a distribution of environments with a single partner enables learning general cooperative skills that support ZSC with many new partners on many new problems. We introduce two Jax-based, procedural generators that create billions of solvable coordination challenges. We develop a new paradigm called Cross-Environment Cooperation (CEC), and show that it outperforms competitive baselines quantitatively and qualitatively when collaborating with real people. Our findings suggest that learning to collaborate across many unique scenarios encourages agents to develop general norms, which prove effective for collaboration with different partners. Together, our results suggest a new route toward designing generalist cooperative agents capable of interacting with humans without requiring human data.
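The abstract describes the CEC recipe only at a high level: self-play with a single partner across a large distribution of procedurally generated environments. As a purely illustrative sketch, and not the authors' implementation, the toy JAX loop below trains one shared policy by sampling a fresh batch of randomly generated coordination problems at every update; all names (make_env, policy_logits, the linear policy, and the matching game itself) are hypothetical stand-ins, not anything from the paper's codebase.

# Illustrative sketch only, in the spirit of Cross-Environment Cooperation (CEC):
# one partner, many procedurally generated environments, a shared self-play policy.
import jax
import jax.numpy as jnp

N_FEATURES, N_ACTIONS, LR = 8, 4, 0.1

def make_env(key):
    # Stand-in "procedural generator": each layout is a random feature vector
    # whose first N_ACTIONS entries implicitly define the correct meeting point.
    feats = jax.random.normal(key, (N_FEATURES,))
    target = jnp.argmax(feats[:N_ACTIONS])
    return feats, target

def policy_logits(params, feats):
    return feats @ params  # linear policy shared by both co-players

def episode_loss(params, key):
    env_key, a_key, b_key = jax.random.split(key, 3)
    feats, target = make_env(env_key)
    logits = policy_logits(params, feats)
    # Both agents sample independently from the same self-play policy.
    act_a = jax.random.categorical(a_key, logits)
    act_b = jax.random.categorical(b_key, logits)
    # Cooperative reward: both must pick this layout's coordination point.
    reward = ((act_a == target) & (act_b == target)).astype(jnp.float32)
    logp = jax.nn.log_softmax(logits)
    # REINFORCE-style surrogate: reinforce both agents' chosen actions.
    return -reward * (logp[act_a] + logp[act_b])

@jax.jit
def update(params, key):
    keys = jax.random.split(key, 256)  # a fresh batch of generated environments
    grads = jax.vmap(jax.grad(episode_loss), in_axes=(None, 0))(params, keys)
    return params - LR * grads.mean(axis=0)

params = jnp.zeros((N_FEATURES, N_ACTIONS))
key = jax.random.PRNGKey(0)
for step in range(500):
    key, sub = jax.random.split(key)
    params = update(params, sub)

The point of the sketch is the structure rather than the toy game: because the environment is resampled at every batch, the policy can only improve by learning rules that hold across layouts, instead of memorizing a single fixed task.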
Comments: Accepted to CogSci 2025; in review for ICML 2025
Subjects: Multiagent Systems (cs.MA); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as: arXiv:2504.12714 [cs.MA]
  (or arXiv:2504.12714v2 [cs.MA] for this version)
  https://doi.org/10.48550/arXiv.2504.12714
arXiv-issued DOI via DataCite

Submission history

From: Kunal Jha
[v1] Thu, 17 Apr 2025 07:41:25 UTC (1,796 KB)
[v2] Sun, 20 Apr 2025 20:10:41 UTC (1,504 KB)