Scalable Policy Maximization Under Network Interference

Gleich, Aidan; Laber, Eric; Volfovsky, Alexander

arXiv:2505.18118 (stat)

[Submitted on 23 May 2025]

Abstract: Many interventions, such as vaccines in clinical trials or coupons in online marketplaces, must be assigned sequentially without full knowledge of their effects. Multi-armed bandit algorithms have proven successful in such settings. However, standard independence assumptions fail when the treatment status of one individual impacts the outcomes of others, a phenomenon known as interference. We study optimal-policy learning under interference on a dynamic network. Existing approaches to this problem require repeated observations of the same fixed network and struggle to scale in sample size beyond as few as fifteen connected units; both restrictions limit applications. We show that under common assumptions on the structure of interference, rewards become linear. This enables us to develop a scalable Thompson sampling algorithm that maximizes policy impact when a new $n$-node network is observed each round. We prove a Bayesian regret bound that is sublinear in $n$ and the number of rounds. Simulation experiments show that our algorithm learns quickly and outperforms existing methods. The results close a key scalability gap between causal inference methods for interference and practical bandit algorithms, enabling policy optimization in large-scale networked systems.
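To make the idea of Thompson sampling with linear rewards on a fresh network each round more concrete, here is a minimal sketch in Python. It is not the authors' algorithm. Everything specific in it is an illustrative assumption: the exposure features (intercept, own treatment, fraction of treated neighbors), the Erdős–Rényi graphs, the fixed treatment budget, the Gaussian prior and noise model, the search over randomly drawn candidate assignments, and all names (`node_features`, `LinearThompsonSampler`, `run_simulation`).

```python
import numpy as np


def node_features(adj, z):
    """Per-node features: intercept, own treatment, fraction of treated neighbors.

    The exposure mapping is an assumption made for this sketch only.
    """
    deg = adj.sum(axis=1)
    frac = np.divide(adj @ z, deg, out=np.zeros(len(z)), where=deg > 0)
    return np.column_stack([np.ones(len(z)), z, frac])


class LinearThompsonSampler:
    """Bayesian linear regression with a Gaussian prior and known noise variance."""

    def __init__(self, dim, prior_var=1.0, noise_var=1.0, seed=0):
        self.precision = np.eye(dim) / prior_var  # posterior precision matrix
        self.b = np.zeros(dim)                    # precision-weighted mean
        self.noise_var = noise_var
        self.rng = np.random.default_rng(seed)

    def posterior_mean(self):
        return np.linalg.solve(self.precision, self.b)

    def sample_theta(self):
        # Draw one parameter vector from the Gaussian posterior.
        cov = np.linalg.inv(self.precision)
        chol = np.linalg.cholesky(cov)
        return cov @ self.b + chol @ self.rng.standard_normal(len(self.b))

    def update(self, X, y):
        # Conjugate update with one row per observed node outcome.
        self.precision += X.T @ X / self.noise_var
        self.b += X.T @ y / self.noise_var


def run_simulation(rounds=200, n=30, budget=10, n_candidates=20, seed=0):
    rng = np.random.default_rng(seed)
    theta_true = np.array([0.0, 1.0, 0.5])  # hypothetical ground truth
    agent = LinearThompsonSampler(dim=3, seed=seed + 1)
    for _ in range(rounds):
        # A new n-node undirected random graph is observed each round.
        upper = np.triu(rng.random((n, n)) < 0.1, k=1)
        adj = (upper | upper.T).astype(float)
        theta = agent.sample_theta()
        # Choose the candidate assignment with the highest sampled total reward.
        best_z, best_val = None, -np.inf
        for _ in range(n_candidates):
            z = np.zeros(n)
            z[rng.choice(n, size=budget, replace=False)] = 1.0
            val = node_features(adj, z).sum(axis=0) @ theta
            if val > best_val:
                best_z, best_val = z, val
        # Observe noisy node-level outcomes and update the posterior.
        X = node_features(adj, best_z)
        y = X @ theta_true + rng.normal(0.0, 1.0, size=n)
        agent.update(X, y)
    return agent.posterior_mean(), theta_true


if __name__ == "__main__":
    estimate, truth = run_simulation()
    print("posterior mean:", np.round(estimate, 2), "truth:", truth)
```

Because the total reward in this sketch is a sum of node-level linear terms, scoring an assignment only needs the summed feature vector, so the per-round cost grows with the number of candidates and nodes rather than with the number of possible assignments. The random candidate search is a stand-in chosen for brevity, not a claim about how the paper optimizes over assignments.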

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:2505.18118 [stat.ML]
	(or arXiv:2505.18118v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2505.18118

Submission history

From: Aidan Gleich
[v1] Fri, 23 May 2025 17:19:12 UTC (163 KB)
