Refereed Learning

Canetti, Ran; Linder, Ephraim; Wagaman, Connor

arXiv:2510.05440 (stat)

[Submitted on 6 Oct 2025]

Title: Refereed Learning

Authors: Ran Canetti, Ephraim Linder, Connor Wagaman

Abstract: We initiate an investigation of learning tasks in a setting where the learner is given access to two competing provers, only one of which is honest. Specifically, we consider the power of such learners in assessing purported properties of opaque models. Following prior work that considers the power of competing provers in different settings, we call this setting refereed learning. After formulating a general definition of refereed learning tasks, we show refereed learning protocols that obtain a level of accuracy that far exceeds what is obtainable at comparable cost without provers, or even with a single prover. We concentrate on the task of choosing the better one out of two black-box models, with respect to some ground truth. While we consider a range of parameters, perhaps our most notable result is in the high-precision range: For all $\varepsilon>0$ and ambient dimension $d$, our learner makes only one query to the ground truth function, communicates only $(1+\frac{1}{\varepsilon^2})\cdot\text{poly}(d)$ bits with the provers, and outputs a model whose loss is within a multiplicative factor of $(1+\varepsilon)$ of the best model's loss. Obtaining comparable loss with a single prover would require the learner to access the ground truth at almost all of the points in the domain. To obtain this bound, we develop a technique that allows the learner to sample, using the provers, from a distribution that is not efficiently samplable to begin with. We find this technique to be of independent interest. We also present lower bounds that demonstrate the optimality of our protocols in a number of respects, including prover complexity, number of samples, and need for query access.

Subjects:	Machine Learning (stat.ML); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Cite as:	arXiv:2510.05440 [stat.ML]
	(or arXiv:2510.05440v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2510.05440

Submission history

From: Ephraim Linder
[v1] Mon, 6 Oct 2025 23:07:31 UTC (46 KB)
