Multiclass Classification, Information, Divergence, and Surrogate Risk

Duchi, John C.; Khosravi, Khashayar; Ruan, Feng

arXiv:1603.00126 (math)

[Submitted on 1 Mar 2016 (v1), last revised 10 Sep 2017 (this version, v2)]

Abstract: We provide a unifying view of statistical information measures, multi-way Bayesian hypothesis testing, loss functions for multi-class classification problems, and multi-distribution $f$-divergences, elaborating equivalence results between all of these objects, and extending existing results for binary outcome spaces to more general ones. We consider a generalization of $f$-divergences to multiple distributions, and we provide a constructive equivalence between divergences, statistical information (in the sense of DeGroot), and losses for multiclass classification. A major application of our results is in multi-class classification problems in which we must both infer a discriminant function $\gamma$---for making predictions on a label $Y$ from datum $X$---and a data representation (or, in the setting of a hypothesis testing problem, an experimental design), represented as a quantizer $\mathsf{q}$ from a family of possible quantizers $\mathsf{Q}$. In this setting, we characterize the equivalence between loss functions, meaning that optimizing either of two losses yields an optimal discriminant and quantizer $\mathsf{q}$, complementing and extending earlier results of Nguyen et al. to the multiclass case. Our results provide a more substantial basis than standard classification calibration results for comparing different losses: we describe the convex losses that are consistent for jointly choosing a data representation and minimizing the (weighted) probability of error in multiclass classification problems.
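As a small numerical illustration of the kind of equivalence the abstract describes, the sketch below checks one special case on a finite sample space. It is not taken from the paper. It assumes the commonly used form of a multi-distribution $f$-divergence, $D_f(P_1, \ldots, P_{k-1} \| P_k) = \sum_x p_k(x)\, f\big(p_1(x)/p_k(x), \ldots, p_{k-1}(x)/p_k(x)\big)$ with $f$ convex and $f(1, \ldots, 1) = 0$, and it takes statistical information for the 0-1 loss to be the prior Bayes risk minus the posterior Bayes risk. The function names, the prior, and the three distributions are all hypothetical.

```python
from typing import Callable, List, Sequence


def multi_f_divergence(
    f: Callable[[List[float]], float],
    dists: Sequence[Sequence[float]],
) -> float:
    """Assumed form of D_f(P_1, ..., P_{k-1} || P_k) on a finite set.

    `dists` lists P_1, ..., P_k as probability vectors over the same
    outcomes; the last one, P_k, must be strictly positive everywhere.
    """
    *others, base = dists
    total = 0.0
    for x, pk in enumerate(base):
        ratios = [p[x] / pk for p in others]  # likelihood ratios dP_i/dP_k
        total += pk * f(ratios)
    return total


def zero_one_information(
    prior: Sequence[float],
    dists: Sequence[Sequence[float]],
) -> float:
    """Prior Bayes risk minus posterior Bayes risk under the 0-1 loss."""
    prior_risk = 1.0 - max(prior)
    n_outcomes = len(dists[0])
    posterior_risk = 1.0 - sum(
        max(pi * p[x] for pi, p in zip(prior, dists))
        for x in range(n_outcomes)
    )
    return prior_risk - posterior_risk


# Hypothetical three-class problem on three outcomes.
prior = [0.5, 0.3, 0.2]
dists = [
    [0.7, 0.2, 0.1],  # P_1
    [0.1, 0.8, 0.1],  # P_2
    [0.2, 0.2, 0.6],  # P_3, used as the base measure
]


def f_zero_one(t: List[float]) -> float:
    """Convex f with f(1, ..., 1) = 0 that matches the 0-1 loss."""
    weighted = [pi * ti for pi, ti in zip(prior, t)] + [prior[-1]]
    return max(weighted) - max(prior)


divergence = multi_f_divergence(f_zero_one, dists)
information = zero_one_information(prior, dists)
print(divergence, information)
```

Under these assumptions the two quantities agree (both are 0.21 for the numbers above, up to floating-point rounding), because multiplying $f$ by $p_k(x)$ and summing over $x$ recovers $\sum_x \max_i \pi_i p_i(x) - \max_i \pi_i$.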

Subjects:	Statistics Theory (math.ST); Information Theory (cs.IT)
Cite as:	arXiv:1603.00126 [math.ST]
	(or arXiv:1603.00126v2 [math.ST] for this version)
	https://doi.org/10.48550/arXiv.1603.00126

Submission history

From: Khashayar Khosravi
[v1] Tue, 1 Mar 2016 03:28:27 UTC (74 KB)
[v2] Sun, 10 Sep 2017 20:27:00 UTC (114 KB)
