Sparse mean localization by information theory

Diaz, Emiliano


arXiv:1704.00575 (stat)

[Submitted on 3 Apr 2017]

Title: Sparse mean localization by information theory

Authors: Emiliano Diaz

Abstract: Sparse feature selection is necessary when we fit statistical models with access to a large group of features, do not know which ones are relevant, and assume that most are not. Alternatively, when the number of features exceeds the number of available data points, the model becomes overparametrized, and the sparse feature selection task involves selecting the most informative variables for the model. When the model is a simple location model and the number of relevant features does not grow with the total number of features, sparse feature selection corresponds to sparse mean estimation. We deal with a simplified mean estimation problem consisting of an additive model with Gaussian noise and a mean that lies in a restricted, finite hypothesis space. This restriction turns the mean estimation problem into a selection problem of a combinatorial nature. Although the hypothesis space is finite, its size is exponential in the dimension of the mean. In limited-data settings, and when the size of the hypothesis space depends on the amount of data or on the dimension of the data, choosing an approximation set of hypotheses is a desirable approach. Choosing a set of hypotheses instead of a single one implies replacing the bias-variance trade-off with a resolution-stability trade-off. Generalization capacity provides a resolution selection criterion based on allowing the learning algorithm to communicate the largest amount of information in the data to the learner without error. In this work, the theory of approximation set coding and generalization capacity is explored in order to understand this approach. We then apply the generalization capacity criterion to the simplified sparse mean estimation problem and detail an importance sampling algorithm that addresses both the difficulty posed by large hypothesis spaces and the slow convergence of uniform sampling algorithms.
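To make the setting concrete, the following is a minimal sketch of a toy version of the simplified problem: a mean with k nonzero coordinates observed under additive Gaussian noise, a finite hypothesis space of all k-element supports, and a resolution parameter that controls how many hypotheses are kept as an approximation set. Everything specific here is an assumption made for illustration and is not taken from the paper: the fixed signal level `amplitude`, the squared-error cost, the Boltzmann-style weights indexed by `beta`, and the helper names `sparse_supports`, `gibbs_weights` and `effective_size`.

```python
import itertools
import math

import numpy as np


def sparse_supports(d, k):
    """Enumerate every k-element support of a d-dimensional mean vector."""
    return list(itertools.combinations(range(d), k))


def gibbs_weights(x, supports, amplitude, beta):
    """Normalized weights exp(-beta * cost), cost = squared error of each hypothesis."""
    costs = np.empty(len(supports))
    for j, support in enumerate(supports):
        mu = np.zeros_like(x)
        mu[list(support)] = amplitude  # hypothetical fixed signal level
        costs[j] = np.sum((x - mu) ** 2)
    weights = np.exp(-beta * (costs - costs.min()))  # shift avoids underflow
    return weights / weights.sum()


def effective_size(weights):
    """exp(Shannon entropy): roughly how many hypotheses carry the weight."""
    nonzero = weights[weights > 0]
    return float(np.exp(-np.sum(nonzero * np.log(nonzero))))


# Toy instance: all values below are illustrative assumptions.
d, k, amplitude, sigma = 12, 3, 2.0, 1.0
rng = np.random.default_rng(0)
mu_true = np.zeros(d)
mu_true[:k] = amplitude
x = mu_true + sigma * rng.normal(size=d)  # additive Gaussian noise model

supports = sparse_supports(d, k)
print("hypotheses:", len(supports), "=", math.comb(d, k))
for beta in (0.0, 0.5, 5.0):
    w = gibbs_weights(x, supports, amplitude, beta)
    print(f"beta={beta}: effective set size ~ {effective_size(w):.1f}")
```

At `beta = 0` every hypothesis receives the same weight, so the approximation set is the whole space; as `beta` grows the weight concentrates on the hypotheses closest to the data. The count `math.comb(d, k)` grows quickly with `d`, which is why exhaustive enumeration like this is only feasible for toy dimensions and a sampling scheme becomes necessary in practice.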

Subjects:	Applications (stat.AP); Information Theory (cs.IT)
Cite as:	arXiv:1704.00575 [stat.AP]
 	(or arXiv:1704.00575v1 [stat.AP] for this version)
	https://doi.org/10.48550/arXiv.1704.00575

Submission history

From: Emiliano Diaz
[v1] Mon, 3 Apr 2017 13:35:17 UTC (2,733 KB)
