Least Absolute Gradient Selector: Statistical Regression via Pseudo-Hard Thresholding

Yang, Kun; Hastie, Trevor


arXiv:1204.2353v1 (stat)

[Submitted on 11 Apr 2012 (this version), latest version 19 Oct 2012 (v4)]



Abstract: Variable selection in linear models plays a pivotal role in modern statistics. Hard-thresholding methods such as $l_0$ regularization are theoretically ideal but computationally infeasible. In this paper, we propose a new approach to this challenging yet interesting problem, called \textbf{LAGS}, short for "least absolute gradient selector", which mimics the discrete selection process of $l_0$ regularization. To estimate $\beta$ under the influence of noise, we nevertheless consider the following convex program \[\hat{\beta} = \textrm{arg min}\frac{1}{n}\|X^{T}(y - X\beta)\|_1 + \lambda_n\sum_{i = 1}^pw_i(y;X;n)|\beta_i|\] where $\lambda_n > 0$ controls the sparsity, the $w_i > 0$, which depend on $y$, $X$ and $n$, are the weights on the different $\beta_i$, and $n$ is the sample size. Surprisingly, we show in the paper, both geometrically and analytically, that LAGS enjoys two attractive properties: (1) with strategically chosen $w_i$, LAGS exhibits the same discrete selection behavior and hard-thresholding property as $l_0$ regularization; we call this property \emph{"pseudo-hard thresholding"}; (2) asymptotically, LAGS is consistent and capable of discovering the true model; nonasymptotically, LAGS is capable of identifying the sparsity in the model, and the prediction error of the coefficients is bounded at the noise level up to a logarithmic factor, $\log p$, where $p$ is the number of predictors. Computationally, LAGS can be solved efficiently by convex programming routines, owing to its convexity, or by the simplex algorithm after recasting it as a linear program. Numerical simulations show that LAGS is superior to soft-thresholding methods in terms of mean squared error and parsimony of the model.
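
Illustrative note (not from the paper): a minimal sketch of the discrete selection behavior described in property (1), restricted to the special case of a single predictor. In that case the objective $\frac{1}{n}|x^{T}(y - x\beta)| + \lambda w|\beta|$ is convex and piecewise linear in $\beta$, with kinks only at $\beta = 0$ and at the least-squares value $x^{T}y / x^{T}x$, so a minimizer can be found by comparing those two points. The function names `lags_objective_1d` and `lags_solve_1d`, the toy data, and the constant weight `w = 1.0` are assumptions made for illustration; the abstract does not specify how the authors choose $w_i$.

```python
def lags_objective_1d(beta, x, y, lam, w):
    """Single-predictor objective: (1/n)|x^T (y - x*beta)| + lam * w * |beta|."""
    n = len(y)
    grad = sum(xi * (yi - xi * beta) for xi, yi in zip(x, y))
    return abs(grad) / n + lam * w * abs(beta)


def lags_solve_1d(x, y, lam, w):
    """Exact minimizer of the single-predictor objective.

    The objective is convex and piecewise linear with kinks only at 0 and at
    the least-squares coefficient, so one of those two points is a minimizer.
    Ties are resolved in favor of 0 (the sparser solution).
    """
    xtx = sum(xi * xi for xi in x)
    xty = sum(xi * yi for xi, yi in zip(x, y))
    beta_ols = xty / xtx
    candidates = (0.0, beta_ols)
    return min(candidates, key=lambda b: lags_objective_1d(b, x, y, lam, w))


# Toy data (hypothetical): y = 2 * x exactly, so the least-squares value is 2.0.
x = [1.0, -1.0, 1.0, -1.0]
y = [2.0, -2.0, 2.0, -2.0]

small_penalty = lags_solve_1d(x, y, lam=0.5, w=1.0)  # keeps the unshrunk value 2.0
large_penalty = lags_solve_1d(x, y, lam=2.0, w=1.0)  # sets the coefficient to 0.0
print(small_penalty, large_penalty)
```

In this sketch the estimate is either exactly zero or the unshrunk least-squares value, never something in between, which contrasts with the continuous shrinkage produced by soft thresholding. The general multi-predictor problem does not reduce to this two-point comparison and would need a convex or linear programming solver, as the abstract states.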

Comments:	variable selection, pseudo-hard thresholding
Subjects:	Machine Learning (stat.ML); Applications (stat.AP); Methodology (stat.ME)
Cite as:	arXiv:1204.2353 [stat.ML]
	(or arXiv:1204.2353v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1204.2353

Submission history

From: Kun Yang
[v1] Wed, 11 Apr 2012 06:57:39 UTC (104 KB)
[v2] Thu, 12 Apr 2012 05:28:28 UTC (104 KB)
[v3] Sat, 14 Apr 2012 23:52:09 UTC (104 KB)
[v4] Fri, 19 Oct 2012 03:56:01 UTC (104 KB)
