Small Covers for Near-Zero Sets of Polynomials and Learning Latent Variable Models

Diakonikolas, Ilias; Kane, Daniel M.

arXiv:2012.07774 (cs)

[Submitted on 14 Dec 2020]

Title: Small Covers for Near-Zero Sets of Polynomials and Learning Latent Variable Models

Authors: Ilias Diakonikolas, Daniel M. Kane

Abstract: Let $V$ be any vector space of multivariate degree-$d$ homogeneous polynomials with co-dimension at most $k$, and let $S$ be the set of points where all polynomials in $V$ {\em nearly} vanish. We establish a qualitatively optimal upper bound on the size of $\epsilon$-covers for $S$, in the $\ell_2$-norm. Roughly speaking, we show that there exists an $\epsilon$-cover for $S$ of cardinality $M = (k/\epsilon)^{O_d(k^{1/d})}$. Our result is constructive, yielding an algorithm to compute such an $\epsilon$-cover that runs in time $\mathrm{poly}(M)$. Building on our structural result, we obtain significantly improved learning algorithms for several fundamental high-dimensional probabilistic models with hidden variables. These include density and parameter estimation for $k$-mixtures of spherical Gaussians (with known common covariance), PAC learning one-hidden-layer ReLU networks with $k$ hidden units (under the Gaussian distribution), density and parameter estimation for $k$-mixtures of linear regressions (with Gaussian covariates), and parameter estimation for $k$-mixtures of hyperplanes. Our algorithms run in time {\em quasi-polynomial} in the parameter $k$. Previous algorithms for these problems had running times exponential in $k^{\Omega(1)}$. At a high level, our algorithms for all these learning problems work as follows: by computing the low-degree moments of the hidden parameters, we are able to find a vector space of polynomials that nearly vanish on the unknown parameters. Our structural result allows us to compute a quasi-polynomial-size cover for the set of hidden parameters, which we exploit in our learning algorithms.
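The high-level recipe in the abstract (find polynomials that nearly vanish on the hidden parameters, then cover their near-zero set) can be illustrated with a toy sketch. This is not the paper's algorithm: it uses a brute-force scan of the unit circle in $\mathbb{R}^2$ with $d = 2$ and $k = 2$, and every name in it (`monomials_deg2`, `near_zero_cover`, `delta`, `samples`) is a hypothetical choice made for illustration. The relative threshold `delta * norm` is likewise an assumed normalization, since the abstract does not spell out the exact meaning of "nearly vanish".

```python
import math

def monomials_deg2(x, y):
    # Degree-2 monomial vector (x^2, xy, y^2) of a point in R^2.
    return (x * x, x * y, y * y)

def cross(a, b):
    # Cross product in R^3: a vector orthogonal to both a and b.
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def evaluate(c, x, y):
    # Value at (x, y) of the homogeneous quadratic with coefficients c.
    return sum(ci * mi for ci, mi in zip(c, monomials_deg2(x, y)))

def near_zero_cover(c, delta, eps, samples=3600):
    # Brute force (assumption, not the paper's construction): scan the unit
    # circle, keep points where |p| <= delta * ||c||, and greedily thin them
    # so that every kept point is within eps of some cover point.
    norm = math.sqrt(sum(ci * ci for ci in c))
    cover = []
    for i in range(samples):
        t = 2 * math.pi * i / samples
        p = (math.cos(t), math.sin(t))
        if abs(evaluate(c, *p)) > delta * norm:
            continue
        if all(math.dist(p, q) > eps for q in cover):
            cover.append(p)
    return cover

# Two hypothetical hidden unit vectors (k = 2).
hidden = [(1.0, 0.0), (math.cos(1.0), math.sin(1.0))]

# The quadratics vanishing on both points form a space of co-dimension at
# most 2 inside the 3-dimensional space of quadratics; here it is spanned
# by the cross product of the two monomial vectors.
c = cross(monomials_deg2(*hidden[0]), monomials_deg2(*hidden[1]))

cover = near_zero_cover(c, delta=0.01, eps=0.1)
for h in hidden:
    print(h, min(math.dist(h, q) for q in cover))
```

In this toy setting the cover stays small because the near-zero set consists of a few short arcs around the hidden directions and their negations. The paper's contribution is a bound and an algorithm for the general high-dimensional case, which a scan of this kind does not address.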

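Comments:	Full version of FOCS'20 paper
Subjects:	Machine Learning (cs.LG); Computational Complexity (cs.CC); Data Structures and Algorithms (cs.DS); Algebraic Geometry (math.AG); Statistics Theory (math.ST)
Cite as:	arXiv:2012.07774 [cs.LG]
	(or arXiv:2012.07774v1 [cs.LG] for this version)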
	https://doi.org/10.48550/arXiv.2012.07774

Submission history

From: Ilias Diakonikolas
[v1] Mon, 14 Dec 2020 18:14:08 UTC (70 KB)
