Interpreting Latent Variables in Factor Models via Convex Optimization

Taeb, Armeen; Chandrasekaran, Venkat

doi:10.1007/s10107-017-1187-7


arXiv:1601.00389 (stat)

[Submitted on 4 Jan 2016 (v1), last revised 3 Nov 2016 (this version, v2)]

Title: Interpreting Latent Variables in Factor Models via Convex Optimization

Authors: Armeen Taeb, Venkat Chandrasekaran

Abstract: Latent or unobserved phenomena pose a significant difficulty in data analysis as they induce complicated and confounding dependencies among a collection of observed variables. Factor analysis is a prominent multivariate statistical modeling approach that addresses this challenge by identifying the effects of (a small number of) latent variables on a set of observed variables. However, the latent variables in a factor model are purely mathematical objects that are derived from the observed phenomena, and they do not have any interpretation associated with them. A natural approach for attributing semantic information to the latent variables in a factor model is to obtain measurements of some additional plausibly useful covariates that may be related to the original set of observed variables, and to associate these auxiliary covariates with the latent variables. In this paper, we describe a systematic approach for identifying such associations. Our method is based on solving computationally tractable convex optimization problems, and it can be viewed as a generalization of the minimum-trace factor analysis procedure for fitting factor models via convex optimization. We analyze the theoretical consistency of our approach in a high-dimensional setting as well as its utility in practice via experimental demonstrations with real data.
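
For orientation, the classical minimum-trace factor analysis procedure that the abstract refers to is commonly written as the semidefinite program below. This is a sketch of the standard textbook formulation only, not the generalized program developed in the paper. The symbols are assumptions introduced here for illustration: \Sigma is the covariance matrix of the observed variables, L is the low-rank positive semidefinite component attributed to the latent variables, and D is the diagonal component.

```latex
\begin{aligned}
\min_{L,\,D}\quad & \operatorname{tr}(L) \\
\text{subject to}\quad & \Sigma = L + D, \\
& L \succeq 0, \\
& D \ \text{diagonal}.
\end{aligned}
```

On positive semidefinite matrices the trace equals the nuclear norm, so minimizing \operatorname{tr}(L) acts as a convex surrogate for minimizing the rank of L, that is, the number of latent variables.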

Subjects:	Methodology (stat.ME); Optimization and Control (math.OC); Statistics Theory (math.ST)
Cite as:	arXiv:1601.00389 [stat.ME]
	(or arXiv:1601.00389v2 [stat.ME] for this version)
	https://doi.org/10.48550/arXiv.1601.00389
Journal reference:	Mathematical Programming 2018, Vol. 167, 129--154
Related DOI:	https://doi.org/10.1007/s10107-017-1187-7

Submission history

From: Armeen Taeb
[v1] Mon, 4 Jan 2016 06:29:16 UTC (51 KB)
[v2] Thu, 3 Nov 2016 01:18:57 UTC (74 KB)
