Tensor-Tensor Products for Optimal Representation and Compression

Kilmer, Misha; Horesh, Lior; Avron, Haim; Newman, Elizabeth

Mathematics > Numerical Analysis

arXiv:2001.00046v1 (math)

[Submitted on 31 Dec 2019]

Title: Tensor-Tensor Products for Optimal Representation and Compression

Authors: Misha Kilmer, Lior Horesh, Haim Avron, Elizabeth Newman

Abstract: In this era of big data, data analytics and machine learning, it is imperative to find ways to compress large data sets such that intrinsic features necessary for subsequent analysis are not lost. The traditional workhorse for data dimensionality reduction and feature extraction has been the matrix SVD, which presupposes that the data has been arranged in matrix format. Our main goal in this study is to show that high-dimensional data sets are more compressible when treated as tensors (aka multiway arrays) and compressed via tensor-SVDs under the tensor-tensor product structures in (Kilmer and Martin, 2011; Kernfeld et al., 2015). We begin by proving Eckart-Young optimality results for families of tensor-SVDs under two different truncation strategies. As such optimality properties can be proven in both matrix and tensor-based algebras, a fundamental question arises: does the tensor construct subsume the matrix construct in terms of representation efficiency? The answer is yes, as shown when we prove that a tensor-tensor representation of an equal dimensional spanning space can be superior to its matrix counterpart. We then investigate how the compressed representation provided by the truncated tensor-SVD is related both theoretically and in compression performance to its closest tensor-based analogue, truncated HOSVD (De Lathauwer et al., 2000; De Lathauwer and Vandewalle, 2004), thereby showing the potential advantages of our tensor-based algorithms. Finally, we propose new tensor truncated SVD variants, namely multi-way tensor SVDs, which provide further approximated representation efficiency, and discuss under which conditions they are considered optimal. We conclude with a numerical study demonstrating the utility of the theory.
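To make the idea of a truncated tensor-SVD concrete, the sketch below shows one common formulation for third-order arrays: the FFT-based t-product of Kilmer and Martin (2011), where the same truncation index is applied to every frontal slice in the Fourier domain. It is an illustration under stated assumptions, not the authors' code. The function name `truncated_tsvd` and the parameter `r` are hypothetical, the input is assumed real, and the paper's two truncation strategies and its more general tensor-tensor products are not reproduced here.

```python
import numpy as np

def truncated_tsvd(A, r):
    """Illustrative rank-r truncated t-SVD of a real n1 x n2 x n3 array.

    Assumes the FFT-based t-product; names and interface are hypothetical.
    """
    # In the Fourier domain along the third mode, the t-product reduces to
    # independent matrix products of the frontal slices.
    A_hat = np.fft.fft(A, axis=2)
    # np.linalg.svd acts on stacks of matrices held in the last two axes,
    # so move the slice index to the front: shape (n3, n1, n2).
    slices = np.moveaxis(A_hat, 2, 0)
    U, s, Vh = np.linalg.svd(slices, full_matrices=False)
    # Keep the r leading singular triplets of every frontal slice.
    approx = (U[:, :, :r] * s[:, None, :r]) @ Vh[:, :r, :]
    # Transform back. For real A with distinct singular values the imaginary
    # part is rounding error only, so it is discarded.
    A_r = np.fft.ifft(np.moveaxis(approx, 0, 2), axis=2)
    return A_r.real

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 5, 4))
errors = [np.linalg.norm(A - truncated_tsvd(A, r)) for r in range(1, 6)]
```

In this setting the Frobenius-norm error is determined by the discarded Fourier-domain singular values, scaled by the third dimension, which mirrors the matrix Eckart-Young statement slice by slice.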

Comments:	27 pages, 8 figures, 3 tables
Subjects:	Numerical Analysis (math.NA)
MSC classes:	15A69, 65F99, 94A08
Cite as:	arXiv:2001.00046 [math.NA]
	(or arXiv:2001.00046v1 [math.NA] for this version)
	https://doi.org/10.48550/arXiv.2001.00046

Submission history

From: Elizabeth Newman
[v1] Tue, 31 Dec 2019 19:35:02 UTC (824 KB)
