Learning sparse transformations through backpropagation

Bloem, Peter

arXiv:1810.09184 (stat)

[Submitted on 22 Oct 2018]

Title: Learning sparse transformations through backpropagation

Authors: Peter Bloem

Abstract: Many transformations in deep learning architectures are sparsely connected. When such transformations cannot be designed by hand, they can be learned, even through plain backpropagation, for instance in attention mechanisms. However, during learning, such sparse structures are often represented in a dense form, as we do not know beforehand which elements will eventually become non-zero. We introduce the adaptive, sparse hyperlayer, a method for learning a sparse transformation, parametrized sparsely: as index-tuples with associated values. To overcome the lack of gradients from such a discrete structure, we introduce a method of randomly sampling connections, and backpropagating over the randomly wired computation graph. To show that this approach allows us to train a model to competitive performance on real data, we use it to build two architectures. First, an attention mechanism for visual classification. Second, we implement a method for differentiable sorting: specifically, learning to sort unlabeled MNIST digits, given only the correct order.
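
As a rough illustration of what "index-tuples with associated values" can look like in practice, the sketch below applies a linear map stored only as (row, column) tuples plus one value per tuple. It is not the paper's hyperlayer: the function name `sparse_apply`, its parameters, and the example numbers are assumptions made for illustration, and the random sampling of connections and the gradient computation described in the abstract are not shown.

```python
from typing import List, Tuple


def sparse_apply(indices: List[Tuple[int, int]],
                 values: List[float],
                 x: List[float],
                 out_dim: int) -> List[float]:
    """Compute y = A x, where A is given as (row, col) index tuples plus values.

    Hypothetical helper for illustration only; entries of A that are not
    listed in `indices` are treated as zero and never stored.
    """
    if len(indices) != len(values):
        raise ValueError("indices and values must have the same length")
    y = [0.0] * out_dim
    for (row, col), value in zip(indices, values):
        # Repeated index tuples accumulate, as in a coordinate-format matrix.
        y[row] += value * x[col]
    return y


# A 2x3 transformation with three non-zero entries (made-up numbers).
indices = [(0, 2), (1, 0), (1, 2)]
values = [0.5, 2.0, -1.0]
x = [1.0, 10.0, 4.0]
y = sparse_apply(indices, values, x, out_dim=2)
```

The cost of this loop grows with the number of stored tuples rather than with the full size of the matrix, which is the practical appeal of a sparse parametrization over a dense one.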

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:1810.09184 [stat.ML]
	(or arXiv:1810.09184v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1810.09184

Submission history

From: Peter Bloem
[v1] Mon, 22 Oct 2018 11:34:32 UTC (5,319 KB)
