Computer Science > Machine Learning

arXiv:2106.00047 (cs)
[Submitted on 31 May 2021]

Title: Learning and Generalization in RNNs


Authors: Abhishek Panigrahi, Navin Goyal
Abstract: Simple recurrent neural networks (RNNs) and their more advanced cousins, such as LSTMs, have been very successful in sequence modeling. Their theoretical understanding, however, is lacking and has not kept pace with the progress made for feedforward networks, where a reasonably complete understanding has emerged for the special case of highly overparametrized one-hidden-layer networks. In this paper, we make progress towards remedying this situation by proving that RNNs can learn functions of sequences. In contrast to previous work, which could only handle functions of sequences that are sums of functions of the individual tokens in the sequence, we allow general functions. Conceptually and technically, we introduce new ideas that enable us to extract information from the hidden state of the RNN in our proofs, addressing a crucial weakness in previous work. We illustrate our results on some regular language recognition problems.
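To make the setting concrete (this is an illustration of the task class, not the paper's construction or proof technique), the following minimal PyTorch sketch trains a simple RNN on a regular language recognition problem: PARITY, i.e., classifying bit strings by whether they contain an even number of 1s. The architecture, hyperparameters, and the choice of PARITY are illustrative assumptions; reading the label off the final hidden state loosely mirrors the role the hidden state plays in the paper's analysis.

    # Minimal sketch: a vanilla RNN trained to recognize the regular language
    # PARITY (bit strings with an even number of 1s). All hyperparameters are
    # illustrative; PARITY is a known-hard case for RNN training, so convergence
    # depends on sequence length and tuning.
    import torch
    import torch.nn as nn

    def make_batch(batch_size=64, seq_len=20):
        # Random bit strings; label 1.0 if the number of 1s is even, else 0.0.
        x = torch.randint(0, 2, (batch_size, seq_len, 1), dtype=torch.float32)
        y = (x.sum(dim=(1, 2)) % 2 == 0).float()
        return x, y

    class ParityRNN(nn.Module):
        def __init__(self, hidden_size=64):
            super().__init__()
            self.rnn = nn.RNN(input_size=1, hidden_size=hidden_size,
                              batch_first=True)
            self.readout = nn.Linear(hidden_size, 1)

        def forward(self, x):
            _, h = self.rnn(x)                       # final hidden state: (1, B, H)
            return self.readout(h[-1]).squeeze(-1)   # one logit per sequence

    model = ParityRNN()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.BCEWithLogitsLoss()

    for step in range(2000):
        x, y = make_batch()
        loss = loss_fn(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

    with torch.no_grad():
        x, y = make_batch(1000)
        acc = ((model(x) > 0) == y.bool()).float().mean()
        print(f"test accuracy: {acc:.3f}")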
Subjects: Machine Learning (cs.LG)
Cite as: arXiv:2106.00047 [cs.LG]
  (or arXiv:2106.00047v1 [cs.LG] for this version)
  https://doi.org/10.48550/arXiv.2106.00047
arXiv-issued DOI via DataCite

Submission history

From: Abhishek Panigrahi
[v1] Mon, 31 May 2021 18:27:51 UTC (2,645 KB)
