Published on Fri Oct 11 2019

Deep Independently Recurrent Neural Network (IndRNN)

Shuai Li, Wanqing Li, Chris Cook, Yanbo Gao

Recurrent neural networks (RNNs) are known to be difficult to train due to the gradient vanishing and exploding problems. This paper proposes a new type of RNN with the recurrent connection formulated as a Hadamard product, referred to as the independently recurrent neural network (IndRNN).

Abstract

Recurrent neural networks (RNNs) are known to be difficult to train due to the gradient vanishing and exploding problems, which make it hard to learn long-term patterns and to construct deep networks. To address these problems, this paper proposes a new type of RNN with the recurrent connection formulated as a Hadamard product, referred to as the independently recurrent neural network (IndRNN), in which neurons in the same layer are independent of each other and connected across layers. Owing to its better-behaved gradient backpropagation, an IndRNN with regulated recurrent weights effectively addresses the gradient vanishing and exploding problems, allowing long-term dependencies to be learned. Moreover, an IndRNN can work with non-saturated activation functions such as ReLU (rectified linear unit) and still be trained robustly. Several deeper IndRNN architectures, including the basic stacked IndRNN, the residual IndRNN and the densely connected IndRNN, have been investigated, all of which can be much deeper than existing RNNs. Furthermore, IndRNN reduces the computation at each time step and can be over 10 times faster than the commonly used long short-term memory (LSTM). Experimental results show that the proposed IndRNN can process very long sequences and construct very deep networks, and that it achieves better performance on various tasks than the traditional RNN, LSTM and the popular Transformer.
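The recurrence described in the abstract can be written as h_t = sigma(W x_t + u ⊙ h_{t-1} + b), where ⊙ is the element-wise (Hadamard) product and u is a vector of per-neuron recurrent weights. The NumPy sketch below illustrates this formulation and a two-layer stacked variant; the function names, the random initialisation and the |u| ≤ 1 clipping bound are illustrative assumptions, not the authors' reference implementation (the paper regulates the range of the recurrent weights according to the sequence length).

```python
import numpy as np

def indrnn_step(x_t, h_prev, W, u, b):
    """One IndRNN step: h_t = relu(W x_t + u * h_{t-1} + b).

    The recurrent term u * h_prev is an element-wise (Hadamard)
    product, so each neuron recurs only on its own previous state
    and is independent of the other neurons in the same layer.
    """
    return np.maximum(0.0, W @ x_t + u * h_prev + b)

def indrnn_layer(xs, W, u, b):
    """Run one IndRNN layer over a sequence xs of shape (T, input_dim)."""
    h = np.zeros(W.shape[0])
    outputs = []
    for x_t in xs:
        h = indrnn_step(x_t, h, W, u, b)
        outputs.append(h)
    return np.stack(outputs)

# Toy usage: two stacked IndRNN layers over a random sequence.
rng = np.random.default_rng(0)
T, d_in, d_hid = 50, 8, 16
xs = rng.standard_normal((T, d_in))

W1 = rng.standard_normal((d_hid, d_in)) * 0.1
W2 = rng.standard_normal((d_hid, d_hid)) * 0.1
b1, b2 = np.zeros(d_hid), np.zeros(d_hid)

# Illustrative regulation of the recurrent weights: clipping to |u| <= 1
# keeps the per-neuron recurrence from exploding over long sequences.
u1 = np.clip(rng.standard_normal(d_hid), -1.0, 1.0)
u2 = np.clip(rng.standard_normal(d_hid), -1.0, 1.0)

h1 = indrnn_layer(xs, W1, u1, b1)  # layer 1: neurons independent within the layer
h2 = indrnn_layer(h1, W2, u2, b2)  # layer 2: cross-neuron mixing happens through W2
print(h2.shape)  # (50, 16)
```

Note that, unlike a vanilla RNN whose recurrent matrix mixes all neurons at every step, cross-neuron interaction here happens only through the input weights of the next layer, which is what allows the independent-neuron gradients to be analysed and regulated per neuron.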

Tue Mar 13 2018
Machine Learning
Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN
Recurrent neural networks (RNNs) have been widely used for processing sequential data. RNNs are commonly difficult to train due to the well-known gradient vanishing and exploding problems, which make it hard to learn long-term patterns. To address these problems, a new type of RNN, referred to as the independently recurrent neural network (IndRNN), is proposed.
Mon Apr 11 2016
Neural Networks
Deep Gate Recurrent Neural Network
The Simple Gated Unit (SGU) and Deep Simple Gated Unit (DSGU) are general structures for learning long-term dependencies. Both structures require fewer parameters and less computation time in sequence classification tasks. Unlike GRU and LSTM, which require more than one gate to control information flow in the network, SGU and DSGU use a single gate.
Thu Oct 05 2017
Artificial Intelligence
Dilated Recurrent Neural Networks
Learning with recurrent neural networks (RNNs) on long sequences is a difficult task. This paper introduces a simple yet effective RNN structure, the DilatedRNN. The proposed architecture is characterized by multi-resolution recurrent skip connections.
Thu Jan 18 2018
Neural Networks
Overcoming the vanishing gradient problem in plain recurrent networks
Plain recurrent networks greatly suffer from the vanishing gradient problem. Gated Neural Networks (GNNs) deliver promising results in many sequence learning tasks. This paper shows how we can address this problem in a plain recurrent network by analyzing the gating mechanisms in GNNs.
Thu May 28 2020
Neural Networks
Learning Various Length Dependence by Dual Recurrent Neural Networks
Recurrent neural networks (RNNs) are widely used as a memory model for sequence-related problems. Many variants of the RNN have been proposed to ease training and to process long sequences. Although several classical models have been proposed, capturing long-term dependence while responding to short-term changes remains a challenge.
Fri Mar 22 2013
Neural Networks
Speech Recognition with Deep Recurrent Neural Networks
Recurrent neural networks (RNNs) are a powerful model for sequential data. When trained end-to-end with suitable regularisation, deep Long Short-term Memory RNNs achieve a test set error of 17.7% on the TIMIT phoneme recognition benchmark.