ARTICLE: Recurrent Neural Networks - how do they work?
- Marco Singh
- Nov 16, 2016
- 1 min read
Recurrent Neural Networks (RNNs) in various forms have been very successful in recent times. The model captures long-term dependencies, which is desirable in many applications. Microsoft researchers recently used a combination of Convolutional Neural Networks and LSTMs (a special variant of the RNN) to reach human parity in conversational speech recognition, and many more applications are emerging. This article explores the standard RNN: I dig into every mathematical detail of the network structure, in particular the Backpropagation Through Time (BPTT) algorithm. I haven't seen any paper go through every computation of the derivatives, so that is what I set out to do. The article builds the foundation for a further understanding of the LSTM model. Reading it requires prior knowledge of feedforward Neural Networks and backpropagation in that setting (see my previous article if you haven't familiarized yourself with it), as well as basic calculus.
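To give a flavor of what the article covers, here is a minimal sketch of the standard RNN forward pass it analyzes: the hidden state update h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h), unrolled over a short input sequence. All weights and dimensions below are arbitrary toy values chosen for illustration, not anything from the article itself.

```python
import math

def rnn_step(x, h_prev, Wxh, Whh, bh):
    """One step of a vanilla RNN: h_t = tanh(Wxh @ x_t + Whh @ h_{t-1} + bh).

    x, h_prev, bh are lists; Wxh, Whh are lists of rows (toy pure-Python
    linear algebra, just to make the recurrence explicit).
    """
    n = len(h_prev)
    h = []
    for i in range(n):
        s = bh[i]
        s += sum(Wxh[i][j] * x[j] for j in range(len(x)))      # input term
        s += sum(Whh[i][j] * h_prev[j] for j in range(n))      # recurrent term
        h.append(math.tanh(s))
    return h

# Tiny example: 2-dim input, 2-dim hidden state, arbitrary weights.
Wxh = [[0.5, -0.2], [0.1, 0.3]]
Whh = [[0.0, 0.4], [-0.3, 0.2]]
bh = [0.0, 0.1]

h = [0.0, 0.0]                      # initial hidden state h_0
for x in [[1.0, 0.0], [0.0, 1.0]]:  # a sequence of two input vectors
    h = rnn_step(x, h, Wxh, Whh, bh)
print(h)
```

Because the same weights Wxh and Whh are reused at every time step, the gradient of a loss at time t flows back through every earlier step, which is exactly what the BPTT derivation in the article works out term by term.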