"Were RNNs All We Needed?" revisits RNNs and shows that by removing the hidden-state dependence from the input, forget, and update gates, RNNs can be efficiently trained in parallel. This is possible because with this change architectures like LSTMs and GRUs no longer require backpropagate… https://t.co/02uILwm6wO
"Were RNNs All We Needed?" https://t.co/vHsQyQbCDE Mr. @SchmidhuberAI has been championing this for decades, and I’ve been in this camp since I trained my first RNN in 2015. Nice to see that RNN's are coming back with a vengeance.
As an RNN researcher, you couldn’t help but feel left behind 10 years ago as more parallelizable architectures dominated. Then transformers arrived and were extraordinary - ending any debate - but what they captured in rich recurrent dynamics, they gave up in the online… https://t.co/CVDBfBPaqX
Recent discussions in the AI research community have revisited the potential of Recurrent Neural Networks (RNNs), particularly their efficiency and trainability. A paper titled 'Were RNNs All We Needed?' by L. Feng, F. Tung, M. O. Ahmed, and Y. Bengio from Mila and Borealis AI examines what happens when the hidden-state dependencies are removed from the input, forget, and update gates of RNNs such as GRUs and LSTMs. With this change, the gated update becomes a simple element-wise linear recurrence, so training no longer requires backpropagation through time and can be parallelized across the sequence with a parallel scan, while inference remains recurrent and memory-efficient. This development challenges the dominance of transformer architectures, which have been the preferred choice largely because of their parallelizable training, and it has sparked renewed interest in RNNs, with some researchers advocating for their resurgence in AI applications.
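To make the core idea concrete, here is a minimal sketch, assuming a minGRU-style cell whose gate and candidate depend only on the current input. It is not the authors' code; the function and weight names (mingru_parallel, w_z, w_h) are illustrative, and the point is only that the resulting recurrence can be evaluated with an associative (parallel) scan rather than step by step.

```python
# Minimal sketch (assumed names, not the authors' implementation):
# when the update gate z_t and candidate h_tilde_t depend only on x_t,
# the GRU-style update
#     h_t = (1 - z_t) * h_{t-1} + z_t * h_tilde_t
# is a linear recurrence h_t = a_t * h_{t-1} + b_t, which an associative
# (parallel) scan can evaluate over the whole sequence at once.
import jax
import jax.numpy as jnp


def mingru_parallel(x, w_z, w_h, h0):
    """x: (T, d_in); w_z, w_h: (d_in, d_hidden); h0: (d_hidden,)."""
    z = jax.nn.sigmoid(x @ w_z)        # update gate, no h_{t-1} inside
    h_tilde = x @ w_h                  # candidate state, no h_{t-1} inside
    a = 1.0 - z                        # coefficient on the previous state
    b = z * h_tilde                    # purely input-driven term

    # Fold the initial state in as the first scan element: h -> 1*h + h0.
    a = jnp.concatenate([jnp.ones((1, a.shape[1])), a], axis=0)
    b = jnp.concatenate([h0[None, :], b], axis=0)

    def compose(left, right):
        # Composing two affine maps h -> a*h + b is associative,
        # which is what allows the scan to run in parallel.
        a_l, b_l = left
        a_r, b_r = right
        return a_r * a_l, a_r * b_l + b_r

    _, h = jax.lax.associative_scan(compose, (a, b))
    return h[1:]                       # hidden states h_1 ... h_T


# Toy usage: a length-512 sequence with 16 input and 32 hidden units.
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (512, 16))
w_z = jax.random.normal(key, (16, 32)) * 0.1
w_h = jax.random.normal(key, (16, 32)) * 0.1
h = mingru_parallel(x, w_z, w_h, jnp.zeros(32))
print(h.shape)  # (512, 32)
```

At inference time the same cell can still be rolled out one step at a time, which is where the constant-memory recurrent advantage highlighted in the discussion comes from.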