In this setup, one RNN reads in a sentence and then a different one outputs a sentence. There's a modification to this called the attention model. The attention idea has been one of the most influential ideas in deep learning.

Let's assume you have an input sentence and you use a bidirectional RNN, bidirectional GRU, or bidirectional LSTM to compute features for every word. In practice, GRUs and LSTMs are often used for this, with LSTMs perhaps being more common.

attention model

We have features computed from both the forward recurrence and the backward recurrence of the bidirectional RNN, and I'm just going to use a<t'> to represent both of these concatenated together. So a<t'> is going to be the feature vector for input time step t'. To be consistent with the notation, where t indexes the output, I'm going to index the input with t'.

t --- index of output, t' --- index of input
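As a concrete illustration, here is a minimal numpy sketch of these encoder features: a simple tanh RNN run forward and backward over the sentence, with a<t'> taken as the concatenation of the two activations. The dimensions, weight names, and the plain-RNN cell (instead of a GRU or LSTM) are all illustrative assumptions, not the course's exact implementation.

```python
import numpy as np

# Illustrative sizes: input length, word-vector size, hidden size (assumed values).
Tx, n_x, n_a = 10, 50, 32
x = np.random.randn(Tx, n_x)                  # embedded input sentence, one vector per word

Wf = np.random.randn(n_a, n_a + n_x) * 0.01   # forward-direction RNN weights
Wb = np.random.randn(n_a, n_a + n_x) * 0.01   # backward-direction RNN weights
bf = np.zeros(n_a)
bb = np.zeros(n_a)

a_fwd = np.zeros((Tx, n_a))
a_bwd = np.zeros((Tx, n_a))

h = np.zeros(n_a)
for t in range(Tx):                           # forward pass: left to right
    h = np.tanh(Wf @ np.concatenate([h, x[t]]) + bf)
    a_fwd[t] = h

h = np.zeros(n_a)
for t in reversed(range(Tx)):                 # backward pass: right to left
    h = np.tanh(Wb @ np.concatenate([h, x[t]]) + bb)
    a_bwd[t] = h

# a[t'] is the concatenation of the forward and backward activations at step t'.
a = np.concatenate([a_fwd, a_bwd], axis=1)    # shape (Tx, 2 * n_a)
```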

attention mechanism

The context will depend on the attention parameters, so alpha<1,1>, alpha<1,2> and so on tell us how much attention to pay. These alpha parameters tell us how much the context depends on the features, or the activations, we're getting from the different time steps. The way we define the context is as a weighted sum of the features from the different time steps, weighted by these attention weights: c<t> = sum over t' of alpha<t,t'> a<t'>. More formally, the attention weights satisfy two properties: they are all non-negative, and they sum to one. We use the softmax function to guarantee that.
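Continuing the numpy sketch above, and pretending for a moment that the attention weights for one output step are already given (the random values below are a stand-in, not learned weights), the context is just this weighted sum:

```python
# Sketch only: assume these attention weights were already computed for output
# step t; they are non-negative and sum to one.
alpha_t = np.random.rand(Tx)
alpha_t = alpha_t / alpha_t.sum()

# Context for output step t: weighted sum of the encoder features over t'.
c_t = (alpha_t[:, None] * a).sum(axis=0)      # shape (2 * n_a,)
```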

So alpha<t,t'> is the amount of attention that y<t> should pay to a<t'>: when you're generating the t-th output word, how much you should be paying attention to the t'-th input word. That's one step of generating the output, and then at the next time step you generate the next output, and again you will have a new set of attention weights.

This is the formula you can use to compute alpha<t,t'>: first compute the terms e<t,t'>, and then apply essentially a softmax so that the weights sum to one over t':

alpha<t,t'> = exp(e<t,t'>) / sum over t'' of exp(e<t,t''>)

So for every fixed value of t, these weights sum to one if you're summing over t'. Using this softmax normalization just ensures they properly sum to one.
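In code, that normalization is just a softmax over t' for each fixed t. Continuing the numpy sketch (the random energies here stand in for the real e<t,t'> values, which are computed below):

```python
def softmax(v):
    v = v - v.max()                    # subtract the max for numerical stability
    e = np.exp(v)
    return e / e.sum()

e_t = np.random.randn(Tx)              # placeholder energies e<t, t'> for one output step t
alpha_t = softmax(e_t)                 # attention weights for that step
assert np.isclose(alpha_t.sum(), 1.0)  # they sum to one over t'
```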

To calculate e<t, t'> we use a small neural network.

So look at the network we have. If you're trying to generate y<t>, then s<t-1> is the decoder's hidden state from the previous step (the one that feeds into s<t>), and that's one input to a very small neural network, usually with just one hidden layer because you need to compute these values a lot. And a<t'> -- the features from input time step t' -- is the other input.

It seems pretty natural that alpha<t,t'> and e<t,t'> should depend on these two quantities, but we don't know what the function is. So one thing you can do is just train a very small neural network to learn whatever this function should be, and trust backpropagation and gradient descent to learn the right function. It turns out that if you implement this whole model and train it with gradient descent, the whole thing actually works. This little neural network does a pretty decent job of telling you how much attention y<t> should pay to a<t'>, and as you generate the output one word at a time, it actually pays attention to the right parts of the input sentence, learning all of this automatically with gradient descent.
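Continuing the numpy sketch, here is one way such a one-hidden-layer scoring network could look. The decoder state size, hidden-layer size, and weight names (W1, b1, w2, b2) are illustrative assumptions, and in a real model these parameters would be learned by gradient descent rather than left random.

```python
n_s, n_hidden = 64, 10                         # decoder state size, hidden units (assumed)
W1 = np.random.randn(n_hidden, n_s + 2 * n_a) * 0.01
b1 = np.zeros(n_hidden)
w2 = np.random.randn(n_hidden) * 0.01
b2 = 0.0

def energy(s_prev, a_tprime):
    """Small one-hidden-layer network mapping (s<t-1>, a<t'>) to a scalar e<t, t'>."""
    hidden = np.tanh(W1 @ np.concatenate([s_prev, a_tprime]) + b1)
    return w2 @ hidden + b2

s_prev = np.zeros(n_s)                          # decoder hidden state from the previous step
e_t = np.array([energy(s_prev, a[tp]) for tp in range(Tx)])
alpha_t = softmax(e_t)                          # how much y<t> attends to each input word
c_t = (alpha_t[:, None] * a).sum(axis=0)        # context fed to the decoder at step t
```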

Reference:

Sequence Models | Coursera

www.coursera.org