Has anyone tested an RNN with larger recurrence?

Has anyone tested a kind of convolutional RNN, i.e. an RNN that takes K of its previous states as input?
It might handle local patterns better.
Smaller layer sizes would certainly be needed to avoid an explosion in the number of weights.
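
To make the idea concrete, here is a minimal sketch (PyTorch, with hypothetical names) of a cell whose update sees the K most recent hidden states instead of only the last one. It is an illustration of the proposal, not code from any existing toolkit or paper.

```python
import torch
import torch.nn as nn


class KStateRNNCell(nn.Module):
    """RNN cell whose update looks at the K most recent hidden states,
    so it can pick up short local patterns (hypothetical sketch)."""

    def __init__(self, input_size, hidden_size, k=3):
        super().__init__()
        self.k = k
        self.hidden_size = hidden_size
        # One linear map over [x_t ; h_{t-1} ; ... ; h_{t-K}]
        self.proj = nn.Linear(input_size + k * hidden_size, hidden_size)

    def forward(self, inputs):
        # inputs: (seq_len, batch, input_size)
        seq_len, batch, _ = inputs.shape
        # Window of the K previous states, initialised to zeros.
        history = [inputs.new_zeros(batch, self.hidden_size) for _ in range(self.k)]
        outputs = []
        for t in range(seq_len):
            combined = torch.cat([inputs[t]] + history, dim=-1)
            h = torch.tanh(self.proj(combined))
            history = [h] + history[:-1]  # slide the window of past states
            outputs.append(h)
        return torch.stack(outputs)       # (seq_len, batch, hidden_size)
```

Note the weight cost: the projection grows linearly with K, which is why smaller layer sizes would be needed.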

Yes, someone is experimenting with a CNN-based encoder similar to:

https://arxiv.org/abs/1611.02344

A work in progress.
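
For reference, a rough sketch of what a convolutional encoder in the spirit of arXiv:1611.02344 looks like: stacked 1D convolutions over the embedded source with residual connections. Sizes and names are illustrative assumptions, not the work-in-progress code.

```python
import torch
import torch.nn as nn


class ConvEncoder(nn.Module):
    def __init__(self, vocab_size, emb_size=256, hidden_size=256,
                 num_layers=6, kernel_size=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_size)
        self.input_proj = nn.Linear(emb_size, hidden_size)
        pad = kernel_size // 2  # keep the sequence length unchanged
        self.convs = nn.ModuleList(
            nn.Conv1d(hidden_size, hidden_size, kernel_size, padding=pad)
            for _ in range(num_layers)
        )

    def forward(self, tokens):
        # tokens: (batch, seq_len) word indices
        x = self.input_proj(self.embed(tokens))  # (batch, seq_len, hidden)
        x = x.transpose(1, 2)                     # Conv1d expects (batch, channels, seq_len)
        for conv in self.convs:
            x = torch.tanh(conv(x)) + x           # residual connection per layer
        return x.transpose(1, 2)                  # (batch, seq_len, hidden)
```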

Isn’t fairseq a pure CNN, without recurrence?

Correct, that’s not exactly what you are referring to, but it is a first take on convolutions.

Is the idea from a paper?

Just a personal guess… with a few previous time steps fed back into the recurrent part of an RNN, it might handle local patterns better. On the other hand, convolution-only networks would capture local patterns but might miss the time-sequence effect.

From a theoretical standpoint, a very deep CNN with character-based rather than word-based encoding would be really interesting, particularly for translation to/from morphologically rich languages. I’m tempted to believe that with the right setup, the system could learn things like compounding/de-compounding, agglutination, and morphological affixes quite nicely.
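
One common way to get such character-based representations is to convolve over the characters of each word and max-pool, so affixes and compound parts can surface as features. A minimal sketch, with illustrative sizes and a hypothetical class name:

```python
import torch
import torch.nn as nn


class CharCNNWordEncoder(nn.Module):
    """Character-level word representation via a 1D convolution and
    max-pooling over characters (illustrative sketch)."""

    def __init__(self, num_chars, char_emb=32, out_dim=128, kernel_size=3):
        super().__init__()
        self.char_embed = nn.Embedding(num_chars, char_emb, padding_idx=0)
        self.conv = nn.Conv1d(char_emb, out_dim, kernel_size, padding=1)

    def forward(self, char_ids):
        # char_ids: (batch, word_len) character indices of one word per row
        x = self.char_embed(char_ids).transpose(1, 2)  # (batch, char_emb, word_len)
        x = torch.relu(self.conv(x))                   # (batch, out_dim, word_len)
        return x.max(dim=2).values                     # max-pool over characters
```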
