Translation outputs sentences with repeated phrases

kargintima · December 13, 2019, 10:58am

Hello!
I am trying to develop an NMT system for en->ru.
Setup: RNN model, word-level tokens, 1M-line dataset.
But I see that sometimes (too often for it to be a random mistake) the system outputs a sentence with a repeated phrase:
What has the election taught us in London ? -> В лондоне нас учили в лондоне ?
(in London taught us in London)
But at the same time “What has the election taught us ?” translates pretty well.
What could it be?

guillaumekln · December 17, 2019, 2:16pm

Hi,

This usually happens on unexpected data, either because the model was not trained long enough or because there is not enough training data.

kargintima · December 17, 2019, 2:48pm

Thanks.
So, I should try:
Find more data
Increase train_steps
Maybe increase the complexity of the NN architecture? Increase the number of layers or something…
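
To check whether any of these changes actually helps, it may be useful to measure how often the outputs contain a repeated phrase instead of judging by eye. Below is a minimal sketch in plain Python, assuming whitespace-tokenized, word-level output. The function names `repeated_ngrams` and `repetition_rate` are made up for illustration and are not part of any NMT toolkit.

```python
from collections import Counter


def repeated_ngrams(sentence, n=2):
    """Return the n-grams that occur more than once in a sentence.

    Assumes tokens are separated by whitespace (word-level tokenization).
    """
    tokens = sentence.lower().split()
    counts = Counter(
        tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )
    return [" ".join(gram) for gram, count in counts.items() if count > 1]


def repetition_rate(sentences, n=2):
    """Fraction of sentences that contain at least one repeated n-gram."""
    if not sentences:
        return 0.0
    flagged = sum(1 for s in sentences if repeated_ngrams(s, n))
    return flagged / len(sentences)


if __name__ == "__main__":
    outputs = [
        "В лондоне нас учили в лондоне ?",  # the output quoted above
        "нас учили ?",                      # hypothetical output without repetition
    ]
    for line in outputs:
        print(line, "->", repeated_ngrams(line))
    print("repetition rate:", repetition_rate(outputs))
```

Running this over the same held-out translations before and after adding data or training longer gives a rough number to compare. A repeated bigram is not always a mistake (some sentences legitimately repeat words), so it is only a heuristic.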