How to deal with repetition?


(Maplewizard) #1

I found that the output always contains sentences that just repeat one or a few words over and over. Is there any parameter that can prevent this problem?


(Guillaume Klein) #2

Currently, there is no mechanism to prevent this. It is usually mitigated by using more data and bigger models.

However, we now support filters over hypotheses during decoding. It should be quite easy to add a filter to ignore sequences with too many repetitions.
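
To illustrate the idea only, here is a minimal Python sketch of such a filter. It is not OpenNMT code: the function names `max_ngram_repeats` and `filter_repetitive`, and the thresholds, are hypothetical, and hypotheses are assumed to be plain lists of tokens.

```python
from collections import Counter


def max_ngram_repeats(tokens, n):
    """Return the highest count of any single n-gram in `tokens`."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0
    return max(Counter(ngrams).values())


def filter_repetitive(hypotheses, n=2, max_repeats=2):
    """Keep hypotheses in which no n-gram occurs more than `max_repeats` times.

    Falls back to the unfiltered list if every hypothesis would be dropped,
    so the decoder always has something to return (an assumption of this sketch).
    """
    kept = [h for h in hypotheses if max_ngram_repeats(h, n) <= max_repeats]
    return kept or list(hypotheses)


# Hypothetical beam content, for illustration only.
beam = [
    "the cat sat on the the the the mat".split(),
    "the cat sat on the mat".split(),
]
print(filter_repetitive(beam, n=1, max_repeats=2))
```

With `n=1` the filter counts single words, which also penalizes legitimate repeats of function words; larger `n` is more forgiving.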

For reference:


(Maplewizard) #3

Thanks very much for your help. Do you mean that when I update to the current version, the filter will be turned on automatically? Or should I set some parameter of the program?


(Guillaume Klein) #4

The filter you care about (ignoring sentences with too many repetitions) is not yet implemented. But we could easily add it in the future. Stay tuned!


(Maplewizard) #5

Are there any issues on GitHub related to this problem? I would like to watch the issue so that I know about the fix as soon as it is finished.


(Maplewizard) #6

Is it possible to impose a penalty on repeated words? I guess a penalty is better than just filtering them out.
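
To make the idea concrete, a penalty could be sketched roughly as below. This is only a hypothetical illustration, not an existing option: `repetition_penalty`, `rerank`, and the `weight` value are made-up names, and hypotheses are assumed to be `(tokens, log_prob)` pairs.

```python
from collections import Counter


def repetition_penalty(tokens, weight=1.0):
    """Penalty proportional to the number of repeated token occurrences."""
    counts = Counter(tokens)
    # Each occurrence beyond the first counts as one repeat.
    repeats = sum(c - 1 for c in counts.values())
    return weight * repeats


def rerank(hypotheses, weight=1.0):
    """Sort (tokens, log_prob) pairs by penalized score, best first."""
    return sorted(
        hypotheses,
        key=lambda h: h[1] - repetition_penalty(h[0], weight),
        reverse=True,
    )


# Hypothetical scores, for illustration only.
candidates = [
    ("the the the mat".split(), -1.0),
    ("a cat on the mat".split(), -1.5),
]
print(rerank(candidates, weight=0.5)[0][0])
```

Unlike a hard filter, the repetitive hypothesis is only pushed down the ranking, so it can still win if its model score is much better.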


(jean.senellart) #7

Hello, for information: I am experimenting with length normalization and coverage penalty in beam search, as described in https://arxiv.org/pdf/1609.08144.pdf. The latter should reduce the effect. I will post some results in this thread when I am done.
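
For readers who have not seen the paper, the rescoring it describes is s(Y, X) = log P(Y|X) / lp(Y) + cp(X; Y), with lp(Y) = (5 + |Y|)^alpha / (5 + 1)^alpha and cp(X; Y) = beta * sum over source words i of log(min(sum over target words j of p_ij, 1.0)). Below is a minimal Python sketch of that formula. It is not the OpenNMT implementation; the function names, the default `alpha` and `beta` values, and the `eps` clamp (added so that a source word with zero attention does not produce log(0)) are assumptions of this sketch.

```python
import math


def length_penalty(length, alpha=0.6):
    """lp(Y) = ((5 + |Y|) / (5 + 1)) ** alpha."""
    return ((5.0 + length) / 6.0) ** alpha


def coverage_penalty(attention, beta=0.2, eps=1e-10):
    """cp(X; Y) = beta * sum_i log(min(sum_j p_ij, 1.0)).

    attention[j][i] is the attention weight of target word j on source word i.
    """
    src_len = len(attention[0])
    total = 0.0
    for i in range(src_len):
        covered = sum(step[i] for step in attention)
        # eps is a guard added in this sketch, not part of the paper's formula.
        total += math.log(max(min(covered, 1.0), eps))
    return beta * total


def rescore(log_prob, attention, alpha=0.6, beta=0.2):
    """s(Y, X) = log P(Y|X) / lp(Y) + cp(X; Y)."""
    return log_prob / length_penalty(len(attention), alpha) + coverage_penalty(attention, beta)


# Hypothetical attention matrices: 2 target words over 2 source words.
balanced = [[0.9, 0.1], [0.1, 0.9]]    # each source word is covered once
repeating = [[0.9, 0.1], [0.9, 0.1]]   # the first source word is attended twice
print(rescore(-2.0, balanced), rescore(-2.0, repeating))
```

A hypothesis that keeps attending to the same source word leaves other source words uncovered, so its coverage term is more negative, which is why this penalty should discourage repetition.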

Also, note that the match between your training data and your translation requests is important. In our first release, we had used only full sentences to train our models, which taught the models that every translation should be a complete sentence, and they generated a lot of repetition when translating sentence fragments. Just introducing aligned sentence fragments into the training data reduced this issue a lot.


(Maplewizard) #8

@jean.senellart Thanks for your reply. Is the aligned fragment the same as the “alignment model” mentioned in your work? As I understand it, it replaces unknown words with the aligned ones at prediction time. Is this the method that alleviates the repetition problem?