How to deal with repetition?

maplewizard · February 8, 2017, 3:00am

I found that the output always contains sentences that just repeat one or a few words over and over. Is there a parameter that can prevent this problem?

guillaumekln · February 8, 2017, 9:07am

Currently, there is no mechanism to prevent this. It is usually mitigated by using more data and bigger models.

However, we now support filters over hypotheses during decoding. It should be quite easy to add a filter to ignore sequences with too many repetitions.

For reference:

https://github.com/OpenNMT/OpenNMT/blob/master/onmt/translate/DecoderAdvancer.lua#L144-L160

      * `scores` - a 2D tensor of size `(batchSize * beamSize, numTokens)`.

    ]]
    function DecoderAdvancer:expand(beam)
      local state = beam:getState()
      local decOut = state[2]
      local out = self.decoder.generator:forward(decOut)
      local features = {}
      for j = 2, #out do
        local _, best = out[j]:max(2)
        features[j - 1] = best:view(-1)
      end
      state[5] = features
      local scores = out[1]

      if self.lmModel then
        local lmOut = self.lmModel.generator:forward(state[10])

maplewizard · February 8, 2017, 9:18am

Thanks very much for your help. Do you mean that the filter will be turned on automatically when I update to the current version? Or should I set some parameter of the program?

guillaumekln · February 8, 2017, 9:21am

The filter you care about (ignoring sentences with too many repetitions) is not yet implemented. But we could easily add it in the future. Stay tuned!
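
For illustration only, here is a minimal sketch of what such a repetition filter could look like. It is plain Python rather than OpenNMT code, and the names `max_ngram_repeats` and `keep_hypothesis` as well as the default thresholds are hypothetical. It assumes a hypothesis is available as a list of tokens and rejects it when any n-gram occurs too often.

```python
from collections import Counter
from typing import Sequence


def max_ngram_repeats(tokens: Sequence[str], n: int = 1) -> int:
    """Return the highest count of any single n-gram in `tokens`."""
    if len(tokens) < n:
        return 0
    ngrams = (tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return max(Counter(ngrams).values())


def keep_hypothesis(tokens: Sequence[str], n: int = 2, max_repeats: int = 2) -> bool:
    """Hypothetical filter: keep a hypothesis only if no n-gram
    occurs more than `max_repeats` times."""
    return max_ngram_repeats(tokens, n) <= max_repeats


if __name__ == "__main__":
    hypotheses = [
        "the cat sat on the mat".split(),
        "very very very very good".split(),
    ]
    # Filter on single words (n=1), allowing each word at most twice.
    kept = [h for h in hypotheses if keep_hypothesis(h, n=1, max_repeats=2)]
    print(kept)
```

Whether to count single words or longer n-grams, and how many repeats to tolerate, would need tuning: function words such as "the" legitimately repeat in long sentences.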

maplewizard · February 8, 2017, 9:24am

Are there any GitHub issues related to this problem? I would like to watch the issue so that I know as soon as the feature is finished.

maplewizard · February 8, 2017, 9:41am

Is it possible to impose a penalty on repeated words? I would guess that a penalty works better than simply filtering hypotheses out.
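
To make the idea concrete, here is a rough sketch of such a penalty. It is not something OpenNMT provides here: the function name `penalize_repeats`, the dictionary representation of the scores, and the `penalty` value are all assumptions. At each decoding step, the log-probability of a candidate word is lowered once for every time that word already appears in the hypothesis.

```python
from collections import Counter
from typing import Dict, List


def penalize_repeats(
    log_probs: Dict[str, float],
    history: List[str],
    penalty: float = 1.0,
) -> Dict[str, float]:
    """Hypothetical soft penalty: subtract `penalty` from a candidate's
    log-probability for each earlier occurrence of it in `history`."""
    counts = Counter(history)
    return {tok: lp - penalty * counts[tok] for tok, lp in log_probs.items()}


if __name__ == "__main__":
    history = ["the", "the"]
    step_scores = {"the": -0.1, "cat": -0.5}
    adjusted = penalize_repeats(step_scores, history, penalty=1.0)
    # Pick the best candidate after the penalty has been applied.
    print(max(adjusted, key=adjusted.get))
```

Unlike a hard filter, this only lowers the score, so a repeated word can still win when the model is confident enough.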

jean.senellart · February 8, 2017, 10:20am

Hello, for information: I am experimenting with length normalization and coverage penalty in beam search, as described in https://arxiv.org/pdf/1609.08144.pdf. The latter should reduce the repetition. I will post some results in this thread when I am done.
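
For readers who have not opened the paper, here is a small sketch of the two terms it defines: the length penalty lp(Y) = ((5 + |Y|) / 6)^α and the coverage penalty cp(X; Y) = β · Σ_i log(min(Σ_j p_ij, 1)), combined as log P(Y|X) / lp(Y) + cp(X; Y). This is plain Python, not the OpenNMT implementation. The function names, the `alpha` and `beta` defaults, and the small `eps` clamp that avoids log(0) are assumptions of this sketch.

```python
import math
from typing import Sequence


def length_penalty(length: int, alpha: float = 0.6) -> float:
    """lp(Y) = ((5 + |Y|) / 6) ** alpha; `alpha` default is a placeholder."""
    return ((5.0 + length) / 6.0) ** alpha


def coverage_penalty(
    attention: Sequence[Sequence[float]],
    beta: float = 0.2,
    eps: float = 1e-10,
) -> float:
    """cp(X; Y) = beta * sum_i log(min(sum_j p_ij, 1)).

    attention[j][i] is the attention weight of target step j on source
    word i. `eps` is an assumption of this sketch to avoid log(0).
    """
    num_source = len(attention[0])
    total = 0.0
    for i in range(num_source):
        covered = sum(step[i] for step in attention)
        total += math.log(max(min(covered, 1.0), eps))
    return beta * total


def rescore(
    log_prob: float,
    attention: Sequence[Sequence[float]],
    alpha: float = 0.6,
    beta: float = 0.2,
) -> float:
    """Score of a finished hypothesis: log P(Y|X) / lp(Y) + cp(X; Y)."""
    return log_prob / length_penalty(len(attention), alpha) + coverage_penalty(attention, beta)


if __name__ == "__main__":
    balanced = [[0.5, 0.5], [0.5, 0.5]]   # both source words covered
    stuck = [[1.0, 0.0], [1.0, 0.0]]      # attention stuck on one word
    print(rescore(-2.0, balanced), rescore(-2.0, stuck))
```

A hypothesis that keeps attending to the same source word, which is typical of repeated output, leaves other source words uncovered and is scored lower.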

Also, note that the match between your training data and your translation requests is important. In our first release, we had only used full sentences to train our models, which taught the models that every translation should be a complete sentence, and they generated a lot of repetition when translating sentence fragments. Just introducing aligned sentence fragments into the training data reduced this issue a lot.

maplewizard · February 9, 2017, 6:28am

@jean.senellart Thanks for your reply. Are aligned fragments the same as the “alignment model” mentioned in your work? As I understand it, that model replaces unknown words with their aligned source words at prediction time. Is this the method that alleviates the repetition problem?

GokulNC · September 15, 2021, 3:06am

Came across this interesting paper: