I am using the OpenNMT Transformer model for a summarization task, and I have modified the multi_headed_attn.py module for my research: I just added one more linear layer to it. A minimal sketch of the kind of change I made is below.
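(This is a simplified sketch, not the actual OpenNMT-py module; the real multi_headed_attn.py has extra options for masking modes, caching, and relative positions. The class and attribute names here, such as `extra_linear`, are just illustrative.)

```python
import torch
import torch.nn as nn

class MultiHeadedAttentionWithExtraLinear(nn.Module):
    """Simplified multi-head attention with one added linear layer
    applied after the usual output projection (sketch only)."""

    def __init__(self, head_count, model_dim, dropout=0.1):
        super().__init__()
        assert model_dim % head_count == 0
        self.dim_per_head = model_dim // head_count
        self.head_count = head_count
        self.linear_query = nn.Linear(model_dim, model_dim)
        self.linear_keys = nn.Linear(model_dim, model_dim)
        self.linear_values = nn.Linear(model_dim, model_dim)
        self.softmax = nn.Softmax(dim=-1)
        self.dropout = nn.Dropout(dropout)
        self.final_linear = nn.Linear(model_dim, model_dim)
        # My modification: one extra linear layer on the attention output.
        # (Hypothetical name; not part of the original OpenNMT-py code.)
        self.extra_linear = nn.Linear(model_dim, model_dim)

    def forward(self, key, value, query, mask=None):
        batch_size = key.size(0)

        def shape(x):
            # [batch, len, dim] -> [batch, heads, len, dim_per_head]
            return x.view(batch_size, -1, self.head_count,
                          self.dim_per_head).transpose(1, 2)

        def unshape(x):
            # [batch, heads, len, dim_per_head] -> [batch, len, dim]
            return x.transpose(1, 2).contiguous().view(
                batch_size, -1, self.head_count * self.dim_per_head)

        query = shape(self.linear_query(query))
        key = shape(self.linear_keys(key))
        value = shape(self.linear_values(value))

        # Scaled dot-product attention.
        scores = torch.matmul(query, key.transpose(-2, -1)) \
            / self.dim_per_head ** 0.5
        if mask is not None:
            scores = scores.masked_fill(mask.unsqueeze(1), -1e18)
        attn = self.dropout(self.softmax(scores))
        context = unshape(torch.matmul(attn, value))

        output = self.final_linear(context)
        output = self.extra_linear(output)  # the added layer
        return output
```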
The model trains and predicts normally. However, it predicts some empty lines: most of the outputs are fine, but a few are empty. The inputs are all normal sentences, none of them empty.
I found a similar problem reported, but it has no answer ("Transformer model is generating empty lines, when using Sentencepiece Model"). It seems that poster did not modify the OpenNMT modules but still faced this problem.
Does anyone have any idea why this happens?