Empty lines in validation set / training fails / LSTM but not Transformer

dimitarsh1 · March 25, 2019, 4:07pm

Hello all,

I have trained several Transformer models on Europarl data.
When I train an LSTM model on the same data, it fails at the validation step with the following error:

RuntimeError: Length of all samples has to be greater than 0, but found an element in 'lengths' that is <= 0

According to this post: https://github.com/OpenNMT/OpenNMT-py/issues/1342 it is due to the fact that some sentences in the validation set are empty. However, the validation set is exactly the same one I used with the Transformer, which did not fail.

Any idea how to mitigate this issue? Could the -filter_valid option of preprocess.py be a suitable way to solve the problem?

Thank you,
Kind regards,
Dimitar

guillaumekln · March 28, 2019, 4:34pm

Hi,

Did you try it?

dimitarsh1 · March 31, 2019, 11:33am

Yes,

Indeed, that helps. With -filter_valid the empty lines are filtered out and the error no longer occurs.
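In case someone would rather clean the files themselves before preprocessing, here is a minimal sketch of the same idea in plain Python. It is not how OpenNMT-py implements -filter_valid; the function name filter_empty_pairs and the sample sentences are made up for illustration. It assumes the source and target sides are line-aligned and drops a pair when either side is empty or whitespace-only, so the two sides stay aligned.

```python
from typing import Iterable, List, Tuple


def filter_empty_pairs(
    src_lines: Iterable[str], tgt_lines: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Keep only sentence pairs where both sides contain at least one token.

    Hypothetical helper, not part of OpenNMT-py.
    """
    kept_src, kept_tgt = [], []
    for src, tgt in zip(src_lines, tgt_lines):
        # str.split() with no arguments returns [] for empty or
        # whitespace-only lines, so such pairs are skipped.
        if src.split() and tgt.split():
            kept_src.append(src.rstrip("\n"))
            kept_tgt.append(tgt.rstrip("\n"))
    return kept_src, kept_tgt


# Made-up sample data: pairs 2 to 4 each have an empty side.
src = ["Hello world\n", "\n", "Good morning\n", "   \n"]
tgt = ["Hallo Welt\n", "Leer\n", "\n", "Etwas\n"]

clean_src, clean_tgt = filter_empty_pairs(src, tgt)
print(clean_src)
print(clean_tgt)
```

With real data you would read the two validation files line by line, pass them to the function, and write the kept lines back out to new files.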

It was a redundant post I guess, sorry about that.

Cheers,
Dimitar