Paragraph-based training with LSTM Network

tuankstn · November 14, 2020, 11:22pm

I have pairs of source and target documents that I want to use to build a neural machine translation model with an LSTM network.

However, in most pairs the number of sentences in the source differs from the number in the target, although the number of paragraphs is the same.

If I use sentence-based training, the preprocessing step of aligning source and target sentences will be time-consuming.

Is it possible to do paragraph-based training with an LSTM?
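
To illustrate what I mean by paragraph-based pairs, here is a minimal sketch. It assumes that paragraphs are separated by blank lines and that the i-th source paragraph corresponds to the i-th target paragraph; the helper names `split_paragraphs` and `paragraph_pairs` are hypothetical, not from any library.

```python
import re


def split_paragraphs(text):
    """Split a document into paragraphs.

    Assumption: paragraphs are separated by one or more blank lines.
    """
    parts = re.split(r"\n\s*\n", text.strip())
    # Collapse internal line breaks and extra spaces inside each paragraph.
    return [" ".join(p.split()) for p in parts if p.strip()]


def paragraph_pairs(source_doc, target_doc):
    """Pair source and target paragraphs by position.

    Assumption: the i-th source paragraph translates to the i-th
    target paragraph, so no sentence alignment is needed.
    """
    src = split_paragraphs(source_doc)
    tgt = split_paragraphs(target_doc)
    if len(src) != len(tgt):
        raise ValueError(
            f"paragraph count mismatch: {len(src)} source vs {len(tgt)} target"
        )
    return list(zip(src, tgt))


# Example: two paragraphs each, with different sentence counts per paragraph.
source = "A one. A two.\n\nB one."
target = "X.\n\nY one. Y two."
pairs = paragraph_pairs(source, target)
```

Each pair would then be treated as a single training example, which is a longer sequence than a sentence pair would be.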

Thank you.