Effect of disabling bucketing


What would be the effect of disabling bucketing?

I guess it would make training slower, since every batch would contain more padding tokens.

But would it lower performance, in terms of BLEU score for example?

Also, am I correct in assuming that each sentence will be padded until it reaches the max length of its batch?

Answer to this last question:

After testing: each sentence is padded to the max length of its batch. The padding is also done independently for source and target.

(This is about opennmt.utils.data.training_pipeline.)
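To make the padding cost concrete, here is a minimal pure-Python sketch. It does not use OpenNMT-tf at all: `pad_batch`, `make_batches` and `padding_ratio` are hypothetical helpers, the sentence lengths are random, and sorting by length is only a crude stand-in for real bucketing. It just compares how much of each batch is padding in the two cases.

```python
import random

def pad_batch(batch, pad=0):
    """Pad every sequence to the length of the longest one in the batch."""
    max_len = max(len(seq) for seq in batch)
    return [seq + [pad] * (max_len - len(seq)) for seq in batch]

def make_batches(seqs, batch_size):
    """Split a list of sequences into consecutive fixed-size batches."""
    return [seqs[i:i + batch_size] for i in range(0, len(seqs), batch_size)]

def padding_ratio(batches):
    """Fraction of all positions (after per-batch padding) that are padding."""
    total = sum(len(b) * max(len(s) for s in b) for b in batches)
    real = sum(len(s) for b in batches for s in b)
    return 1 - real / total

random.seed(0)
# Assumption: 1000 toy "sentences" of 1 to 50 tokens, token id 1 everywhere.
sentences = [[1] * random.randint(1, 50) for _ in range(1000)]

unbucketed = make_batches(sentences, 32)
# Assumption: sorting by length approximates grouping similar lengths together.
bucketed = make_batches(sorted(sentences, key=len), 32)

print("padding without bucketing:", padding_ratio(unbucketed))
print("padding with length grouping:", padding_ratio(bucketed))

# Source and target are padded independently, each to its own batch maximum.
source = pad_batch([[1, 2], [3, 4, 5, 6]])
target = pad_batch([[7, 8, 9], [10]])
```

In this toy setup the padded source batch has width 4 and the padded target batch has width 3, which matches the observation above that the two sides are padded separately.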



Yes, slower in terms of tokens per second.

Most likely not. There is a small difference in the loss calculation, which is averaged across fewer tokens, but the impact in the end is probably negligible.
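As a rough illustration of why padding itself should not change the result, here is a small sketch of a loss averaged over real tokens only. This is an assumption about the general technique (masking padded positions before averaging), not a copy of the OpenNMT-tf code, and `masked_mean_loss` is a hypothetical helper.

```python
def masked_mean_loss(token_losses, lengths):
    """Average per-token losses over real tokens only, skipping padded positions."""
    total = 0.0
    count = 0
    for losses, length in zip(token_losses, lengths):
        total += sum(losses[:length])  # positions past `length` are padding
        count += length
    return total / count

# Two sentences with 2 and 4 real tokens; the first one is padded to length 4.
# The values at the padded positions are arbitrary and are ignored.
batch_losses = [
    [0.5, 1.5, 9.0, 9.0],
    [1.0, 1.0, 2.0, 2.0],
]
lengths = [2, 4]

loss = masked_mean_loss(batch_losses, lengths)
print(loss)
```

Under this kind of normalization, extra padding only changes how many real tokens end up in each batch, and so how many tokens each average is taken over.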
