Use CuDNN RNN for faster training

(jean.senellart) #1

CuDNN RNN was in the initial codebase but was removed because it does not work with input feeding and could not easily be converted to a CPU model.

Reintroduce the feature in the encoder, while keeping serialization to a CPU model possible and retaining support in CTranslate.
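To illustrate why the encoder is the natural place for this, here is a minimal toy sketch in plain Python (not the project's actual code). It assumes a fused RNN call needs the whole input sequence up front, which holds for the encoder but not for a decoder with input feeding, where each step's input depends on the previous step's attention output. All names (`rnn_cell`, `run_fused`, `toy_attention`, `decode_with_input_feeding`) and the scalar weights are hypothetical, chosen only for illustration.

```python
import math


def rnn_cell(x, h, w_x=0.5, w_h=0.5):
    """Toy scalar RNN cell: h' = tanh(w_x * x + w_h * h)."""
    return math.tanh(w_x * x + w_h * h)


def run_fused(inputs, h0=0.0):
    """Stand-in for a fused whole-sequence RNN call.

    The key property being modeled: the full input sequence must be
    available before the call starts.
    """
    h, outputs = h0, []
    for x in inputs:
        h = rnn_cell(x, h)
        outputs.append(h)
    return outputs


def toy_attention(h, context):
    """Hypothetical attention: softmax-weighted average of encoder states."""
    scores = [math.exp(h * c) for c in context]
    total = sum(scores)
    return sum(s / total * c for s, c in zip(scores, context))


def decode_with_input_feeding(target_embeddings, context, h0=0.0):
    """Decoder loop with input feeding.

    The input at step t includes the attention output from step t-1, so the
    input sequence cannot be built in advance and handed to run_fused.
    """
    h, fed, outputs = h0, 0.0, []
    for emb in target_embeddings:
        x = emb + fed  # real models concatenate; a sum keeps the toy scalar
        h = rnn_cell(x, h)
        fed = toy_attention(h, context)
        outputs.append(fed)
    return outputs


# Encoder: every input is known up front, so one fused call is enough.
encoder_states = run_fused([0.1, 0.2, 0.3])

# Decoder: must be unrolled one step at a time because of the feedback.
decoder_outputs = decode_with_input_feeding([0.4, 0.5], encoder_states)
```

In this sketch the encoder can be replaced by any whole-sequence implementation without changing its results, while the decoder loop has a data dependency between steps that a single whole-sequence call cannot express.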

(srush) #2

Note, though, that this did not make much of a speed difference (see Adam’s email).