OpenNMT Forum

Transformer-XL With OpenNMT

Is there a way to use the Transformer-XL architecture for translation with OpenNMT? It seems like it would be a substantial improvement over the standard Transformer model because it can model longer-term dependencies, but I haven’t seen anyone implement it directly for machine translation.


NMT systems usually translate single sentences. In such cases the sequence-length limitation does not get in the way of encoding. For pretraining, seq2seq architectures like MASS are better suited.
Anyway, there are Transformer-XL implementations for PyTorch and TensorFlow.
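
To make the point about single sentences concrete, here is a rough sketch that checks how many sentences in a corpus fit inside a fixed context length. The `MAX_CONTEXT` value, the `fraction_within_context` helper, and the whitespace tokenization are all assumptions for illustration; a real setup would count subword tokens and use the model's actual limit.

```python
# Hypothetical fixed context length; a real model has its own limit.
MAX_CONTEXT = 512


def fraction_within_context(sentences, max_context=MAX_CONTEXT):
    """Return the share of sentences whose whitespace token count fits in max_context."""
    if not sentences:
        return 0.0
    # Whitespace splitting is a stand-in for the model's real tokenizer.
    fits = sum(1 for s in sentences if len(s.split()) <= max_context)
    return fits / len(sentences)


# Toy corpus; replace with the sentences from your own training data.
corpus = [
    "Is there a way to use Transformer-XL for translation ?",
    "Usually NMT systems only translate single sentences .",
]

print(fraction_within_context(corpus))
```

If nearly all sentences fit, the longer context of Transformer-XL is unlikely to help much for sentence-level translation; it would matter more for document-level input.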