Seq2seq pretraining

This paper looks relatively simple to implement and claims strong machine translation (MT) results:

Unsupervised Pretraining for Sequence to Sequence Learning