It might be good to pin this thread, then amend it and check items off as needed.
This list is mainly based on a comparison with the state of the art (SOTA) in the community.
- GNMT attention (cf. https://github.com/tensorflow/nmt), which seems to bring +1 BLEU
- RNN “Deep Architecture” from Rico Sennrich et al.: https://arxiv.org/pdf/1707.07631.pdf
- Label smoothing
- Multi-GPU training that works and brings a linear speed-up (cf. https://github.com/OpenNMT/OpenNMT-tf/pull/54); requires in-graph replication
- CUDA 9 compatibility ===> DONE
- Token batch / long-sequence OOM prevention / automatic memory management (batch length) ===> DONE
- Confidence score on translations
- Transformer … but OpenNMT-tf will do the job ===> onmt-tf DONE
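
For anyone wondering what the token batch item above means in practice, here is a rough sketch. It is not the OpenNMT implementation: the function name `token_batches` and the `max_tokens` parameter are made up for illustration, and sentences longer than the budget are simply skipped, which is only one possible policy.

```python
from typing import Iterator, List, Sequence


def token_batches(
    sentences: Sequence[List[str]], max_tokens: int
) -> Iterator[List[List[str]]]:
    """Yield batches whose padded size (longest sentence * batch size)
    never exceeds max_tokens. Hypothetical helper, for illustration only."""
    batch: List[List[str]] = []
    longest = 0
    # Sorting by length keeps similar lengths together and limits padding.
    for sent in sorted(sentences, key=len):
        if len(sent) > max_tokens:
            # Assumption: drop sentences that cannot fit in any batch.
            continue
        new_longest = max(longest, len(sent))
        # Adding this sentence would exceed the token budget: flush first.
        if batch and new_longest * (len(batch) + 1) > max_tokens:
            yield batch
            batch, new_longest = [], len(sent)
        batch.append(sent)
        longest = new_longest
    if batch:
        yield batch
```

With a fixed number of sentences per batch, one batch of very long sequences can blow up GPU memory; capping the padded token count instead keeps the memory footprint per batch roughly constant.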