I once tried training the Transformer with the scripts at https://github.com/OpenNMT/OpenNMT-tf/tree/master/scripts/wmt together with the training parameters from the OpenNMT-py FAQ section.
But I also noticed that the TensorFlow implementation at https://github.com/tensorflow/models/tree/master/official/transformer shares the vocab embedding matrix between the encoder and the decoder, which is also mentioned in the blog post “The Annotated Transformer”. OpenNMT-py does not seem to do this by default (??). Does anyone know the reason for the difference?
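To make clear what I mean by “sharing the vocab embedding matrix”, here is a minimal PyTorch sketch (not OpenNMT-py’s actual code, just my understanding): the encoder and decoder look up tokens in the same embedding table, which only makes sense when source and target use one joint vocabulary (e.g. joint BPE), and the output projection can be tied to the same matrix as in “Attention Is All You Need” / The Annotated Transformer.

```python
import torch.nn as nn

# Hypothetical sizes, just for illustration.
vocab_size, d_model = 32000, 512

# One embedding table shared by both sides of the model.
shared_embedding = nn.Embedding(vocab_size, d_model)

encoder_embedding = shared_embedding   # encoder uses the shared parameters
decoder_embedding = shared_embedding   # decoder uses the same parameters

# Optionally, the output projection ("generator") is tied to the same
# weight matrix, giving three-way weight sharing.
generator = nn.Linear(d_model, vocab_size, bias=False)
generator.weight = shared_embedding.weight
```

If I remember correctly, OpenNMT-py can do something like this via the -share_embeddings / -share_decoder_embeddings options (with -share_vocab at preprocessing), but it does not appear to be enabled in the FAQ recipe, so I am not sure whether the two implementations end up equivalent.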