Hi, some recent work uses simplified models built only on attention (e.g. Attention Is All You Need, https://arxiv.org/pdf/1706.03762.pdf ), with good results and faster training times. Does OpenNMT plan to support these sorts of models?
We have no plan to support these models in OpenNMT (Lua version).
However, they are implemented in the PyTorch version (OpenNMT-py). Take a look at its command line options to see how to use such models.