I am trying to train a transformer model for Italian-English translation using a parallel corpus. I am training on ~1M sentences and validating on ~50K sentences. I used the hyperparameters provided in the FAQ of the documentation: http://opennmt.net/OpenNMT-py/FAQ.html
I have trained it for 1 million steps, but it still has not converged. Right now the validation accuracy is around 50.
I am trying to figure out how to make it converge faster. Any suggestions are appreciated. Thanks for your help!
I am attaching my training curve below: