This is a general question that is not necessarily specific to OpenNMT, but I would highly appreciate it if experts here could share their experience. I am training a bidirectional LSTM language model for text generation, with 4 layers in addition to the embedding layer. I have tried several configuration options; what helped increase the training accuracy the most was reducing the maximum sequence length, and possibly reducing or removing regularization and dropout.
Now, the training accuracy is improving very slowly after reaching 25%. My question is: for translation tasks, we usually expect something above 70%, but what is the usually expected accuracy for a language model?
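For context on the comparison, here is a minimal sketch of how token-level accuracy and perplexity (the metric more commonly reported for language models) can be computed from the same next-token predictions. The probability values below are toy numbers for illustration only, not from any real model:

```python
import math

# Toy next-token distributions over a 4-word vocabulary, plus the true
# next tokens. These numbers are illustrative, not from a trained model.
probs = [
    [0.70, 0.10, 0.10, 0.10],
    [0.25, 0.40, 0.20, 0.15],
    [0.30, 0.30, 0.25, 0.15],
    [0.05, 0.15, 0.60, 0.20],
]
targets = [0, 2, 1, 2]

# Token-level accuracy: fraction of steps where the argmax matches the target.
correct = sum(
    1 for p, t in zip(probs, targets)
    if max(range(len(p)), key=p.__getitem__) == t
)
accuracy = correct / len(targets)

# Perplexity: exp of the mean negative log-likelihood of the true tokens.
nll = -sum(math.log(p[t]) for p, t in zip(probs, targets)) / len(targets)
perplexity = math.exp(nll)

print(f"accuracy   = {accuracy:.2f}")    # → 0.50
print(f"perplexity = {perplexity:.2f}")  # → 2.51
```

The point of the sketch: accuracy counts only exact argmax hits, so for open-ended language modeling over a large vocabulary (where many continuations are plausible) it is naturally much lower than for translation, where the source sentence strongly constrains each target token.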