Language Model Accuracy

ymoslem · August 14, 2020, 3:05pm

Hi Colleagues!

This is a general question that is not necessarily related to OpenNMT, but I would highly appreciate it if experts here could share their experience. I am training a bidirectional LSTM language model for text generation, with 4 layers in addition to the embedding layer. I tried several options; what helped most to increase the training accuracy was reducing the maximum sequence length, and maybe reducing or removing regularization and dropout.

Now, the training accuracy is improving very slowly after reaching 25%. My question is: for translation tasks, we usually expect something above 70%; what accuracy is usually expected for a Language Model?

Many thanks!
Yasmin

ymoslem · August 21, 2020, 7:23am

So after some trials, the following helped a bit:
1- Enriching the data from News Commentary with a portion of MultiUN.
2- Reducing the maximum sequence length.
3- Continuing training with a reduced value of the Learning Rate.
4- Running tests on the real problem. Accuracy for a Language Model measures how often the model generates the exact next word, and this is usually not the purpose of creating a Language Model, since other variations might be acceptable as well.
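
To illustrate the last point, here is a minimal sketch (not the code I used for training) that compares exact next-word accuracy with two more forgiving measures: top-k accuracy, which counts a prediction as correct if the actual word is among the k most probable candidates, and perplexity, a commonly used Language Model metric computed as the exponential of the average negative log-probability of the actual words. The function name `next_word_metrics`, the dictionary-based input format, and the toy probabilities are all hypothetical, chosen only for illustration.

```python
import math

def next_word_metrics(predictions, targets, k=5):
    """Hypothetical helper, for illustration only.

    predictions: list of dicts mapping candidate word -> predicted probability
    targets:     list of the actual next words, one per prediction
    """
    top1_hits = 0
    topk_hits = 0
    total_neg_log_prob = 0.0
    for probs, target in zip(predictions, targets):
        # Rank candidate words from most to least probable.
        ranked = sorted(probs, key=probs.get, reverse=True)
        if ranked[0] == target:
            top1_hits += 1  # exact next-word match
        if target in ranked[:k]:
            topk_hits += 1  # an acceptable variation was ranked highly
        # Small floor avoids log(0) when the target word is missing.
        total_neg_log_prob += -math.log(max(probs.get(target, 0.0), 1e-12))
    n = len(targets)
    return {
        "top1_accuracy": top1_hits / n,
        "topk_accuracy": topk_hits / n,
        "perplexity": math.exp(total_neg_log_prob / n),
    }

# Toy data (made up): the model's probabilities for four next-word positions.
predictions = [
    {"cat": 0.5, "dog": 0.3, "bird": 0.2},
    {"sat": 0.4, "slept": 0.35, "ran": 0.25},
    {"mat": 0.2, "sofa": 0.45, "floor": 0.35},
    {"quickly": 0.1, "quietly": 0.6, "there": 0.3},
]
targets = ["cat", "slept", "floor", "quietly"]

metrics = next_word_metrics(predictions, targets, k=2)
print(metrics)
```

On this toy data the exact-match accuracy is only 50%, while the top-2 accuracy is 100%, because in the two "missed" positions the actual word was the model's second choice. This is why a low next-word accuracy alone does not necessarily mean the model is bad at generating text.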