Low accuracy in the top translation predictions


I am getting a good translation accuracy of about 82%, but only when I use n-best 100 with beam size 100; the top-1, top-2, …, top-5 accuracies are very low.
My questions are:

  1. Why does this happen?
  2. How can I push the correct translation to the top of my predictions? Any suggestions would be appreciated.
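To make the gap concrete, here is a minimal sketch of how top-k accuracy over an n-best list can be computed. The function name and the toy data are illustrative, not part of any toolkit; in practice the candidate lists would come from the decoder's n-best output.

```python
# Sketch: computing top-k accuracy from n-best lists, assuming each source
# sentence has a list of candidate translations ordered by model score.
# The toy data below is illustrative only.

def top_k_accuracy(nbest_lists, references, k):
    """Fraction of sentences whose reference appears among the first k candidates."""
    hits = 0
    for candidates, reference in zip(nbest_lists, references):
        if reference in candidates[:k]:
            hits += 1
    return hits / len(references)

# Toy example: 3 sentences, n-best lists of 4 candidates each.
nbest = [
    ["guten tag", "hallo", "guten morgen", "tag"],
    ["danke", "danke schoen", "vielen dank", "dank"],
    ["ja", "nein", "jawohl", "doch"],
]
refs = ["hallo", "vielen dank", "nein"]

print(top_k_accuracy(nbest, refs, 1))  # 0.0 — no reference is ranked first
print(top_k_accuracy(nbest, refs, 3))  # 1.0 — all references are in the top 3
```

This is exactly the pattern in the question: a model can place the right answer somewhere in a large n-best list (high top-100 accuracy) while rarely ranking it first.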


I am not sure if you mean 100 or 10, but if it is that big, you are most likely working with too little data, which resulted in a model that is too small.

Data size and quality are crucial factors for NMT output quality.

It is safe to start with the recommended parameters.

You can find many public datasets on OPUS.

You can refer to this tutorial and then build on it.

All the best,

Thank you very much, Yasmin.
My training dataset size is 76,184 samples. I can’t add more data since it is a specific domain. Do you think that, because of the early_stopping criterion, the model is not learning enough? I am asking because most of the models in the literature did not use this criterion and were trained to the end of train_steps, at least 250000.

This is a small dataset. In this case, domain adaptation is better achieved via fine-tuning of a larger baseline NMT model. Here are a few notes:

  1. Among the famous approaches to domain adaptation is Mixed Fine-Tuning (Chu et al., 2017). I tried to explain how to implement it with OpenNMT-py in this article.

  2. Data augmentation techniques like Back Translation of monolingual target data can also be utilized to increase the size of the in-domain data, as in Improving Neural Machine Translation Models with Monolingual Data (Sennrich et al., 2016) and Tagged Back-Translation (Caswell et al., 2019). This article elaborates on some technical details.

  3. Finally, it turns out that we can even generate purely synthetic data (both the source and target) based on the small in-domain data available, as in Domain-Specific Text Generation for Machine Translation (Moslem et al., 2022). However, you should first experiment with the two previous approaches, which can be applied independently or together.
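To illustrate the data-mixing step of Mixed Fine-Tuning (Chu et al., 2017): the small in-domain corpus is oversampled so that it roughly balances the generic corpus, and the two are shuffled together before continuing training of the baseline model. This is a minimal sketch with placeholder sentence pairs; the function name and sizes are my own.

```python
import random

# Sketch of the data-mixing step in Mixed Fine-Tuning (Chu et al., 2017):
# oversample the small in-domain corpus to roughly the size of the generic
# corpus, concatenate, and shuffle. Corpus contents are placeholders.

def mix_corpora(in_domain, generic, seed=1):
    """Oversample in_domain to approximately the size of generic, then shuffle."""
    factor = max(1, len(generic) // len(in_domain))
    mixed = in_domain * factor + generic
    random.Random(seed).shuffle(mixed)
    return mixed

in_domain = [("src_d1", "tgt_d1"), ("src_d2", "tgt_d2")]          # e.g. ~76k pairs
generic = [(f"src_g{i}", f"tgt_g{i}") for i in range(8)]          # e.g. millions

mixed = mix_corpora(in_domain, generic)
print(len(mixed))  # 2 * 4 + 8 = 16 sentence pairs
```

In practice you would not materialize the mix yourself: NMT toolkits usually let you assign sampling weights per corpus, which achieves the same effect at training time.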
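For Tagged Back-Translation (Caswell et al., 2019), the key trick is prepending a special token to the source side of back-translated pairs so the model can distinguish synthetic from genuine parallel data. A minimal sketch, where the `<BT>` token string and the sentence pairs are arbitrary choices of mine:

```python
# Sketch of Tagged Back-Translation (Caswell et al., 2019): prepend a special
# token (here "<BT>", an arbitrary choice) to the *source* side of synthetic
# pairs produced by back-translating monolingual target data.

BT_TAG = "<BT>"

def tag_back_translated(pairs):
    """Prepend the tag to each synthetic source sentence."""
    return [(f"{BT_TAG} {src}", tgt) for src, tgt in pairs]

synthetic = [("machine translated source", "real monolingual target")]
print(tag_back_translated(synthetic))
# [('<BT> machine translated source', 'real monolingual target')]
```

The tag must also be kept out of (or added consistently to) the vocabulary and never fed at inference time on genuine input.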

Well, early stopping simply means the model is no longer learning (enough). However, if you increase its value or remove it, you can compare the results of two checkpoints: the one where early stopping would have triggered, and one saved after further steps.
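For reference, a minimal training-config fragment showing where these knobs live, assuming OpenNMT-py 2.x option names (the values are illustrative, not recommendations):

```yaml
# Fragment of an OpenNMT-py training config (illustrative values).
train_steps: 250000            # upper bound on training steps
valid_steps: 5000              # validate every 5000 steps
save_checkpoint_steps: 5000    # keep checkpoints to compare later
early_stopping: 4              # stop after 4 validations without improvement
early_stopping_criteria: ppl   # metric watched for improvement
```

Raising `early_stopping` (or removing it) lets training run longer, so you can evaluate both the checkpoint around where early stopping would have fired and a later one.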

Kind regards,