I'm new to this software. I use OpenNMT-py for machine translation tasks. I train on different datasets of similar size, using the same model parameters.
For the model trained on one dataset, the xent (cross-entropy) value can be reduced to 0, but for the model trained on the other dataset it can only be reduced to 0.4. How can I adjust the parameters so that the second model achieves results similar to the first?
Can the learning rate of the second model be reduced from 2 to 1?
Thank you!
Dear Willow,
I am not sure I am following here; just a few things to double-check:
- Please make sure you are using the latest version of OpenNMT-py. Most likely, you will end up working with a YAML configuration file.
- Adjusting parameters should follow a paper (at least at the beginning). Most likely, this would be "Attention Is All You Need" (see Table 4, for example); a minimal config sketch follows this list.
- Adjusting parameters would not necessarily give better results. Data quantity, quality, and distribution are critical factors.
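For reference, here is a minimal sketch of what the model and optimization portion of an OpenNMT-py YAML configuration could look like with Transformer-style settings in the spirit of that paper. The values below are only illustrative assumptions, not a recommendation for your data:

```yaml
# config.yaml -- illustrative values only; tune for your own data
save_model: run/model

# Transformer "base"-style architecture
encoder_type: transformer
decoder_type: transformer
layers: 6
heads: 8
hidden_size: 512        # called rnn_size in older OpenNMT-py 2.x versions
word_vec_size: 512
transformer_ff: 2048
dropout: 0.1
label_smoothing: 0.1

# Adam with Noam decay; learning_rate is the scale factor (often 2.0 in Transformer recipes)
optim: adam
adam_beta2: 0.998
decay_method: noam
learning_rate: 2.0
warmup_steps: 8000

train_steps: 100000
valid_steps: 5000
batch_type: tokens
batch_size: 4096
```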
Maybe you already know this, but just a reminder. We usually split our dataset into 3 portions:
- training dataset - used for training the model;
- development dataset - used to run regular validations during the training to help improve the model parameters; and
- test dataset - a holdout dataset used after the model finishes training to finally evaluate the model on unseen data.
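In the OpenNMT-py YAML configuration, the training and development sets map to the data section, while the test set is deliberately left out and only used after training. A minimal sketch, assuming illustrative file paths:

```yaml
# Illustrative paths only -- replace with your own files
src_vocab: run/vocab.src
tgt_vocab: run/vocab.tgt

data:
    corpus_1:        # training dataset
        path_src: data/train.src
        path_tgt: data/train.tgt
    valid:           # development dataset, used for validation during training
        path_src: data/dev.src
        path_tgt: data/dev.tgt

# The test dataset (e.g. data/test.src and data/test.tgt) is not listed here;
# it is only translated and scored once training has finished.
```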
Then, we use sacreBLEU, COMET, or other evaluation metrics to compare the quality of MT output to the reference test dataset.
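As an illustration of that last step, the commands below translate a held-out test set and score it; the file and model names are assumptions, and the output may first need detokenization depending on your preprocessing:

```bash
# Translate the held-out test set with the trained model
onmt_translate -model run/model_step_100000.pt -src data/test.src -output pred.txt

# Score the output against the reference translations
sacrebleu data/test.tgt -i pred.txt -m bleu chrf
```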
I hope this helps.
Kind regards,
Yasmin
Dear Yasmin,
Thank you so much for the advice you gave me.
I use OpenNMT-py 0.4. I guess the reason is that the quality of the two datasets is different. First, I reduced the learning rate and increased the number of training steps, but the effect was not very noticeable. I will try some other approaches next!
Thank you very much for your reply.
Kind regards,
Willow