Parameter tuning for Italian-English translation

AndreaM · April 24, 2019, 8:53am

Hello, I would like to train a system to translate from Italian to English. At the moment I have a parallel corpus of 2,300,000 sentences drawn from several corpora (Tatoeba, Europarl, …). I am training on a CPU (1300 tokens/second) with the default parameters, and the accuracy stops increasing at 58% after 40000 steps. Do I need more sentences for training? Any tips? Since training takes a very long time, I would like to hear some suggestions so that I can try the best possible parameter settings. Thanks

guillaumekln · April 28, 2019, 8:46pm

Hi,

You should probably train a bigger model than the one used with the default parameters. See for example:

http://opennmt.net/OpenNMT-py/FAQ.html#how-do-i-use-the-transformer-model-do-you-support-multi-gpu

However, this will be somewhat difficult to achieve on a CPU.
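
To get a rough feel for why a CPU makes this hard, here is a minimal back-of-the-envelope sketch in Python. Only the 1300 tokens/second figure comes from this thread; the step count and tokens per step are hypothetical values chosen for illustration, not the settings of any particular recipe. The throughput was also measured with the default model, so a bigger model would very likely run slower and the estimate should be read as optimistic.

```python
def training_days(train_steps: int, tokens_per_step: int, tokens_per_second: float) -> float:
    """Estimate wall-clock days needed to process `train_steps` batches."""
    total_tokens = train_steps * tokens_per_step
    seconds = total_tokens / tokens_per_second
    return seconds / 86_400  # seconds in a day


# 1300 tokens/second is the CPU throughput reported in the thread.
# 100_000 steps and 4096 tokens per step are assumed, illustrative values.
cpu_days = training_days(train_steps=100_000, tokens_per_step=4096, tokens_per_second=1300)
print(f"Estimated training time on CPU: {cpu_days:.1f} days")
```

Plugging in the step count and batch size of the configuration you actually plan to run, together with the throughput you measure for that model, gives a more realistic figure.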

AndreaM · April 29, 2019, 1:14am

Perfect, thank you. I will post another question (more code-related) in a new topic. I hope you can help me again. Thanks.