Best arguments to train zh-en model

woolz · July 19, 2019, 7:40pm

Hello, I’m a newbie in the Neural Machine Translation world.

I have one question…

I want to train on a 1M parallel zh-en corpus, with the zh side tokenized with jiagu.

What is the recommended configuration?

Currently I’m using the default training arguments:
python train.py -data data/demo -save_model demo-model

Ravneet · July 19, 2019, 7:55pm

@woolz, with OpenNMT you can experiment with different configurations. Right now you are training a sequence-to-sequence model with a very minimal configuration: just a small RNN encoder-decoder with the default settings.
For better results, try the Transformer model. The FAQ section gives a training command with options that mimic the setup Google used to produce state-of-the-art results.
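
For reference, here is a rough sketch of what that Transformer command looks like, adapted to the `data/demo` and `demo-model` paths from the question. The hyperparameter values are as I recall them from the OpenNMT-py FAQ, and the single-GPU setting (`-world_size 1 -gpu_ranks 0`) is my assumption, so please check the FAQ for your installed version before relying on any of them. `TRAIN_ARGS` is just a helper variable for this example. The snippet only prints the command; remove the `echo` to launch training.

```shell
# Sketch only: flag values recalled from the OpenNMT-py FAQ, verify against your version.
# TRAIN_ARGS is a helper variable for this example, not something OpenNMT defines.
TRAIN_ARGS="-data data/demo -save_model demo-model \
  -layers 6 -rnn_size 512 -word_vec_size 512 -transformer_ff 2048 -heads 8 \
  -encoder_type transformer -decoder_type transformer -position_encoding \
  -train_steps 200000 -max_generator_batches 2 -dropout 0.1 \
  -batch_size 4096 -batch_type tokens -normalization tokens -accum_count 2 \
  -optim adam -adam_beta2 0.998 -decay_method noam -warmup_steps 8000 -learning_rate 2 \
  -max_grad_norm 0 -param_init 0 -param_init_glorot \
  -label_smoothing 0.1 -valid_steps 10000 -save_checkpoint_steps 10000 \
  -world_size 1 -gpu_ranks 0"

# Dry run: print the full command. Remove 'echo' to actually start training.
# $TRAIN_ARGS is left unquoted on purpose so the shell splits it into separate arguments.
echo python train.py $TRAIN_ARGS
```

As far as I remember, the FAQ command targets a multi-GPU machine, so on a single GPU you may need to adjust `-batch_size` or `-accum_count` to fit memory and keep a similar effective batch size.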

Happy coding!