Low accuracy of Chinese-English model

LinyuZhang · May 28, 2020, 12:05pm

Hello everyone. I ran into a problem while training a Chinese-English model.

Dataset: I collected a dataset from UM-corpus, which contains 1 million parallel sentence pairs for training, 10K lines for validation, and 10K lines for testing. The Chinese side was tokenized with Jieba and Moses, and the English side was tokenized with Moses; words in both languages are separated by spaces. Furthermore, for both the Chinese and the English side, I used Moses to normalize the punctuation and removed sentences longer than 50 words, and I applied a truecase model to the English data. All of these steps were applied to the training, validation, and test sets.
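For readers following along, the length-filtering step described above can be sketched roughly as follows. This is only an illustration, not the Moses cleaning script itself: the function name `filter_parallel` and the `max_len` parameter are hypothetical, and dropping empty lines is an extra assumption that the post does not mention.

```python
def filter_parallel(src_lines, tgt_lines, max_len=50):
    """Keep only sentence pairs where both sides have 1..max_len tokens.

    Assumes both sides are already tokenized and space-separated,
    as described in the post.
    """
    kept_src, kept_tgt = [], []
    for src, tgt in zip(src_lines, tgt_lines):
        src_len = len(src.split())
        tgt_len = len(tgt.split())
        # Drop the whole pair if either side is empty or too long,
        # so the two files stay aligned line by line.
        if 1 <= src_len <= max_len and 1 <= tgt_len <= max_len:
            kept_src.append(src.strip())
            kept_tgt.append(tgt.strip())
    return kept_src, kept_tgt
```

The important detail is that the pair is removed as a unit; filtering the two files independently would break the line alignment of the parallel corpus.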

Training: The accuracy increased quickly at the beginning. However, after 10K steps it hovered around 50% and stopped going up. Here are some of the hyperparameters I used:
-batch_size 64
-valid_batch_size 16
-learning_rate 1
-learning_rate_decay 0.5
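To make the two learning-rate options above concrete, here is a minimal sketch of a generic step-wise decay schedule. It is an assumption about how such a schedule typically behaves, not a copy of the toolkit's implementation: the function name and the `start_decay_steps` and `decay_steps` values are hypothetical, since the post does not state them.

```python
def decayed_learning_rate(step, base_lr=1.0, decay=0.5,
                          start_decay_steps=50000, decay_steps=10000):
    """Return the learning rate at a given training step.

    The rate stays at base_lr until start_decay_steps, then is
    multiplied by `decay` once at that step and again every
    decay_steps afterwards (hypothetical values, see lead-in).
    """
    if step < start_decay_steps:
        return base_lr
    num_decays = (step - start_decay_steps) // decay_steps + 1
    return base_lr * decay ** num_decays


for step in (10000, 50000, 60000, 70000):
    print(step, decayed_learning_rate(step))
```

With a decay factor of 0.5 the rate halves at each decay point, so it is worth checking in the training log which rate was actually in effect at a given step.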

Could anyone help me and give me some advice? I would really appreciate it.

guillaumekln · June 18, 2020, 8:52am

Hi,

Accuracy is one thing but what results did you get with a more appropriate metric like BLEU?
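As a rough illustration of what BLEU measures, here is a simplified corpus-level version in plain Python. It assumes a single reference per sentence, whitespace tokenization, and no smoothing, and the function names are made up for this sketch. Scores from it are not comparable to published numbers; for real evaluation an established tool such as sacreBLEU is the safer choice.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified corpus BLEU (0-100): one reference, no smoothing."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Clipped matches: an n-gram counts at most as often
            # as it appears in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the 1..max_n gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty punishes output shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


print(corpus_bleu(["the cat sat on the mat"], ["the cat sat on a mat"]))
```

Unlike token-level accuracy during training, this compares the decoded output against the reference translations, which is why it is a better indicator of translation quality.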

You also want to look at training a Transformer model: https://opennmt.net/OpenNMT-py/FAQ.html#how-do-i-use-the-transformer-model