Hello everyone. I ran into a problem while training a Chinese-English model.
Dataset: I built my dataset from UM-Corpus. It contains 1 million parallel sentence pairs for training, 10K lines for validation, and 10K lines for testing. The Chinese side was tokenized with Jieba and Moses, and the English side was tokenized with Moses, so the words on both sides are separated by spaces. In addition, for both the Chinese and English sides, I used Moses to normalize punctuation and removed sentences longer than 50 words, and I applied a truecase model to the English side. All of these steps were applied to the training, validation, and test sets alike.
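To make the length-cleaning step concrete, here is a minimal Python sketch of what I mean by removing sentences longer than 50 words. It is not the actual Moses script, just a stand-in. The function name `filter_by_length` is made up for illustration, and it assumes that a pair is dropped as a whole when either side is too long, so the two files stay aligned line by line.

```python
def filter_by_length(src_lines, tgt_lines, max_len=50):
    """Keep only pairs where both sides have at most max_len space-separated tokens.

    Assumption: if either side is too long, the whole pair is dropped,
    so source and target stay aligned.
    """
    kept_src, kept_tgt = [], []
    for src, tgt in zip(src_lines, tgt_lines):
        # Text is already tokenized, so splitting on whitespace counts words.
        if len(src.split()) <= max_len and len(tgt.split()) <= max_len:
            kept_src.append(src.strip())
            kept_tgt.append(tgt.strip())
    return kept_src, kept_tgt


if __name__ == "__main__":
    zh = ["我 喜欢 机器 翻译 。", " ".join(["词"] * 51)]
    en = ["I like machine translation .", "short sentence ."]
    print(filter_by_length(zh, en))
```

In the small example at the bottom, the second pair is dropped because its Chinese side has 51 tokens, even though the English side is short.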
Training: The accuracy increased quickly at the beginning of training. However, after 10K steps it stopped going up and has been hovering around 50%. Here are some of the hyperparameters I used.
-batch_size 64
-valid_batch_size 16
-learning_rate 1
-learning_rate_decay 0.5
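For reference, this is how I understand the last two settings to interact. It is only a rough sketch that assumes the decay factor is multiplied into the learning rate each time decay is triggered. When decay actually triggers depends on the toolkit and its other options, and `decayed_learning_rate` is just a name made up for illustration.

```python
def decayed_learning_rate(initial_lr, decay, num_decays):
    """Learning rate after `num_decays` decay events.

    Assumption: each decay event multiplies the current rate by `decay`.
    """
    return initial_lr * decay ** num_decays


if __name__ == "__main__":
    # With -learning_rate 1 and -learning_rate_decay 0.5:
    for n in range(5):
        print(n, decayed_learning_rate(1.0, 0.5, n))
```

Under that assumption the rate halves at every decay event, so after four events it is already down to 0.0625.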
Could anyone help me and give me some advice? I would really appreciate it.