I am using OpenNMT-py to train a de-en translation model. I use the default model settings and trained with Adam. The data is the WMT15 data from the tutorial, and I am using half of it as a trial training run.
The train/validation perplexity for the first two epochs is listed below. It seems far from the result in the tutorial, where perplexity is already below 20 after the first epoch.
Epoch 1:
Train perplexity: 242.574
Train accuracy: 25.2783
Validation perplexity: 261.723
Validation accuracy: 24.7807

Epoch 2:
Train perplexity: 208.514
Train accuracy: 26.8968
Validation perplexity: 263.776
Validation accuracy: 24.923
So here is my setup:
I use the default settings except for the optimizer, where I use Adam with learning rate 0.001.
I am translating from German to English.
For the data, I first lowercase everything, then tokenize with NLTK's punkt-based tokenizer for German and the default word_tokenize for English.
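To be concrete, my preprocessing is essentially "lowercase, then tokenize". Here is a minimal sketch of that step; I actually use NLTK's tokenizers, but this stand-alone version uses a simple regex tokenizer so the example runs without the NLTK punkt data (the regex split is my simplification, not what NLTK does exactly):

```python
import re

def preprocess(line: str) -> list[str]:
    # Lowercase first, as in my pipeline.
    line = line.lower()
    # Split into word tokens and standalone punctuation marks.
    # This regex is a simplified stand-in for NLTK tokenization.
    return re.findall(r"\w+|[^\w\s]", line, re.UNICODE)

print(preprocess("Das Haus ist groß."))
```

The output of each line is then joined with spaces before being written to the training files.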
I am wondering if there is any mistake in my configuration. Why does the perplexity decrease so slowly, and why is the gap between train and validation perplexity so large?
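For reference, my training invocation is roughly the following (data/model paths are placeholders; the flag names are what I believe OpenNMT-py's train.py accepts, so please correct me if they have changed):

```shell
# Default model settings, Adam optimizer with lr 0.001 (paths are placeholders)
python train.py -data data/wmt15-de-en -save_model models/de-en \
    -optim adam -learning_rate 0.001
```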