I contacted the authors, and they say they obtained the reported results with the same configuration I am currently using with OpenNMT. Still, I somehow get the kind of output below on the test data.
SOS in 2010 , beyonc was featured on lady gaga 's single
telephone ’ ’ and its music video . EOS
SOS the song topped the us pop songs chart , becoming the sixth number-one for both beyonc and gaga , tying them with mariah carey for most number-ones since the nielsen top 40 airplay chart launched in 1992 . EOS
SOS what was the name of the first person that madonna was a part of ? EOS
SOS what was the name of the book that madonna was a member of ? EOS
SOS and EOS are the start- and end-of-sentence tokens.
Can you suggest why the model seems to get stuck in a fixed state, so that the generated output is almost always the same?
A little about the architecture I am using: 2 layers each for the encoder and decoder, with hidden state size 600. I am using SGD, fixed pre-trained embeddings, and global attention. I train for 12-15 epochs. Below is the configuration in detail.
th train.lua -data data/qg-train.t7 -save_model model -rnn_size 600 -layers 2 -optim sgd -learning_rate 1 -learning_rate_decay 0.5 -start_decay_at 8 -max_batch_size 64 -dropout 0.3 -start_epoch 1 -end_epoch 12 -max_grad_norm 5 -word_vec_size 300 -pre_word_vecs_enc data/qg-src-emb-embeddings-300.t7 -pre_word_vecs_dec data/qg-tgt-emb-embeddings-300.t7 -fix_word_vecs_enc 1 -fix_word_vecs_dec 1 -gpuid 1 -brnn 1 -attention global -rnn_type LSTM -input_feed 1 -global_attention dot
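For reference, this is roughly how I generate the test outputs above (a sketch; the exact model checkpoint name and file paths are placeholders, and `-beam_size 5` is the OpenNMT default):

```shell
# Decode the test set with the trained checkpoint (paths are illustrative).
th translate.lua \
  -model model_epoch12.t7 \
  -src data/qg-src-test.txt \
  -output pred-test.txt \
  -beam_size 5 \
  -gpuid 1
```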