Hello, I am really confused by the translation results from my model.
I am trying to reproduce the results from the paper Neural Responding Machine for Short-Text Conversation.
I tried the Weibo data referred to in the paper, starting with a two-layer model, with a config like this:
rnn_size = 1000
word_vec_size = 620
rnn_type = GRU
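For reference, here is a minimal PyTorch sketch of a model with this shape; it is only an illustration of the config above (the class and variable names are my own, not the toolkit's actual implementation):

```python
import torch.nn as nn

# Minimal GRU encoder-decoder matching the config above:
# two layers, rnn_size = 1000, word_vec_size = 620.
class Seq2Seq(nn.Module):
    def __init__(self, vocab_size, word_vec_size=620, rnn_size=1000, layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, word_vec_size)
        self.encoder = nn.GRU(word_vec_size, rnn_size, num_layers=layers, batch_first=True)
        self.decoder = nn.GRU(word_vec_size, rnn_size, num_layers=layers, batch_first=True)
        self.out = nn.Linear(rnn_size, vocab_size)

    def forward(self, src, tgt):
        _, hidden = self.encoder(self.embed(src))           # encode the post
        dec_out, _ = self.decoder(self.embed(tgt), hidden)  # decode conditioned on it
        return self.out(dec_out)                            # per-step vocabulary logits
```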
It seems that all the translation predictions are the same, or very similar, and of course this is not what I want.
It seems that this kind of thing happens a lot when training sequence-to-sequence models.
Has anyone seen this kind of result before? Please help! Thanks!
There are 4,430,000 sentences in the dataset, and the final perplexity is 483.14. I am at epoch 11, and the learning rate is down to 0.0039 now; more training does not seem to help reduce the perplexity.
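For context on that number: perplexity is just the exponential of the average per-token cross-entropy, so 483.14 corresponds to a loss that is still quite high:

```python
import math

# ppl = exp(avg cross-entropy per token), so the loss behind ppl = 483.14 is:
print(math.log(483.14))  # ≈ 6.18 nats per token
```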
My config is here:
rnn_size = 1000
word_vec_size = 620
rnn_type = GRU
and the other options are left at their defaults.
I wonder if the model is just not well trained, but this is already the 11th epoch, and the learning rate is very small now.
Sure. What I am trying to do is build a Chinese short-text conversation model (from post to response), so my examples are Chinese, tokenized, and separated by blank spaces.
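For example, roughly like this (a minimal sketch assuming the jieba segmenter; the dataset may have been segmented with a different tool):

```python
import jieba  # a common Chinese word segmenter; an assumption, not necessarily what the authors used

post = "今天天气怎么样"            # an illustrative Weibo-style post
print(" ".join(jieba.cut(post)))  # e.g. "今天 天气 怎么样"
```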
Of course, since I don't read Chinese, I used a translator to get an idea of them. Is there really some kind of logic connecting the source and target sentences? With the translated ones, I find it really hard to see one…
In this paper, they do build such a model and achieve state-of-the-art results. What I am doing now is trying to reproduce the results from this paper, and the dataset is provided by the authors on GitHub.