Hi all,
I want to continue training the pre-trained model "onmt_baseline_wmt15-all.en-de_epoch13_7.19_release.t7".
I downloaded the pre-trained English->German model from http://opennmt.net/Models/ along with the WMT15 training and evaluation data, which I preprocessed with the preprocess.lua script.
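For reference, the preprocessing step looked roughly like this (the directory layout and file names are from my setup and may differ; the flags are the standard preprocess.lua options):

```shell
# Sketch of my preprocessing call -- paths/file names are specific to my machine.
# preprocess.lua writes <save_data>-train.t7, so -save_data wmt15-all-en-de
# produces the wmt15-all-en-de-train.t7 file used below.
th preprocess.lua \
  -train_src wmt15-all-en-de/train.en \
  -train_tgt wmt15-all-en-de/train.de \
  -valid_src wmt15-all-en-de/valid.en \
  -valid_tgt wmt15-all-en-de/valid.de \
  -save_data wmt15-all-en-de
```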
Then I run the following command:
th train.lua -gpuid 1 -data wmt15-all-en-de-train.t7 -save_model demo_model -train_from onmt_baseline_wmt15-all.en-de_epoch13_7.19_release.t7 -continue
At first everything looks fine, but then I get the following error message:
[05/03/17 14:30:20 INFO] Resuming training from epoch 14 at iteration 1…
[05/03/17 14:30:23 INFO] Preparing memory optimization…
…rly/torch/install/share/lua/5.1/onmt/modules/Decoder.lua:443: attempt to index field 'inputIndex' (a nil value)
stack traceback:
…rly/torch/install/share/lua/5.1/onmt/modules/Decoder.lua:443: in function 'backward'
/home/icklerly/torch/install/share/lua/5.1/onmt/Seq2Seq.lua:213: in function 'trainNetwork'
…klerly/torch/install/share/lua/5.1/onmt/utils/Memory.lua:40: in function 'optimize'
…lerly/torch/install/share/lua/5.1/onmt/train/Trainer.lua:94: in function '__init'
/home/icklerly/torch/install/share/lua/5.1/torch/init.lua:91: in function 'new'
train.lua:172: in function 'main'
train.lua:178: in main chunk
[C]: in function 'dofile'
…erly/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
Any ideas what this means?
Is it possible to resume training from your published model?