Hi all,
I want to continue training the pre-trained model "onmt_baseline_wmt15-all.en-de_epoch13_7.19_release.t7".
I downloaded the pre-trained English->German model from http://opennmt.net/Models/ along with the WMT15 training and evaluation data, which I preprocessed with the preprocess.lua script.
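For reference, the preprocessing step looked roughly like this (the directory layout and file names are from my setup and may differ; the flags are the standard preprocess.lua options):

```shell
# Sketch of my preprocessing call -- paths/file names are specific to my machine.
# preprocess.lua writes <save_data>-train.t7, so -save_data wmt15-all-en-de
# produces the wmt15-all-en-de-train.t7 file used below.
th preprocess.lua \
  -train_src wmt15-all-en-de/train.en \
  -train_tgt wmt15-all-en-de/train.de \
  -valid_src wmt15-all-en-de/valid.en \
  -valid_tgt wmt15-all-en-de/valid.de \
  -save_data wmt15-all-en-de
```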
Then I run the following command:
th train.lua -gpuid 1 -data wmt15-all-en-de-train.t7 -save_model demo_model -train_from onmt_baseline_wmt15-all.en-de_epoch13_7.19_release.t7 -continue
At first everything looks fine, but then I get the following error message:
[05/03/17 14:30:20 INFO] Resuming training from epoch 14 at iteration 1…
[05/03/17 14:30:23 INFO] Preparing memory optimization…
…rly/torch/install/share/lua/5.1/onmt/modules/Decoder.lua:443: attempt to index field 'inputIndex' (a nil value)
stack traceback:
…rly/torch/install/share/lua/5.1/onmt/modules/Decoder.lua:443: in function 'backward'
/home/icklerly/torch/install/share/lua/5.1/onmt/Seq2Seq.lua:213: in function 'trainNetwork'
…klerly/torch/install/share/lua/5.1/onmt/utils/Memory.lua:40: in function 'optimize'
…lerly/torch/install/share/lua/5.1/onmt/train/Trainer.lua:94: in function '__init'
/home/icklerly/torch/install/share/lua/5.1/torch/init.lua:91: in function 'new'
train.lua:172: in function 'main'
train.lua:178: in main chunk
[C]: in function 'dofile'
…erly/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
Any ideas what this means?
Is it possible to resume training from your published model?