OpenNMT Forum

Training fails while getting batch

opennmt-lua

#1

After training for several hours and successfully saving about 5 checkpoints, training fails with this error:

**./onmt/data/Dataset.lua:102: attempt to index a nil value** 
./onmt/data/Dataset.lua:102: in function 'getBatch'
./onmt/train/Trainer.lua:277: in function 'trainEpoc'
./onmt/train/Trainer.lua:484: in function 'train'

This appears to be the line where `getBatch` sets the start of the batch range. I'm a little stuck on what the cause may be. Could it be related to changing `batch_size` when resuming training from a checkpoint?
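
To illustrate the failure mode I suspect, here is a minimal standalone sketch. It is not the actual OpenNMT code: `buildBatchRange`, `getBatchStart`, the `begin`/`end` table layout, and the numbers are all assumptions made up for illustration. It only shows that a batch index that was valid for a smaller batch size can point past the end of the batch table once the batch size grows, and that indexing the missing entry raises a "nil value" error.

```lua
-- Hypothetical stand-in for a dataset's batch table; NOT the real Dataset.lua.
-- Splits numSentences sentences into consecutive ranges of at most batchSize.
local function buildBatchRange(numSentences, batchSize)
  local ranges = {}
  local first = 1
  while first <= numSentences do
    local last = math.min(first + batchSize - 1, numSentences)
    table.insert(ranges, { ["begin"] = first, ["end"] = last })
    first = last + 1
  end
  return ranges
end

-- Hypothetical lookup: if ranges[idx] is missing it evaluates to nil,
-- and indexing nil with "begin" raises an "index a nil value" error.
local function getBatchStart(ranges, idx)
  return ranges[idx]["begin"]
end

local small = buildBatchRange(1000, 32)  -- 32 batches
local large = buildBatchRange(1000, 64)  -- 16 batches

-- Assumed batch position remembered from the run with batch size 32.
local savedIdx = 20

-- pcall catches the error so both cases can be compared side by side.
local okSmall, startSmall = pcall(getBatchStart, small, savedIdx)
local okLarge, errLarge = pcall(getBatchStart, large, savedIdx)

print(okSmall, startSmall)  -- succeeds: index 20 exists among 32 batches
print(okLarge, errLarge)    -- fails: index 20 does not exist among 16 batches
```

If something like this is what happens, the lookup would need a batch index that is recomputed for the new batch size instead of one carried over from the checkpoint, but I have not confirmed that this is how the trainer behaves.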


(Guillaume Klein) #2

What batch size did you set?