Continuing training on a monotext language model

balbertalli · January 24, 2018, 8:32pm

I’ve encountered an issue when trying to continue training a language model on a monotext dataset. Using the following command:
th train.lua -data data/lm-train.t7 -model_type lm -train_from data/lm_model_epoch5_96.59.t7 -continue -save_model data/lm_model

I would get the following error:
train.lua:263: attempt to index field 'tgt' (a nil value)

Digging through train.lua, I found that this line checks the dictionary size for both the source dictionary and the target dictionary. In lm mode there is no target dictionary, so the tgt field is nil and indexing it throws the error.

Is this a bug in train.lua? Or is there a command line option that I’m missing for language models?

P.S. Since the line in question is simply checking to see if an error message should be logged, commenting out lines 263 and 264 seems to fix it (assuming you’re using the same dictionary).
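
A less invasive alternative to commenting the lines out might be to guard the lookup so that a missing side is skipped. The following is only a minimal sketch, not the actual code from train.lua: the function name sameDictSizes, the helper makeDict, and the table layout (a src entry and an optional tgt entry, each holding a words object with a size() method) are assumptions made for illustration.

```lua
-- Hypothetical stand-in for a dictionary object; only size() matters here.
local function makeDict(n)
  return { size = function(self) return n end }
end

-- Returns true when every side present in BOTH tables has the same
-- vocabulary size. A side that is nil in either table (for example
-- 'tgt' in lm mode) is skipped instead of being indexed.
local function sameDictSizes(saved, current)
  for _, side in ipairs({'src', 'tgt'}) do
    local a, b = saved[side], current[side]
    if a ~= nil and b ~= nil then
      if a.words:size() ~= b.words:size() then
        return false
      end
    end
  end
  return true
end

-- Language-model case: neither table has a 'tgt' entry.
local lmSaved = { src = { words = makeDict(50000) } }
local lmData  = { src = { words = makeDict(50000) } }
print(sameDictSizes(lmSaved, lmData))

-- Mismatched source vocabularies are still detected.
local otherData = { src = { words = makeDict(40000) } }
print(sameDictSizes(lmSaved, otherData))
```

With a guard of this kind the size comparison would still run for the source dictionary, so the warning would not be lost when the vocabularies really do differ.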

guillaumekln · January 25, 2018, 8:27am

Hello,

There is an open issue for that, you should watch it:

balbertalli · January 25, 2018, 12:21pm

Thanks! I’ll watch things over at GitHub.

jean.senellart · January 30, 2018, 12:46pm

This was closed by https://github.com/OpenNMT/OpenNMT/pull/502.