Continuing training on a monotext language model

(Brian Albertalli) #1

I’ve encountered an issue when trying to continue training a language model on a monotext dataset. Using the following command:
th train.lua -data data/lm-train.t7 -model_type lm -train_from data/lm_model_epoch5_96.59.t7 -continue -save_model data/lm_model

I would get the following error:
train.lua:263: attempt to index field 'tgt' (a nil value)

Digging through train.lua, I found that this line checks the dictionary size for both the source dictionary and the target dictionary. In lm mode there is no target dictionary, so the lookup of the target dictionary returns nil and the line throws an error.

Is this a bug in train.lua? Or is there a command line option that I’m missing for language models?

P.S. Since the line in question is simply checking to see if an error message should be logged, commenting out lines 263 and 264 seems to fix it (assuming you’re using the same dictionary).
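
A less invasive alternative to commenting the lines out might be to skip any dictionary that is missing. The sketch below is not the actual train.lua code: the table layout (src, tgt, words, size()) and the helper names makeDict and dictSizesDiffer are assumptions made up for illustration.

```lua
-- Hypothetical stand-in for a dictionary object that exposes size().
local function makeDict(n)
  return { size = function(self) return n end }
end

-- Returns true when the vocabulary sizes differ for any side present in both
-- tables. A side missing from either table (e.g. 'tgt' in lm mode) is skipped
-- instead of being indexed, which avoids the nil error.
local function dictSizesDiffer(saved, current)
  for _, side in ipairs({ 'src', 'tgt' }) do
    local a, b = saved[side], current[side]
    if a ~= nil and b ~= nil and a.words:size() ~= b.words:size() then
      return true
    end
  end
  return false
end

-- Language-model case: only a source dictionary exists on both sides.
local saved = { src = { words = makeDict(100) } }
local current = { src = { words = makeDict(100) } }
print(dictSizesDiffer(saved, current)) -- prints false
```

With a guard like this, the size comparison would still run for sequence-to-sequence models, where both dictionaries exist.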

(Guillaume Klein) #2


There is an open issue for that; you should watch it:

(Brian Albertalli) #3

Thanks! I’ll watch things over at GitHub.

(jean.senellart) #4

This was closed by