Training LM with monolingual corpora

Hello everyone,
I want to train an LM, but I get an error and I don’t understand what it means.

First, I preprocessed the corpora with this command:
sudo nvidia-docker run -v $PWD/opennmt_data/:/home/data -d claudia_opennmt th preprocess.lua -data_type monotext -train /home/data/ -valid /home/data/tgt-val-twe-nl.txt -save_data /home/data/datalm

This is the command I am running for training:
sudo nvidia-docker run -v $PWD/opennmt_data/:/home/data -d claudia_opennmt th train.lua -model_type lm -data /home/data/datalm-train.t7 -save_model /home/data/demo-lm

And this is the error I get

Can anyone help me?


This is an open issue, see:

I see. Thanks!


Is there a way to do the same thing (train a neural language model on monolingual data) with the TensorFlow implementation of OpenNMT? I’ve been reading the documentation, but I couldn’t find anything similar.



No, there is no such feature.

Oh I see, thanks!