I’ve searched this forum for incremental training with OpenNMT, but I can’t find a clear solution. The scenario is the following:
- I trained a model for about 30 epochs on the available training data, which gave me quite good translation results.
- After a while, I received a new batch of training examples.
Now I want to add this new knowledge to the model without retraining the whole model from scratch. According to the documentation, continuing the training process requires the following command:
th train.lua -gpuid 1 -data data/demo-train.t7 -save_model demo -save_every 50 -train_from demo_checkpoint.t7 -continue
Should I just point the -data option at my new training data? Or do I need to merge all the available data (old and new), preprocess it again, and pass that to the model? With the first option, wouldn’t the model risk forgetting its previous knowledge (catastrophic forgetting)? And should I use the -continue option?
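For the merge option, my plan would be to concatenate the old and new parallel files (dropping exact duplicate pairs) and then rerun preprocess.lua on the result. A minimal Python sketch of that concatenation step; the file names are hypothetical, not my actual data:

```python
from pathlib import Path

def merge_parallel_corpora(old_src, old_tgt, new_src, new_tgt, out_src, out_tgt):
    """Concatenate old and new parallel data, dropping exact duplicate sentence pairs."""
    seen = set()
    src_lines, tgt_lines = [], []
    for src_path, tgt_path in [(old_src, old_tgt), (new_src, new_tgt)]:
        srcs = Path(src_path).read_text(encoding="utf-8").splitlines()
        tgts = Path(tgt_path).read_text(encoding="utf-8").splitlines()
        # parallel files must stay aligned line by line
        assert len(srcs) == len(tgts), "source/target line counts differ"
        for s, t in zip(srcs, tgts):
            if (s, t) not in seen:
                seen.add((s, t))
                src_lines.append(s)
                tgt_lines.append(t)
    Path(out_src).write_text("\n".join(src_lines) + "\n", encoding="utf-8")
    Path(out_tgt).write_text("\n".join(tgt_lines) + "\n", encoding="utf-8")
```

The merged files would then go through the usual preprocess.lua step to produce a new -data package.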
By the way, I applied BPE preprocessing, so I’m not expecting any problems with OOV words.
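On that last point, here is a toy sketch of why BPE sidesteps OOV: an unseen word just falls back to smaller known subwords instead of becoming an unknown token. The merge list below is a made-up illustration, not my actual learned codes:

```python
def apply_bpe(word, merges):
    """Greedily apply a ranked list of BPE merge pairs to a single word.

    Starts from characters and repeatedly merges the adjacent pair with the
    best (lowest) rank, mimicking how learned subword units are applied.
    """
    symbols = list(word)
    ranks = {pair: i for i, pair in enumerate(merges)}
    while len(symbols) > 1:
        # find the adjacent pair with the best merge rank
        candidates = [(ranks.get((a, b), float("inf")), i)
                      for i, (a, b) in enumerate(zip(symbols, symbols[1:]))]
        best_rank, best_i = min(candidates)
        if best_rank == float("inf"):
            break  # no applicable merge left
        symbols[best_i:best_i + 2] = [symbols[best_i] + symbols[best_i + 1]]
    return symbols
```

Even a word with no applicable merges still segments into characters, so nothing maps to an unknown token.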