Save trained models per iteration

(Negacy Hailu) #1

I am training an NMT model with the Torch version of OpenNMT. I would like to save the trained model every x iterations. Right now, it only saves the model every x epochs. This is what my command looks like: th train.lua -data data/demo-train.t7 -save_model model -save_every 5000. Any help?

(Guillaume Klein) #2

-save_every 5000 will save a checkpoint every 5000 iterations. Does that work for you?

(Negacy Hailu) #3

Is a checkpoint the same as a model? I would like to save intermediate models every 5000 iterations while training is in progress. My data is huge, and I may not even make it through the first epoch, so I want to save models before any epoch is complete.

(Guillaume Klein) #4

Yes, they are the same.

(Negacy Hailu) #5

In that case, the -save_every 5000 flag is not saving anything.

(Guillaume Klein) #6

It should produce a rolling checkpoint that ends with _checkpoint.t7.
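
To check whether that file is being written, a minimal sketch like the one below may help. It only assumes that the checkpoint lands in the directory you pass in (by default the current one) and that its name ends with _checkpoint.t7 as described above. The function name find_checkpoints is made up for illustration and is not part of OpenNMT.

```python
from pathlib import Path


def find_checkpoints(directory="."):
    """Return files whose names end in _checkpoint.t7, newest first.

    Hypothetical helper, not part of OpenNMT. It assumes the rolling
    checkpoint is written to `directory`.
    """
    paths = Path(directory).glob("*_checkpoint.t7")
    # Sort by modification time so the most recently written file comes first.
    return sorted(paths, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    matches = find_checkpoints()
    if not matches:
        print("No *_checkpoint.t7 file found in the current directory.")
    for path in matches:
        print(path)
```

If nothing is listed after more than 5000 iterations, the file may be going to a different directory than the one being searched, so it is worth checking the path given to -save_model.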