Use previous checkpoint

Hi,

I have trained my Transformer model for 140.000 steps, so my saved checkpoints are from steps 138.398, 138.627, 139.008, 139.319, 139.550, 139.781, and 140.000. Is there any way I can run “onmt-main infer” from step 110.000? When I checked TensorBoard, my model’s best performance was at step 110.000. Thank you.

Hi, look for the command line option --checkpoint_path.

Yes, I understand we can do that.

But in my case I simply cannot, because the file model.ckpt-110000 does not exist anymore.
Here is my checkpoint file:

[image]

Ah right, so there is no way to use checkpoints that no longer exist. For the next training, you can increase the keep_checkpoint_max parameter in the train section of the configuration so that more checkpoints are kept.
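To check which steps can still be passed to --checkpoint_path, it may help to list what is actually left in the model directory. The following is only a minimal sketch, not part of OpenNMT-tf: the helper names available_steps and closest_step are hypothetical, and it assumes the usual TensorFlow layout where each checkpoint has an index file whose name ends in "ckpt-<step>.index".

```python
import re
from pathlib import Path

# Assumption: each checkpoint leaves an index file named like
# "model.ckpt-140000.index" or "ckpt-140000.index" in the model directory.
_STEP_PATTERN = re.compile(r"ckpt-(\d+)\.index$")


def available_steps(model_dir):
    """Return the sorted training steps that still have a checkpoint on disk."""
    steps = []
    for path in Path(model_dir).iterdir():
        match = _STEP_PATTERN.search(path.name)
        if match:
            steps.append(int(match.group(1)))
    return sorted(steps)


def closest_step(model_dir, wanted):
    """Return the available step nearest to `wanted`, or None if none exist."""
    steps = available_steps(model_dir)
    if not steps:
        return None
    return min(steps, key=lambda step: abs(step - wanted))
```

With the checkpoints listed in the question, closest_step(model_dir, 110000) would return 138398, the oldest one still on disk. That is the nearest available substitute, not the step-110.000 model itself.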