OpenNMT Forum

How to resume training from the last interrupted state

I was training a translation model, but a power failure caused the system to restart. The last checkpoint I have saved is at training step 90K. Is there any way to resume training from that state?

If you are training with OpenNMT-py, use this option:

--train_from, -train_from

If training from a checkpoint then this is the path to the pretrained model’s state_dict.

Default: “”

Dear @park, thanks for your reply. But I am getting the error “[Errno 2] No such file or directory: ‘’”.
My saved model is in the main directory along with the other files (the default location where OpenNMT saves them).
I just added -train_from to my command.
What correction should I make to get this working?
Many thanks in advance.

You need to specify your model PATH

-train_from YOUR/MODEL/PATH/

Please add the PATH
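To find the path to pass, something like the following may help. It is only a minimal sketch and assumes that checkpoints are saved as `<save_model>_step_<N>.pt` (for example `sum_eng-model_step_90000.pt`); check the actual file names on your disk. The helper `latest_checkpoint` is a hypothetical name, not part of OpenNMT-py.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: checkpoint files are named "<save_model>_step_<N>.pt".
STEP_PATTERN = re.compile(r"_step_(\d+)\.pt$")


def latest_checkpoint(directory: str, prefix: str) -> Optional[Path]:
    """Return the checkpoint file with the highest step number, or None."""
    best_step, best_path = -1, None
    for path in Path(directory).glob(f"{prefix}_step_*.pt"):
        match = STEP_PATTERN.search(path.name)
        # Skip anything that is not a regular file or does not match the pattern.
        if match is None or not path.is_file():
            continue
        step = int(match.group(1))
        if step > best_step:
            best_step, best_path = step, path
    return best_path


if __name__ == "__main__":
    # "." and "sum_eng-model" are example values; use your own directory and prefix.
    checkpoint = latest_checkpoint(".", "sum_eng-model")
    if checkpoint is None:
        print("No checkpoint found; check the directory and the -save_model prefix.")
    else:
        print(f"-train_from {checkpoint}")
```

The printed value is the full path to the checkpoint file itself, not just the folder that contains it.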

@park Sorry, but I am still not able to resolve it. I moved the file into the data folder, alongside the train, test, and valid data files. I then added -train_from data/ to my command, but I am still getting a file-not-found error. Can you please help? This is the name of the file, and it is available in the data folder.
What should I add to my command?

Please give me the full command

@park Here is the full command I am using

CUDA_VISIBLE_DEVICES=0 python -src_word_vec_size 200 -tgt_word_vec_size 200 -data data/model -save_model sum_eng-model -save_checkpoint_steps 100 -world_size 1 -gpu_ranks 0 -batch_size 64 -valid_steps 10000 -train_steps 100000 -report_every 50 -train_from data/


CUDA_VISIBLE_DEVICES=0 python3 -src_word_vec_size 200 -tgt_word_vec_size 200 -data data/model -save_model sum_eng-model -save_checkpoint_steps 100 -world_size 1 -gpu_ranks 0 -batch_size 64 -valid_steps 10000 -train_steps 100000 -report_every 50 --train_from data/

@park I still get the same error: [Errno 2] No such file or directory: ‘data/’.
The file is there and the file name is correct too.


python3 build
python3 install

@park Thanks for your effort. This was a dependency issue.

Okay, nice work.