I’d like to fine-tune (continue training) the 2-layer LSTM with copy attention for summarization in OpenNMT-py. I’m unclear whether I have to start training over. I’ve seen this, http://opennmt.net/OpenNMT/training/retraining/, but is that only for the Lua version? I’m new to PyTorch. I’ve downloaded the model, gigaword_copy_acc_51.78_ppl_11.71_e20.pt; how do you get the vocabulary out of it to use for training? I don’t need to change the model architecture or the vocabulary; I’m just not sure about the steps. I don’t have easy access to a GPU.
Thanks so much,
You can load the checkpoint and have a look inside it without a GPU:
checkpoint = torch.load(<model_file.pt>, map_location="cpu")
This is a plain dictionary, and the vocab is stored under the "vocab" key, so you can dump it to a separate file with torch.save.
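A minimal sketch of that dump step, wrapped in a small helper (the helper name and the output file name are mine, not part of OpenNMT-py):

```python
import torch

def dump_vocab(model_path, vocab_path):
    # Load the checkpoint onto the CPU; no GPU is required for this step.
    checkpoint = torch.load(model_path, map_location="cpu")
    # The checkpoint is a plain dict; the vocabulary sits under "vocab".
    # Save it alone so preprocessing can reuse it.
    torch.save(checkpoint["vocab"], vocab_path)

# e.g. dump_vocab("gigaword_copy_acc_51.78_ppl_11.71_e20.pt",
#                 "gigaword.vocab.pt")
```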
Once you have this vocab file, you can preprocess your data with onmt_preprocess, passing the vocab file to the -src_vocab and -tgt_vocab options so the model's original vocabulary is reused.
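A sketch of that preprocessing call (all data file names and the "finetune" prefix are placeholders; -dynamic_dict is needed because the model uses copy attention, and check your OpenNMT-py version's docs for the exact vocab file format -src_vocab expects):

```shell
onmt_preprocess \
    -train_src train.src.txt -train_tgt train.tgt.txt \
    -valid_src valid.src.txt -valid_tgt valid.tgt.txt \
    -src_vocab gigaword.vocab.pt -tgt_vocab gigaword.vocab.pt \
    -dynamic_dict \
    -save_data finetune
```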
Finally, you can pass -train_from <model_file.pt> to the onmt_train command to continue training from the pretrained checkpoint.
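Under those assumptions, the training call might look like this (the "finetune" data prefix and -save_model name are placeholders carried over from the hypothetical preprocessing step; with -train_from, the architecture options are read from the checkpoint itself, and leaving -gpu_ranks unset keeps training on the CPU):

```shell
onmt_train \
    -data finetune \
    -train_from gigaword_copy_acc_51.78_ppl_11.71_e20.pt \
    -save_model gigaword_finetuned
```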