Transfer learning on a pre-trained model

I want to run an experiment. I trained a model for the language pair srcL1-tgtL2, and now I want to train another model for the language pair srcL3-tgtL2 on top of the pre-trained srcL1-tgtL2 model.
The target language is the same in both pairs. For this experiment, I want to freeze the word embeddings on the encoder side.
How do I freeze the word embeddings on the encoder or decoder side?

This case is not handled for now; you'd have to patch the code.
You can try setting requires_grad to False for the parameters you'd like to freeze, but you might also need to adapt a few things around the optimizer.
See for instance:
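A minimal sketch of that suggestion in plain PyTorch (not OpenNMT-specific; the module and layer names here are illustrative assumptions): set requires_grad to False on the embedding parameters, then build the optimizer only over the parameters that still require gradients.

```python
import torch
import torch.nn as nn

# Toy encoder standing in for the real model (names are assumptions).
class TinyEncoder(nn.Module):
    def __init__(self, vocab_size=100, dim=16):
        super().__init__()
        self.embeddings = nn.Embedding(vocab_size, dim)
        self.rnn = nn.LSTM(dim, dim, batch_first=True)

    def forward(self, x):
        return self.rnn(self.embeddings(x))

model = TinyEncoder()

# Freeze the encoder-side word embeddings.
for param in model.embeddings.parameters():
    param.requires_grad = False

# Only pass trainable parameters to the optimizer, so the frozen
# embeddings are never updated.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.1
)
```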

Hello François,
I am new to programming and neural networks and still learning about them so the following question could be very silly so please bear with me.

I tried to do the above with a model generated by OpenNMT and I get an error: AttributeError: 'dict' object has no attribute 'children'

I get it when I run the following Python script:

```python
import torch
model_ft = torch.load('/home/akandimalla/practicum/te_en_models/s4/run4/')
ct = 0
for child in model_ft.children():
    ct += 1
    if ct < 7:
        for param in child.parameters():
            param.requires_grad = False
```

I am also unable to use .summary(), and other similar functions.
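The error happens because torch.load() on a saved checkpoint returns a plain Python dict, not an nn.Module, and only nn.Module defines .children() (and .summary() is a Keras method, not a PyTorch one). A small sketch of the distinction (the checkpoint key names below are assumptions for illustration; the real keys depend on the OpenNMT-py version):

```python
import torch
import torch.nn as nn

# A checkpoint loaded with torch.load() is typically a dict of tensors
# and metadata, so it has no .children() method.
checkpoint = {"model": {"encoder.embeddings.weight": torch.zeros(10, 4)}}
print(hasattr(checkpoint, "children"))  # False

# An actual nn.Module does have .children():
module = nn.Sequential(nn.Embedding(10, 4), nn.Linear(4, 4))
print(len(list(module.children())))  # 2
```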

Dear Akshara,

How to freeze Word embeddings of Encoder or Decoder side?

If you want to do what the original poster asked about, OpenNMT-py now has two training parameters, freeze_word_vecs_enc and freeze_word_vecs_dec, which can be used out of the box during training.
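For example, a training invocation might look like the following (a sketch: the config file and checkpoint paths are placeholders, and flag spellings should be checked against your OpenNMT-py version):

```shell
# Continue training from a pre-trained checkpoint with encoder-side
# word embeddings frozen (paths are hypothetical placeholders).
onmt_train -config my_config.yaml \
    -train_from srcL1-tgtL2_model.pt \
    -freeze_word_vecs_enc
```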

If you want to use OpenNMT-py modules, e.g. onmt.modules.Embeddings, you can set freeze_word_vecs to True.
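Under the hood, such an option amounts to disabling gradients on the embedding weight. A plain-PyTorch sketch of that behavior (an assumption about the implementation, not OpenNMT-py's actual code):

```python
import torch.nn as nn

# Sketch: a freeze_word_vecs-style flag simply turns off gradients on
# the embedding weight tensor.
def make_embeddings(vocab_size, dim, freeze_word_vecs=False):
    emb = nn.Embedding(vocab_size, dim)
    if freeze_word_vecs:
        emb.weight.requires_grad = False
    return emb

frozen = make_embeddings(100, 16, freeze_word_vecs=True)
trainable = make_embeddings(100, 16)
```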

Also, OpenNMT-tf has an even broader option to freeze layers.

Kind regards,
