Transfer learning on a pre-trained model

I want to do an experiment. I trained a model for the language pair srcL1-tgtL2, and I now want to train another model on top of this pre-trained srcL1-tgtL2 model for another language pair, srcL3-tgtL2.
The target language is the same in both pairs. For this experiment, I want to freeze the word embeddings on the encoder side.
How can I freeze the word embeddings on the encoder or decoder side?

This case is not handled for now; you would have to patch it in the code.
You may try to set requires_grad to False for the parameters you’d like to freeze, but you might also need to adapt a few things around the optimizer.
See for instance: https://discuss.pytorch.org/t/how-the-pytorch-freeze-network-in-some-layers-only-the-rest-of-the-training/7088/3
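
Here is a minimal sketch of that idea (untested; the parameter name prefix encoder.embeddings and the optimizer setup are assumptions that depend on your OpenNMT-py version):

import torch

def freeze_encoder_embeddings(model):
    # set requires_grad to False for every parameter that belongs to the
    # encoder embeddings (the exact parameter names depend on the model)
    for name, param in model.named_parameters():
        if name.startswith("encoder.embeddings"):
            param.requires_grad = False

def build_optimizer(model, lr=1e-3):
    # only pass the parameters that are still trainable to the optimizer
    trainable = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.Adam(trainable, lr=lr)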

Hello François,
I am new to programming and neural networks and still learning about them, so the following question may be very silly; please bear with me.

I tried to do the above with a model generated by OpenNMT and I get an error stating: AttributeError: 'dict' object has no attribute 'children'

I get it when I run the following Python script:
import torch
model_ft = torch.load('/home/akandimalla/practicum/te_en_models/s4/run4/model_step_9000.pt')
ct = 0
for child in model_ft.children():
    ct += 1
    if ct < 7:
        for param in child.parameters():
            param.requires_grad = False
torch.save(model_ft, '/home/akandimalla/practicum/te_en_models/s4/run4/model_step_frozen_9000.pt')

I am also unable to use .summary() and other similar functions.

Dear Akshara,

How to freeze Word embeddings of Encoder or Decoder side?

If you want to do what the original poster asked about, OpenNMT-py now has two training parameters, freeze_word_vecs_enc and freeze_word_vecs_dec, which can be used out of the box during training.
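
For example, in a v2-style YAML training config it would look roughly like this (only the freezing options are shown, the rest of the config is omitted):

freeze_word_vecs_enc: true   # freeze the encoder word embeddings
freeze_word_vecs_dec: false  # keep the decoder word embeddings trainable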

If you want to use OpenNMT-py modules, e.g. onmt.modules.Embeddings, you can set freeze_word_vecs to True.
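
A rough sketch of that (the argument names and values are assumptions based on recent releases, so please check the Embeddings signature of your installed version):

from onmt.modules import Embeddings

enc_embeddings = Embeddings(
    word_vec_size=512,      # embedding dimension
    word_vocab_size=32000,  # source vocabulary size
    word_padding_idx=1,     # index of the padding token
    freeze_word_vecs=True,  # the embedding weights will not be updated
)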

Also, OpenNMT-tf has an even broader option to freeze layers.
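
In OpenNMT-tf that is done in the YAML configuration, with something along these lines (the layer names here are only illustrative and must match the layers of your own model):

params:
  freeze_layers:
    - encoder
    - decoder/output_layer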

Kind regards,
Yasmin
