How to get the flattened parameters and gradients of the Encoder & Decoder? (Does Encoder:getParameters() work?)

Hello.
I am trying to load the Encoder & Decoder separately and train the model with the 'optim' package. Assume the Encoder & Decoder are pre-trained by OpenNMT's train.lua script.

Here is my simplified code:

Part1. load & initialize model.
local checkpoint = torch.load(path_to_pretrained_model)
my_encoder = onmt.Factory.loadEncoder(checkpoint.models.encoder) -- brnn
my_decoder = onmt.Factory.loadDecoder(checkpoint.models.decoder)
onmt.utils.Cuda.convert(my_encoder)
onmt.utils.Cuda.convert(my_decoder)
my_encoder:training()
my_decoder:training()
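-- Note: getParameters() should be called only once, and only after
-- Cuda.convert, since it flattens all weights and gradients into single
-- contiguous tensors.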
params_e, grad_e = my_encoder:getParameters()
params_d, grad_d = my_decoder:getParameters()

Part2. Update Encoder & Decoder with optim package
loss, _ = optim.adam(feval_decoder, params_d, optim_state_decoder)
_, _ = optim.adam(feval_encoder, params_e, optim_state_encoder)
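(feval_decoder and feval_encoder are not shown here; roughly, they follow the usual forward/backward pattern. Below is only a sketch of that pattern, with the encoder/decoder signatures as I understand them from OpenNMT's training code, to show how gradients reach the encoder. The function name forwardBackward is just for illustration.)

-- Sketch: the encoder's gradient tensor is only filled by encoder:backward,
-- which needs the state/context gradients returned by decoder:backward.
-- The optim closures (feval_decoder / feval_encoder) would wrap this and
-- return (loss, grad_d) and (loss, grad_e) respectively.
local function forwardBackward(batch, criterion)
  grad_e:zero()
  grad_d:zero()

  local encStates, context = my_encoder:forward(batch)
  local decOutputs = my_decoder:forward(batch, encStates, context)

  local encGradStatesOut, gradContext, loss = my_decoder:backward(batch, decOutputs, criterion)
  my_encoder:backward(batch, encGradStatesOut, gradContext)

  return loss
end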

The problem is that the gradient norm of the encoder (i.e. grad_e:norm()) is always zero.

The one thing I am not confident about is whether my_encoder:getParameters() retrieves all the parameter and gradient tensors that exist in the Encoder. It seems that Model:initParams() uses the :getParameters() function of each of its 'modules', but does not apply it to the whole Encoder or Decoder.

I really appreciate any comments.
Thank you :slight_smile:

Hi,

I confirm that calling getParameters on the Encoder or Decoder returns the parameters of each submodule. That is what Model:initParams is doing.
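If you want to double-check on your side, you can compare the number of elements in the flattened tensor against the sum over the submodules' weight tensors (a quick sketch, assuming the standard nn.Module:parameters() behaviour):

-- The flattened view from getParameters() should account for every
-- submodule's weights.
local weights = my_encoder:parameters()
local total = 0
for _, w in ipairs(weights) do
  total = total + w:nElement()
end
print(total, params_e:nElement())  -- these two counts should match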
