Understanding the number of parameters


(Wen Tsai) #1

Hi,

These days I’m trying to figure out where the number of parameters comes from, but I can’t reproduce the numbers shown in the “number of parameters” field.

I know it’s a very basic question, but I really want to understand how the parameters add up.

This is my model, using word_vec_size=50 and rnn_size=128.

NMTModel(
  (encoder): RNNEncoder(
    (embeddings): Embeddings(
      (make_embedding): Sequential(
        (emb_luts): Elementwise(
          (0): Embedding(50004, 50, padding_idx=1)
        )
      )
    )
    (rnn): LSTM(50, 64, dropout=0.3, bidirectional=True)
  )
  (decoder): InputFeedRNNDecoder(
    (embeddings): Embeddings(
      (make_embedding): Sequential(
        (emb_luts): Elementwise(
          (0): Embedding(50004, 50, padding_idx=1)
        )
      )
    )
    (dropout): Dropout(p=0.3)
    (rnn): StackedLSTM(
      (dropout): Dropout(p=0.3)
      (layers): ModuleList(
        (0): LSTMCell(178, 128)
      )
    )
    (attn): GlobalAttention(
      (linear_context): Linear(in_features=128, out_features=128, bias=False)
      (linear_query): Linear(in_features=128, out_features=128, bias=True)
      (v): Linear(in_features=128, out_features=1, bias=False)
      (linear_out): Linear(in_features=256, out_features=128, bias=True)
      (softmax): Softmax()
      (tanh): Tanh()
    )
    (copy_attn): GlobalAttention(
      (linear_context): Linear(in_features=128, out_features=128, bias=False)
      (linear_query): Linear(in_features=128, out_features=128, bias=True)
      (v): Linear(in_features=128, out_features=1, bias=False)
      (linear_out): Linear(in_features=256, out_features=128, bias=True)
      (softmax): Softmax()
      (tanh): Tanh()
    )
  )
  (generator): CopyGenerator(
    (linear): Linear(in_features=128, out_features=50004, bias=True)
    (linear_copy): Linear(in_features=128, out_features=1, bias=True)
  )
)
* number of parameters: 11799973
encoder: 2559592
decoder: 9240381

I’d like to know how to arrive at the encoder count (2559592) and the decoder count (9240381).

Here are the other training settings:

 -layers 1
 -global_attention mlp
 -encoder_type brnn
 -rnn_size 128
 -word_vec_size 50
 -copy_attn

Thanks!