Hi, I have trained a transformer model for NMT with the following options:
-layers 6
-src_word_vec_size 512
-tgt_word_vec_size 512
-encoder_type transformer
-decoder_type transformer
-heads 8
-transformer_ff 2048
-position_encoding
-rnn_size 512
-max_generator_batches 2
-gpu_ranks 0 1
-world_size 2
-batch_size 2048
-batch_type tokens
-normalization tokens
-accum_count 2
-optim adam
-adam_beta2 0.998
-decay_method noam
-warmup_steps 8000
-learning_rate 2
-max_grad_norm 0
-param_init 0
-param_init_glorot
-label_smoothing 0.1
-valid_steps 100000
-save_checkpoint_steps 100000
-train_steps 3100000000
-dropout 0.3
-learning_rate_decay .5
-start_decay_steps 3042444
-decay_steps 3042444
However, when I tried to translate with this model using the following options:
-beam_size 3
-max_length 125
-batch_size 30
-n_best 1
-replace_unk
-gpu 1
I got this error:
AttributeError: 'TransformerDecoder' object has no attribute 'self_attn'
Do you have any idea what could be causing this?
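
In case it helps with context, here is a rough sketch of what the schedule-related training options above work out to. It assumes the standard Noam formula (linear warmup followed by inverse square-root decay, scaled by -learning_rate and by the model dimension given in -rnn_size) and that each update sees up to batch_size tokens per GPU per accumulation step. The function names are hypothetical and this is not copied from the toolkit's source.

```python
def noam_learning_rate(step, base_lr=2.0, model_dim=512, warmup_steps=8000):
    """Assumed Noam schedule: linear warmup, then inverse square-root decay.

    Defaults mirror -learning_rate 2, -rnn_size 512 and -warmup_steps 8000.
    `step` must be >= 1, since step 0 would divide by zero.
    """
    scale = min(step ** -0.5, step * warmup_steps ** -1.5)
    return base_lr * model_dim ** -0.5 * scale


def tokens_per_update(batch_size=2048, accum_count=2, world_size=2):
    """Upper bound on tokens per optimizer update.

    Defaults mirror -batch_size 2048 (with -batch_type tokens),
    -accum_count 2 and -world_size 2.
    """
    return batch_size * accum_count * world_size


if __name__ == "__main__":
    # The rate rises until warmup_steps, then decays.
    for step in (1000, 8000, 100000):
        print(step, noam_learning_rate(step))
    print("tokens per update (upper bound):", tokens_per_update())
```

Under these assumptions the learning rate peaks at step 8000 and each update covers at most 8192 tokens.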