I am training a translation model with the Transformer, following the FAQ (http://opennmt.net/OpenNMT-py/FAQ.html#how-do-i-use-the-transformer-model-do-you-support-multi-gpu). I also want to use pre-trained embeddings, specifically the fastText ones (size 300). I tried to run the training with the following command:
`python train.py -data data/demo -save_model demo-model2 -layers 6 -rnn_size 248 -word_vec_size 100 -transformer_ff 2048 -heads 8 -encoder_type transformer -decoder_type transformer -position_encoding -train_steps 100000 -max_generator_batches 2 -dropout 0.1 -batch_size 4096 -batch_type tokens -normalization tokens -accum_count 2 -optim adam -adam_beta2 0.998 -decay_method noam -warmup_steps 8000 -learning_rate 2 -max_grad_norm 0 -param_init 0 -param_init_glorot -label_smoothing 0.1 -valid_steps 5000 -save_checkpoint_steps 5000 -pre_word_vecs_enc data/embeddings.enc.pt -pre_word_vecs_dec data/embeddings.dec.pt -world_size 1 -gpu_ranks 0`
I get the following error:
```
[2019-04-29 01:04:34,484 INFO] encoder: 47598120
[2019-04-29 01:04:34,485 INFO] decoder: 69005140
[2019-04-29 01:04:34,485 INFO] * number of parameters: 116603260
[2019-04-29 01:04:34,490 INFO] Starting training on GPU: [0]
[2019-04-29 01:04:34,490 INFO] Start training loop and validate every 5000 steps...
[2019-04-29 01:04:52,583 INFO] Loading dataset from data/demo.train.0.pt, number of examples: 940723
Traceback (most recent call last):
  File "train.py", line 109, in <module>
    main(opt)
  File "train.py", line 39, in main
    single_main(opt, 0)
  File "/content/gdrive/My Drive/OpenNMT/onmt/train_single.py", line 118, in main
    valid_steps=opt.valid_steps)
  File "/content/gdrive/My Drive/OpenNMT/onmt/trainer.py", line 233, in train
    report_stats)
  File "/content/gdrive/My Drive/OpenNMT/onmt/trainer.py", line 348, in _gradient_accumulation
    outputs, attns = self.model(src, tgt, src_lengths, bptt=bptt)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/gdrive/My Drive/OpenNMT/onmt/models/model.py", line 42, in forward
    enc_state, memory_bank, lengths = self.encoder(src, lengths)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/gdrive/My Drive/OpenNMT/onmt/encoders/transformer.py", line 122, in forward
    out = layer(out, mask)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/gdrive/My Drive/OpenNMT/onmt/encoders/transformer.py", line 47, in forward
    input_norm = self.layer_norm(inputs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/normalization.py", line 158, in forward
    input, self.normalized_shape, self.weight, self.bias, self.eps)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 1651, in layer_norm
    torch.backends.cudnn.enabled)
RuntimeError: Given normalized_shape=[248], expected input with shape [*, 248], but got input of size [99, 39, 500]
```
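If it helps, the failure seems to reduce to a plain shape mismatch at the encoder's first LayerNorm: the last dimension of the input (500) matches neither `-rnn_size` (248), nor `-word_vec_size` (100), nor the fastText dimension (300). This minimal sketch (dimensions copied from the traceback) raises the same error:

```python
import torch
import torch.nn as nn

# Same shapes as in the traceback: the LayerNorm is built for the model
# dimension (-rnn_size 248), but the encoder input carries 500 features.
layer_norm = nn.LayerNorm(248)
inputs = torch.randn(99, 39, 500)
layer_norm(inputs)  # RuntimeError: Given normalized_shape=[248], ...
```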
Any idea what might be causing this?
I built the embedding files using the following command:
`./tools/embeddings_to_torch.py -emb_file_enc embedding/fastext/cc.it.300.vec -emb_file_dec embedding/fastext/cc.en.300.vec -dict_file data/demo.vocab.pt -output_file data/embeddings`
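To sanity-check what that script actually wrote, I can load the files back and print their shapes (my assumption here is that the script saves plain tensors via `torch.save`):

```python
import torch

# Inspect the embedding files produced by embeddings_to_torch.py
enc = torch.load("data/embeddings.enc.pt")
dec = torch.load("data/embeddings.dec.pt")
print(enc.size())  # expected (src_vocab_size, 300) for cc.it.300.vec
print(dec.size())  # expected (tgt_vocab_size, 300) for cc.en.300.vec
```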