Error when using CNN encoder and CNN decoder in OpenNMT-py

aaronsmith · February 25, 2019, 9:07am

I tried to run fully convolutional image-to-text translation in OpenNMT-py, following the steps in the tutorial at http://opennmt.net/OpenNMT-py/im2text.html.
The steps I followed are:
preprocessing:
sudo python3 preprocess.py -data_type img -src_dir images/ -train_src src-train.txt -train_tgt tgt-train.txt -valid_src src-valid.txt -valid_tgt tgt-valid.txt -save_data save3/save3 -image_channel_size 3

training:
python3 train.py -model_type img -data save/save -save_model save/ -gpu_ranks 0 -batch_size 10 -learning_rate 0.003 -encoder_type cnn -decoder_type cnn -image_channel_size 3

I tried with channel size 1 also.
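
One detail worth checking in the commands above: preprocessing writes to the prefix save3/save3, while training reads from save/save (the training log further down shows it loading save/save.train.0.pt). A minimal sketch for listing which training shards exist under each prefix, assuming the <prefix>.train.<N>.pt naming seen in that log line; find_train_shards is a hypothetical helper, not part of OpenNMT-py:

```python
import glob


def find_train_shards(prefix):
    """Return the sorted training shard files found under a save_data prefix.

    Assumes the '<prefix>.train.<N>.pt' naming seen in the training log
    (save/save.train.0.pt). This helper is illustrative only.
    """
    # glob.escape keeps special characters in the prefix from acting as wildcards.
    return sorted(glob.glob(glob.escape(prefix) + ".train.*.pt"))


if __name__ == "__main__":
    # Prefixes taken from the preprocess and train commands in the post.
    for prefix in ("save3/save3", "save/save"):
        shards = find_train_shards(prefix)
        status = ", ".join(shards) if shards else "no training shards found"
        print("{}: {}".format(prefix, status))
```

This only shows which prefix actually holds preprocessed data, so that -data can be checked against -save_data; it does not address the TypeError itself.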

But I always get this error:

[2019-02-25 14:29:16,291 INFO] * tgt vocab size = 9535
[2019-02-25 14:29:16,293 INFO] Building model...
[2019-02-25 14:29:19,032 INFO] NMTModel(
  (encoder): ImageEncoder(
    (layer1): Conv2d(1, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (layer2): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (layer3): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (layer4): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (layer5): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (layer6): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (batch_norm1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (batch_norm2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (batch_norm3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (rnn): LSTM(512, 500, num_layers=2, dropout=0.3)
    (pos_lut): Embedding(1000, 512)
  )
  (decoder): CNNDecoder(
    (embeddings): Embeddings(
      (make_embedding): Sequential(
        (emb_luts): Elementwise(
          (0): Embedding(9535, 500, padding_idx=1)
        )
      )
    )
    (linear): Linear(in_features=500, out_features=500, bias=True)
    (conv_layers): ModuleList(
      (0): GatedConv(
        (conv): WeightNormConv2d(500, 1000, kernel_size=(3, 1), stride=(1, 1))
        (dropout): Dropout(p=0.3)
      )
      (1): GatedConv(
        (conv): WeightNormConv2d(500, 1000, kernel_size=(3, 1), stride=(1, 1))
        (dropout): Dropout(p=0.3)
      )
    )
    (attn_layers): ModuleList(
      (0): ConvMultiStepAttention(
        (linear_in): Linear(in_features=500, out_features=500, bias=True)
      )
      (1): ConvMultiStepAttention(
        (linear_in): Linear(in_features=500, out_features=500, bias=True)
      )
    )
  )
  (generator): Sequential(
    (0): Linear(in_features=500, out_features=9535, bias=True)
    (1): Cast()
    (2): LogSoftmax()
  )
)
[2019-02-25 14:29:19,032 INFO] encoder: 9046272
[2019-02-25 14:29:19,032 INFO] decoder: 13300035
[2019-02-25 14:29:19,032 INFO] * number of parameters: 22346307
[2019-02-25 14:29:19,033 INFO] Starting training on GPU: [0]
[2019-02-25 14:29:19,033 INFO] Start training loop and validate every 10000 steps...
[2019-02-25 14:29:39,207 INFO] Loading dataset from save/save.train.0.pt, number of examples: 7194
/home/abhishek/.local/lib/python3.5/site-packages/torchtext/data/field.py:359: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  var = torch.tensor(arr, dtype=self.dtype, device=device)
Traceback (most recent call last):
  File "train.py", line 109, in <module>
    main(opt)
  File "train.py", line 39, in main
    single_main(opt, 0)
  File "/home/abhishek/openNMT/OpenNMT-py-master/onmt/train_single.py", line 116, in main
    valid_steps=opt.valid_steps)
  File "/home/abhishek/openNMT/OpenNMT-py-master/onmt/trainer.py", line 209, in train
    report_stats)
  File "/home/abhishek/openNMT/OpenNMT-py-master/onmt/trainer.py", line 318, in _gradient_accumulation
    outputs, attns = self.model(src, tgt, src_lengths, bptt=bptt)
  File "/home/abhishek/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/abhishek/openNMT/OpenNMT-py-master/onmt/models/model.py", line 44, in forward
    self.decoder.init_state(src, memory_bank, enc_state)
  File "/home/abhishek/openNMT/OpenNMT-py-master/onmt/decoders/cnn_decoder.py", line 66, in init_state
    self.state["src"] = (memory_bank + enc_hidden) * SCALE_WEIGHT
TypeError: add(): argument 'other' (position 1) must be Tensor, not tuple

aaronsmith · February 27, 2019, 5:57am

Can anybody please comment on what I have to change in the parameters to make this work, or is it a backend issue? Any help would be deeply appreciated.

guillaumekln · February 27, 2019, 8:46am

Follow this issue for future updates:
