Training failure - wrong number of arguments for function call


(Terence Lewis) #1

This is my first training run in v0.7 (I have built around 10 models in earlier versions), and it is failing with a “wrong number of arguments” message. Has anyone else encountered this? I have already updated nn.

[06/30/17 15:15:32 INFO] Using GPU(s): 1
[06/30/17 15:15:32 INFO] Training Sequence to Sequence with Attention model…
[06/30/17 15:15:32 INFO] Loading data from ‘/home/tel34/crazyNMT/alpha-train.t7’…
[06/30/17 15:22:09 INFO] * vocabulary size: source = 50004; target = 50004
[06/30/17 15:22:09 INFO] * additional features: source = 0; target = 0
[06/30/17 15:22:09 INFO] * maximum sequence length: source = 50; target = 51
[06/30/17 15:22:09 INFO] * number of training sentences: 9414897
[06/30/17 15:22:09 INFO] * number of batches: 147131
[06/30/17 15:22:09 INFO] - source sequence lengths: equal
[06/30/17 15:22:09 INFO] - maximum size: 64
[06/30/17 15:22:09 INFO] - average size: 63.99
[06/30/17 15:22:09 INFO] - capacity: 100.00%
[06/30/17 15:22:09 INFO] Building model…
[06/30/17 15:22:09 INFO] * Encoder:
[06/30/17 15:22:12 INFO] - word embeddings size: 500
[06/30/17 15:22:12 INFO] - type: bidirectional RNN
[06/30/17 15:22:12 INFO] - structure: cell = LSTM; layers = 4; rnn_size = 600; dropout = 0.3
[06/30/17 15:22:12 INFO] * Decoder:
[06/30/17 15:22:14 INFO] - word embeddings size: 500
[06/30/17 15:22:14 INFO] - attention: global (general)
[06/30/17 15:22:14 INFO] - structure: cell = LSTM; layers = 4; rnn_size = 600; dropout = 0.3
[06/30/17 15:22:14 INFO] * Bridge: copy
[06/30/17 15:22:17 INFO] Initializing parameters…
[06/30/17 15:22:20 INFO] * number of parameters: 116474004
[06/30/17 15:22:20 INFO] Preparing memory optimization…
/home/tel34/torch/install/bin/luajit: /home/tel34/torch/install/share/lua/5.1/nn/THNN.lua:110: wrong number of arguments for function call
stack traceback:
[C]: in function 'v'
/home/tel34/torch/install/share/lua/5.1/nn/THNN.lua:110: in function 'ClassNLLCriterion_updateOutput'
…l34/torch/install/share/lua/5.1/nn/ClassNLLCriterion.lua:41: in function 'updateOutput'
…l34/torch/install/share/lua/5.1/nn/ParallelCriterion.lua:23: in function 'forward'
./onmt/modules/Decoder.lua:466: in function 'backward'
./onmt/Seq2Seq.lua:213: in function 'trainNetwork'
./onmt/utils/Memory.lua:40: in function 'optimize'
./onmt/train/Trainer.lua:101: in function '__init'
/home/tel34/torch/install/share/lua/5.1/torch/init.lua:91: in function 'new'
train.lua:172: in function 'main'
train.lua:178: in main chunk
[C]: in function 'dofile'
…el34/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406670


(Guillaume Klein) #2

Could you try updating cunn as well?

luarocks install nn
luarocks install cunn

(Terence Lewis) #3

Well, that failed to build. Any ideas?

Installing https://raw.githubusercontent.com/torch/rocks/master/cunn-scm-1.rockspec
Using https://raw.githubusercontent.com/torch/rocks/master/cunn-scm-1.rockspec… switching to ‘build’ mode
Cloning into ‘cunn’…
remote: Counting objects: 167, done.
remote: Compressing objects: 100% (148/148), done.
remote: Total 167 (delta 55), reused 45 (delta 17), pack-reused 0
Receiving objects: 100% (167/167), 177.85 KiB | 0 bytes/s, done.
Resolving deltas: 100% (55/55), done.
Checking connectivity… done.
cmake -E make_directory build && cd build && cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH="/home/tel34/torch/install/bin/.." -DCMAKE_INSTALL_PREFIX="/home/tel34/torch/install/lib/luarocks/rocks/cunn/scm-1" && make -j$(getconf _NPROCESSORS_ONLN) install

-- The C compiler identification is GNU 4.8.4
-- The CXX compiler identification is GNU 4.8.4
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Found Torch7 in /home/tel34/torch/install
-- Found CUDA: /usr/local/cuda-8.0 (found suitable version "8.0", minimum required is "6.5")
-- Removing -DNDEBUG from compile flags
-- TH_LIBRARIES: TH
-- THC_LIBRARIES: THC
-- Autodetected CUDA architecture(s): 6.1
-- Configuring done
-- Generating done
-- Build files have been written to: /tmp/luarocks_cunn-scm-1-3956/cunn/build
[ 1%] [ 2%] [ 4%] [ 5%] [ 7%] [ 8%] [ 10%] [ 11%] Building NVCC (Device) object lib/THCUNN/CMakeFiles/THCUNN.dir//./THCUNN_generated_SpatialReplicationPadding.cu.o
Building NVCC (Device) object lib/THCUNN/CMakeFiles/THCUNN.dir//./THCUNN_generated_SpatialClassNLLCriterion.cu.o
Building NVCC (Device) object lib/THCUNN/CMakeFiles/THCUNN.dir//./THCUNN_generated_Sqrt.cu.o
Building NVCC (Device) object lib/THCUNN/CMakeFiles/THCUNN.dir//./THCUNN_generated_SmoothL1Criterion.cu.o
Building NVCC (Device) object lib/THCUNN/CMakeFiles/THCUNN.dir//./THCUNN_generated_SparseLinear.cu.o
Building NVCC (Device) object lib/THCUNN/CMakeFiles/THCUNN.dir//./THCUNN_generated_TemporalRowConvolution.cu.o
Building NVCC (Device) object lib/THCUNN/CMakeFiles/THCUNN.dir//./THCUNN_generated_Abs.cu.o
Building NVCC (Device) object lib/THCUNN/CMakeFiles/THCUNN.dir//./THCUNN_generated_SoftMax.cu.o
/tmp/luarocks_cunn-scm-1-3956/cunn/lib/THCUNN/generic/SparseLinear.cu(209): error: too many arguments in function call

/tmp/luarocks_cunn-scm-1-3956/cunn/lib/THCUNN/generic/SparseLinear.cu(209): error: too many arguments in function call

2 errors detected in the compilation of "/tmp/tmpxft_000013ba_00000000-7_SparseLinear.cpp1.ii".
CMake Error at THCUNN_generated_SparseLinear.cu.o.cmake:267 (message):
Error generating file
/tmp/luarocks_cunn-scm-1-3956/cunn/build/lib/THCUNN/CMakeFiles/THCUNN.dir//./THCUNN_generated_SparseLinear.cu.o

make[2]: *** [lib/THCUNN/CMakeFiles/THCUNN.dir/./THCUNN_generated_SparseLinear.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs…
make[1]: *** [lib/THCUNN/CMakeFiles/THCUNN.dir/all] Error 2
make: *** [all] Error 2

Error: Build error: Failed building.


(Guillaume Klein) #4

It seems some packages are out of sync. The following should work:

luarocks install torch
luarocks install cutorch
luarocks install nn
luarocks install cunn

If not, just remove your Torch installation and reinstall from scratch.
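
For anyone who wants to script the four installs above, here is a minimal sketch. The function name reinstall_torch_rocks and the DRY_RUN variable are made up for this example and are not part of luarocks or OpenNMT. It runs the installs in the order listed and stops at the first rock that fails to build, so a broken cutorch is not followed by an nn/cunn build against stale libraries.

```shell
#!/bin/sh
# Hypothetical helper: reinstall the four Torch rocks in the order given
# in the post above, stopping at the first failure.
# Set DRY_RUN=1 to print the commands instead of running them.
reinstall_torch_rocks() {
    for rock in torch cutorch nn cunn; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Dry run: only show what would be executed.
            echo "luarocks install $rock"
        else
            # Real run: abort as soon as one rock fails to build.
            luarocks install "$rock" || {
                echo "failed to build $rock" >&2
                return 1
            }
        fi
    done
}
```

After sourcing the file, calling it inside a subshell with DRY_RUN=1 set, as in (DRY_RUN=1; reinstall_torch_rocks), prints the four commands without touching the installation, and calling reinstall_torch_rocks on its own performs the installs.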


(Terence Lewis) #5

Yes, that did the job. It’s started training from epoch 1 - always a good sign :)