No success in using GPU

Norbert · February 27, 2019, 3:46pm

Hello,
I am desperate. I have a new GeForce GTX 1070 with 8 GB of memory and want to use it with the Torch version of OpenNMT. My training command is: th train.lua -data Testdaten/Test_Update_de_en-train.t7 -save_model Testdaten/Test_Update_DE_EN-model -gpuid 1

I thought this was correct, but I get the result below and I don't know how to fix it. The error message mentions cutorch:

Training wird gestartet
/home/user/torch/install/bin/luajit: ./onmt/utils/Cuda.lua:76: /home/user/torch/install/share/lua/5.1/trepl/init.lua:389: module ‘cutorch’ not found:No LuaRocks module found for cutorch
no field package.preload[‘cutorch’]
no file ‘/home/user/.luarocks/share/lua/5.1/cutorch.lua’
no file ‘/home/user/.luarocks/share/lua/5.1/cutorch/init.lua’
no file ‘/home/user/torch/install/share/lua/5.1/cutorch.lua’
no file ‘/home/user/torch/install/share/lua/5.1/cutorch/init.lua’
no file ‘./cutorch.lua’
no file ‘/home/user/torch/install/share/luajit-2.1.0-beta1/cutorch.lua’
no file ‘/usr/local/share/lua/5.1/cutorch.lua’
no file ‘/usr/local/share/lua/5.1/cutorch/init.lua’
no file ‘/home/user/.luarocks/lib/lua/5.1/cutorch.so’
no file ‘/home/user/torch/install/lib/lua/5.1/cutorch.so’
no file ‘/home/user/torch/install/lib/cutorch.so’
no file ‘./cutorch.so’
no file ‘/usr/local/lib/lua/5.1/cutorch.so’
no file ‘/usr/local/lib/lua/5.1/loadall.so’
stack traceback:
[C]: in function ‘error’
./onmt/utils/Cuda.lua:76: in function ‘init’
train.lua:300: in function ‘main’
train.lua:338: in main chunk
[C]: in function ‘dofile’
…user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
Training fertig

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

guillaumekln · February 27, 2019, 8:31pm

Hi,

You probably did not install Torch with CUDA support.
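
One way to check this is sketched below. It assumes the `th` launcher is on your PATH and accepts `-e` to run a statement; `check_lua_module` and the `MODULE_OK` / `MODULE_MISSING` markers are hypothetical names made up for this example, not part of Torch or OpenNMT.

```shell
#!/bin/sh
# Sketch: report whether the Torch install can load a given Lua module.
# check_lua_module is a hypothetical helper, not part of Torch or OpenNMT.
check_lua_module() {
    module="$1"
    if ! command -v th >/dev/null 2>&1; then
        echo "th not found on PATH"
        return 2
    fi
    # pcall keeps a failed require from aborting; the marker strings are arbitrary.
    result=$(th -e "print(pcall(require, '$module') and 'MODULE_OK' or 'MODULE_MISSING')" 2>/dev/null </dev/null)
    case "$result" in
        *MODULE_OK*) echo "$module: available" ;;
        *) echo "$module: not loadable"; return 1 ;;
    esac
}

check_lua_module cutorch || true
check_lua_module cunn || true

# If a module is reported as not loadable and the CUDA toolkit is installed,
# it can usually be built with:
#   luarocks install cutorch
#   luarocks install cunn
```

If both modules are reported as available, the problem is more likely in the GPU driver or the `-gpuid` setting than in the Torch install.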

I see that you used a Docker command. Why not use the OpenNMT Docker image?

Norbert · February 28, 2019, 9:26am

Hi,

I see no reason not to use the OpenNMT Docker image. I'll try the command from the documentation: sudo nvidia-docker run -it opennmt/opennmt:latest

My prompt changes to root@b27c587d21df:~/opennmt#. Is that OK? I have no idea how to copy my translation data to this location. Sorry, I come from Windows.

Norbert

guillaumekln · March 2, 2019, 9:07am

Hi,

Maybe you want to learn more about Docker and how to mount host directories:
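
As a rough illustration of what mounting looks like: the `-v HOST:CONTAINER` option of `docker run` makes a host directory visible inside the container. The sketch below only prints the command so it can be reviewed before running. The paths are assumptions: `HOST_DIR` stands for wherever your data lives on the host, and `CONTAINER_DIR` assumes the container's working directory `~/opennmt` resolves to `/root/opennmt`. `build_run_command` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: build a Docker command that mounts a host directory into the container.
# HOST_DIR and CONTAINER_DIR are placeholder paths chosen for illustration.
HOST_DIR="${HOST_DIR:-$HOME/Testdaten}"
CONTAINER_DIR="${CONTAINER_DIR:-/root/opennmt/Testdaten}"
IMAGE="opennmt/opennmt:latest"

build_run_command() {
    # -v HOST:CONTAINER bind-mounts the host directory inside the container
    printf 'nvidia-docker run -it -v %s:%s %s\n' "$1" "$2" "$3"
}

# Print the command rather than running it, so it can be checked first.
build_run_command "$HOST_DIR" "$CONTAINER_DIR" "$IMAGE"
```

With a mount like this, files in the host directory appear under `Testdaten/` inside the container, and anything the training run saves there remains on the host after the container exits.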