Hi All,
I want to run OpenNMT on Mac OS X with a GPU (GT 750M), but I can’t get past Step 3 (“Step 3: Translate”). Does OpenNMT not run on OS X? Thanks.
If you successfully installed Torch, OpenNMT should work out of the box.
What is the particular issue?
I successfully installed Torch and can run OpenNMT’s training step, but translation fails. The error log is below:
$: th translate.lua -model model_final.t7 -src data/src-test.txt -output pred.txt
[04/16/17 23:15:17 INFO] Loading 'model_final.t7'...
/Users/lvy/torch/install/bin/luajit: cannot open <model_final.t7> in mode r at /Users/lvy/torch/pkg/torch/lib/TH/THDiskFile.c:670
stack traceback:
[C]: at 0x0bdb02d0
[C]: in function 'DiskFile'
/Users/lvy/torch/install/share/lua/5.1/torch/File.lua:405: in function 'load'
./onmt/translate/Translator.lua:86: in function '__init'
/Users/lvy/torch/install/share/lua/5.1/torch/init.lua:91: in function 'new'
translate.lua:60: in function 'main'
translate.lua:182: in main chunk
[C]: in function 'dofile'
.../lvy/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x010bc8cd60
It just seems it can’t find the model file. Did you check that the path to model_final.t7 is correct?
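For example, a quick way to confirm the file is visible from the directory where you run translate.lua - a minimal check, using the paths package bundled with Torch - is:
th -e "print(require('paths').filep('model_final.t7'))"
It should print true; if it prints false, pass the full path of the checkpoint to -model.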
Sorry, I made a mistake and pasted the wrong log. The issue I am actually hitting is logged below:
[04/22/17 23:37:56 INFO] Using GPU(s): 1
[04/22/17 23:37:56 WARNING] The caching CUDA memory allocator is enabled. This allocator improves performance at the cost of a higher GPU memory usage. To optimize for memory, consider disabling it by setting the environment variable: THC_CACHING_ALLOCATOR=0
[04/22/17 23:37:56 INFO] Loading 'iwslt_en2zh_epoch24_19.29.t7'...
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-300/cutorch/lib/THC/generic/THCStorage.cu line=66 error=63 : OS call failed or operation not supported on this OS
/Users/lvy/torch/install/bin/luajit: /Users/lvy/torch/install/share/lua/5.1/torch/File.lua:351: cuda runtime error (63) : OS call failed or operation not supported on this OS at /tmp/luarocks_cutorch-scm-1-300/cutorch/lib/THC/generic/THCStorage.cu:66
stack traceback:
[C]: in function 'read'
/Users/lvy/torch/install/share/lua/5.1/torch/File.lua:351: in function </Users/lvy/torch/install/share/lua/5.1/torch/File.lua:245>
[C]: in function 'read'
/Users/lvy/torch/install/share/lua/5.1/torch/File.lua:351: in function 'readObject'
/Users/lvy/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
/Users/lvy/torch/install/share/lua/5.1/nn/Module.lua:192: in function 'read'
/Users/lvy/torch/install/share/lua/5.1/torch/File.lua:351: in function 'readObject'
/Users/lvy/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
/Users/lvy/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
/Users/lvy/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
/Users/lvy/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
...
/Users/lvy/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
/Users/lvy/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
/Users/lvy/torch/install/share/lua/5.1/torch/File.lua:409: in function 'load'
./onmt/translate/Translator.lua:86: in function '__init'
/Users/lvy/torch/install/share/lua/5.1/torch/init.lua:91: in function 'new'
translate.lua:60: in function 'main'
translate.lua:182: in main chunk
[C]: in function 'dofile'
.../lvy/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x0105c7ed60
The problem seems to be related to the configuration/support of the GPU on your computer. Can you try the following command line:
th -e "require 'cutorch'"
Thanks, I ran th -e "require 'cutorch'" in my environment, but it produced no output.
I trained the model on an Nvidia K80 GPU, but I am translating with it on an Nvidia GTX 750M GPU. Could that difference produce the error?
Hi @preston_li - I think the issue comes from some incompatibility between cutorch and your local GPU, but it is hard to diagnose further. I don’t think it comes from the difference with the training environment - but just to double-check, is it possible to run the demo training on your GPU locally?
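If the local GPU does turn out to be the problem, one workaround worth trying - a sketch, assuming your OpenNMT checkout includes tools/release_model.lua - is to convert the checkpoint to a CPU model on the K80 machine and then translate on the CPU:
th tools/release_model.lua -model iwslt_en2zh_epoch24_19.29.t7 -gpuid 1
This strips the CUDA tensors out of the serialized model so translate.lua can load it without cutorch; point -model at the released file it writes. CPU translation will be slower but sidesteps the local GPU entirely.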
Running the OpenNMT demo also produces a similar error. I will double-check cutorch. My work machine’s GPU is an Nvidia GTX 750M and I have CUDA 7.5 installed. Thanks.
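One thing that often helps when double-checking cutorch after a CUDA toolkit change is simply rebuilding it (and cunn) against the toolkit that is actually installed - a hedged suggestion rather than a guaranteed fix:
luarocks install cutorch
luarocks install cunn
If the rebuild itself fails, that would point at the cutorch/CUDA 7.5 setup on the GTX 750M machine rather than at OpenNMT.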