Python and OpenNMT

Hi everyone. A brief announcement on some future OpenNMT plans/goodies.

The Facebook research team has some amazing programmers, and just for kicks they have ported OpenNMT entirely to Python/PyTorch as an example project. It’s really neat, and it demonstrates a lot of the nice properties of their newly released PyTorch environment.

The release of this code does not change our support for the current OpenNMT. We really like the stable Lua/Torch environment and will continue to develop in it; going forward, it will remain the main language of the project.

That said, PyTorch is an exciting new project that introduces many interesting features while using the same Torch backend. We want to experiment with what these features can add to OpenNMT, and also build the PyTorch version up to feature parity and, hopefully, compatibility with the main codebase. For now we think of it as a “research” branch of OpenNMT, since PyTorch gives more flexibility for trying out new model architectures.

Currently our forked version of this project is at: https://github.com/OpenNMT/pyopennmt

If you are interested in contributing, please join our gitter channel at: https://gitter.im/OpenNMT/openmt


Can the PyTorch implementation replicate the benchmark results of OpenNMT?

Initial results look promising, but we haven’t confirmed full numbers yet.

Hi, a quick question: does PyTorch require an NVIDIA GPU? Thanks!

No, it does not. You can run PyTorch on the CPU.
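As a general pattern, a script can pick the device at run time instead of assuming a GPU is present. A minimal sketch (assuming PyTorch is installed; the except branch is only an illustrative fallback for machines without it):

```python
# Minimal sketch: choose a device string that works on CPU-only machines.
# Assumes PyTorch; the ImportError fallback is just for illustration.
try:
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
except ImportError:
    device = "cpu"

print(device)
```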

Hi, thank you a lot for the help!

I installed PyTorch with the Docker image pytorch-cudnnv6 on a VM, following https://github.com/pytorch/pytorch#installation.

Then I tried to translate a test text using the pretrained model ‘onmt_model_en_fr_b1M’ published on https://github.com/OpenNMT/OpenNMT-py with the command:

python translate.py -model …/onmt-model/onmt_model_en_fr_b1M-261c69a7.pt -src …/test.txt -output …/test.tok

but it failed with the following error:

Traceback (most recent call last):
  File "translate.py", line 116, in <module>
    main()
  File "translate.py", line 55, in main
    translator = onmt.Translator(opt)
  File "/root/OpenNMT-py/onmt/Translator.py", line 11, in __init__
    checkpoint = torch.load(opt.model)
  File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/serialization.py", line 222, in load
    return _load(f, map_location, pickle_module)
  File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/serialization.py", line 355, in _load
    return legacy_load(f)
  File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/serialization.py", line 300, in legacy_load
    obj = restore_location(obj, location)
  File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/serialization.py", line 85, in default_restore_location
    result = fn(storage, location)
  File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/serialization.py", line 67, in _cuda_deserialize
    return obj.cuda(device_id)
  File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/_utils.py", line 56, in _cuda
    with torch.cuda.device(device):
  File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/cuda/__init__.py", line 136, in __enter__
    _lazy_init()
  File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/cuda/__init__.py", line 96, in _lazy_init
    _check_driver()
  File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/cuda/__init__.py", line 70, in _check_driver
    http://www.nvidia.com/Download/index.aspx""")
AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx

It looks like there is no GPU support. Yes, I’m using a VM, so I don’t have a GPU. Is that right?

My question is: how can I run the OpenNMT commands on the CPU in this environment, so that I can avoid this error and translate successfully?

I then tried compiling and installing PyTorch from source without CUDA support on a VM, again following https://github.com/pytorch/pytorch#installation, and ran the same command as above. This time I got the error below:

Traceback (most recent call last):
  File "translate.py", line 123, in <module>
    main()
  File "translate.py", line 56, in main
    translator = onmt.Translator(opt)
  File "/root/OpenNMT-py/onmt/Translator.py", line 12, in __init__
    checkpoint = torch.load(opt.model)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 229, in load
    return _load(f, map_location, pickle_module)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 362, in _load
    return legacy_load(f)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 307, in legacy_load
    obj = restore_location(obj, location)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 85, in default_restore_location
    result = fn(storage, location)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 67, in _cuda_deserialize
    return obj.cuda(device_id)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/_utils.py", line 57, in _cuda
    with torch.cuda.device(device):
  File "/root/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 129, in __enter__
    _lazy_init()
  File "/root/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 89, in _lazy_init
    _check_driver()
  File "/root/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 56, in _check_driver
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

I have another question:

The translation command on http://opennmt.net/PythonGuide/ is:
python translate.py -model model_final.pt -src data/src-val.txt -output file-tgt.tok

On https://github.com/OpenNMT/OpenNMT-py, it’s
python translate.py -gpu 0 -model demo_model_e13_*.pt -src data/src-test.txt -tgt data/tgt-test.txt -replace_unk -verbose -output demo_pred.txt

Is the -tgt argument in the second command needed? Why? I thought the -output argument is where the translation goes, right?

Thanks!

Are you using a recent version? These issues seem to have been fixed.

Yes, Guillaume, I freshly cloned https://github.com/OpenNMT/OpenNMT-py.git each time in my environment.

Currently, GPU-trained models cannot be loaded on a PyTorch install without CUDA support. Could you open an issue on GitHub?
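Until that is supported out of the box, one common workaround (a sketch, not an official fix) is to pass a map_location callback to torch.load, so that storages tagged with a CUDA device are kept on the CPU instead of triggering the driver check:

```python
# Sketch of a map_location callback for torch.load: it ignores the
# recorded device tag (e.g. "cuda:0") and keeps the already-deserialized
# storage on the CPU. Assumes PyTorch's serialization API.
def cpu_map_location(storage, location):
    return storage

# With PyTorch installed, a GPU-trained checkpoint could then be loaded
# on a CPU-only machine like this (filename from the post above):
#   checkpoint = torch.load("onmt_model_en_fr_b1M-261c69a7.pt",
#                           map_location=cpu_map_location)
print(cpu_map_location("some-storage", "cuda:0"))
```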

-tgt is not needed; -output defines the file used to store the translated sentences.

@guillaumekln I got it. Ok, I will open an issue on GitHub. Thanks!

Were you able to verify the numbers?