I'm not able to use GPU with latest openNMT-tf

I'm not able to use the GPU with the latest version of OpenNMT-tf. This is the command I run:

onmt-main --model_type TransformerRelative --config data.yml \
    --auto_config train --with_eval

I get the following output:

2021-01-18 13:09:06.344974: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/conda/lib
2021-01-18 13:09:06.346084: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-01-18 13:09:10.662631: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-01-18 13:09:10.663028: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/conda/lib
2021-01-18 13:09:10.663059: W tensorflow/stream_executor/cuda/cuda_driver.cc:326] failed call to cuInit: UNKNOWN ERROR (303)
2021-01-18 13:09:10.663087: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (98cb1e0a5f87): /proc/driver/nvidia/version does not exist
2021-01-18 13:09:11.330078: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-01-18 13:09:16.135954: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-01-18 13:09:16.137949: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 2199995000 Hz
2021-01-18 13:11:27.798232: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 392844276 exceeds 10% of free system memory.

TensorFlow 2.4 requires CUDA 11.0.

So you should either install CUDA 11.0 on your system or use TensorFlow 2.3, which is built against CUDA 10.1:

pip install tensorflow==2.3.*
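
To confirm which of the libraries named in the log are actually missing, something like the following may help. It is only a rough sketch using Python's standard `ctypes` module: it asks the dynamic loader to open each library by name, which is roughly what the `dso_loader` warnings report on. The helper names `can_load` and `report` are made up for this example, and the two library names are simply copied from the warnings above.

```python
import ctypes
import os


def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        # Raised when the library cannot be found or opened.
        return False


def report(library_names):
    """Map each library name to whether it could be loaded."""
    return {name: can_load(name) for name in library_names}


if __name__ == "__main__":
    # Library names taken from the warnings in the log (an assumption that
    # these are the only ones that matter for this setup).
    print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", "(not set)"))
    for name, ok in report(["libcudart.so.11.0", "libcuda.so.1"]).items():
        print(f"{name}: {'found' if ok else 'NOT found'}")
```

If `libcudart.so.11.0` is reported as not found, that matches the CUDA version mismatch described above. Run it again after installing CUDA 11.0 or adjusting `LD_LIBRARY_PATH` to see whether the loader can now find the library.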