I’m trying to run the example training script to train an English-German model. I’m running Ubuntu 20.04 with a GeForce GTX 780. On a fresh Ubuntu 20.04 instance, I checked the box during installation for Ubuntu to install proprietary drivers, and the nvidia-driver-440 package was installed. Then I installed Docker (package from apt) and installed NVIDIA Docker support. I ran the TensorFlow Docker image with this command:
sudo docker run -it tensorflow/tensorflow bash
Inside the Docker container I installed OpenNMT:
pip install --upgrade pip
pip install OpenNMT-tf
I compiled and installed SentencePiece, then I ran the baseline training script. prepare_data.sh ran without problems (though it didn’t use the GPU), but when I ran the 1-GPU training script, the GPU wasn’t being used:
I can’t tell if this is some sort of drivers issue, an issue with my TensorFlow installation, or an issue with OpenNMT. Any help greatly appreciated!
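A check along these lines might help narrow down which of those three layers is at fault. It is only a sketch: the function name gpu_diagnostics and the report keys are made up for illustration, and it assumes Python 3.7+ and, if TensorFlow is installed, a version recent enough (2.1 or later) to provide tf.config.list_physical_devices.

```python
import importlib.util
import shutil
import subprocess


def gpu_diagnostics():
    """Collect a few facts that help separate driver, container, and TensorFlow issues.

    Hypothetical helper for illustration; not part of TensorFlow or OpenNMT-tf.
    """
    report = {
        "nvidia_smi_found": False,
        "nvidia_smi_gpus": [],
        "tensorflow_built_with_cuda": None,  # None means TensorFlow is not installed
        "tensorflow_gpus": None,             # None means TensorFlow is not installed
    }

    # 1. Is the NVIDIA driver tooling visible in this environment (host or container)?
    smi = shutil.which("nvidia-smi")
    if smi is not None:
        report["nvidia_smi_found"] = True
        # "nvidia-smi -L" prints one line per GPU the driver can see.
        result = subprocess.run([smi, "-L"], capture_output=True, text=True)
        report["nvidia_smi_gpus"] = result.stdout.strip().splitlines()

    # 2. Does the installed TensorFlow build support CUDA, and does it see a GPU?
    if importlib.util.find_spec("tensorflow") is not None:
        import tensorflow as tf

        report["tensorflow_built_with_cuda"] = tf.test.is_built_with_cuda()
        report["tensorflow_gpus"] = [
            device.name for device in tf.config.list_physical_devices("GPU")
        ]

    return report


if __name__ == "__main__":
    for key, value in gpu_diagnostics().items():
        print(f"{key}: {value}")
```

Run inside the container, a missing nvidia-smi would suggest the container has no GPU access at all, while a GPU listed by nvidia-smi combined with an empty tensorflow_gpus list would point more toward the TensorFlow build or its CUDA libraries than toward OpenNMT.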
I had previously tried installing the GPU drivers (from the NVIDIA website), CUDA, cuDNN, and TensorFlow outside of Docker and running the training script, but had the same problem of the GPU not being used. My next step is to try running the OpenNMT Docker image to see if I can get that to work, but I’d like to have the control of running OpenNMT outside of Docker.
Thanks for any help!