OpenNMT Forum

Runtime error when translating using ctranslate2

Hello,

I recently switched to CTranslate2 and converted my OpenNMT-tf TransformerBig model with it, but I get a runtime error related to cuBLAS:
image
This error happens whether I translate a batch of sentences or just a single sentence. The model itself loads without any problem.
Below is my related environment:
tensorflow-gpu version: 2.3.1
CUDA version: 10.1
Nvidia Driver version: 430.64
NVCC version: 7.5.17
CTranslate2 version: 1.16.1
GPU model: RTX2080Ti with 11GB VRAM
OS version: Linux 16.04

I have 7 GPUs available with no tasks running on them. I tried every one of them and they all produce the same error. Running the old OpenNMT-tf version on them works without any problem.

Does anyone else experience the same problem?

The code I used to run it is as follows:

Hi,

Thanks for reporting and providing additional information.

This was also reported in the issue below, but I have not been able to reproduce it yet:

In both reports, the system has multiple GPUs. Could you try setting device_index to 0 and running your script with CUDA_VISIBLE_DEVICES=0 python ...? Do you still get the same error?
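
For reference, the suggestion above can be sketched in Python roughly as follows. This is only a sketch: the model directory name and the input tokens are hypothetical placeholders, and setting CUDA_VISIBLE_DEVICES inside the script is assumed to be equivalent to prefixing the command with it, provided it happens before the CUDA libraries are initialized.

```python
import os

# Expose only the first physical GPU to this process. Set this before
# importing ctranslate2 so that CUDA has not been initialized yet.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

MODEL_DIR = "ende_ctranslate2"  # hypothetical path to the converted model

# With a single visible GPU, index 0 is the only valid device index.
translator_kwargs = {"device": "cuda", "device_index": 0}


def build_translator(model_dir=MODEL_DIR):
    """Return a Translator pinned to GPU 0, or None if the setup is missing."""
    if not os.path.isdir(model_dir):
        return None
    try:
        import ctranslate2
    except ImportError:
        return None
    return ctranslate2.Translator(model_dir, **translator_kwargs)


translator = build_translator()
if translator is not None:
    # Input is a batch of tokenized sentences; these tokens are placeholders.
    print(translator.translate_batch([["Hello", "world", "!"]]))
```

If the error still appears with only one GPU visible, the multi-GPU setup is probably not the cause.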

Hi, thanks for the quick reply!

Unfortunately I still get the same error. Also, unlike the other reported issue, I’m not running this through Docker. I’m just running simple Python code, slightly modified from the Python instructions in the “Translation API” section of the following link:
[https://github.com/OpenNMT/CTranslate2/blob/master/docs/python.md#model-conversion-api]

Right now I’m trying the Docker solution. I ran the provided sample code but only got a random text output like this:
image

My Docker version is 19.03.8
I followed every step in the following link:

Nothing happened after step 3. I only got the combination of numbers and characters shown in the screenshot.

I finally managed to reproduce the issue by testing with an older GPU driver. The version of the cuBLAS library included in the Python package was incorrect. It should be fixed in the latest version 1.16.2. You can update the package and try again.

pip install --upgrade ctranslate2
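
To confirm that the upgrade took effect, the installed version can be compared against 1.16.2. The sketch below is a generic check and not part of CTranslate2. The helper names are made up for illustration, and the version parsing assumes plain numeric versions such as 1.16.2.

```python
from importlib import metadata

FIXED_VERSION = (1, 16, 2)  # release named in this thread as containing the fix


def parse_version(text):
    """Turn '1.16.2' into (1, 16, 2); only plain numeric parts are handled."""
    return tuple(int(part) for part in text.split(".")[:3])


def has_cublas_fix(version_text):
    """True if the given version is 1.16.2 or newer."""
    return parse_version(version_text) >= FIXED_VERSION


def installed_version(package="ctranslate2"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


current = installed_version()
if current is None:
    print("ctranslate2 is not installed")
elif has_cublas_fix(current):
    print(current, "includes the fix")
else:
    print(current, "is older than 1.16.2, please upgrade")
```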

This is unrelated to the current topic. You may want to open another one. (The random text is the ID of the Docker container that is running in the background.)

Thanks a lot!!! It’s now working like a charm!!
