I recently switched to CTranslate2 and converted my OpenNMT-tf TransformerBig model with it, but I get a runtime error related to cuBLAS.
The error occurs whether I translate a batch of sentences or a single sentence. The model itself loads without any problem.
My environment is as follows:
tensorflow-gpu version: 2.3.1
CUDA version: 10.1
Nvidia Driver version: 430.64
NVCC version: 7.5.17
CTranslate2 version: 1.16.1
GPU model: RTX2080Ti with 11GB VRAM
OS version: Linux 16.04
I have 7 GPUs available, with no other tasks running on them. I tried each one and they all produce the same error, while the old OpenNMT-tf version runs on them without any problem.
I finally managed to reproduce the issue by testing with an older GPU driver. The version of the cuBLAS library included in the Python package was incorrect. It should be fixed in the latest version 1.16.2. You can update the package and try again.
pip install --upgrade ctranslate2
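After upgrading, it may help to confirm which version pip actually installed before translating again. The snippet below is only a rough sketch: it reads the installed version through the standard library's `importlib.metadata` and compares it with 1.16.2, the release mentioned above. The helper names (`parse_version`, `has_cublas_fix`, `installed_ctranslate2_version`) are made up for this example and are not part of CTranslate2.

```python
from importlib.metadata import PackageNotFoundError, version

# Release that should contain the corrected cuBLAS library (from the reply above).
FIXED_VERSION = (1, 16, 2)


def parse_version(text):
    """Turn a string like '1.16.2' into (1, 16, 2).

    Hypothetical helper: stops at the first component that does not
    start with a digit, so suffixes such as 'rc1' are ignored.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def has_cublas_fix(installed):
    """Return True if the given version string is at least 1.16.2."""
    return parse_version(installed) >= FIXED_VERSION


def installed_ctranslate2_version():
    """Return the installed ctranslate2 version string, or None if absent."""
    try:
        return version("ctranslate2")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    installed = installed_ctranslate2_version()
    if installed is None:
        print("ctranslate2 is not installed in this environment")
    elif has_cublas_fix(installed):
        print(f"ctranslate2 {installed}: includes the cuBLAS fix")
    else:
        print(f"ctranslate2 {installed}: older than 1.16.2, upgrade needed")
```

If this still reports 1.16.1, the upgrade probably went into a different Python environment than the one running the translation.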
This is unrelated to the current topic. You may want to open another one. (The random text is the ID of the Docker container that is running in the background.)