I recently switched to CTranslate2 and converted my OpenNMT-tf TransformerBig model with it, but I get a runtime error related to cuBLAS.
The error occurs whether I translate a batch of sentences or a single sentence. The model itself loads without any problem.
My environment is as follows:
tensorflow-gpu version: 2.3.1
CUDA version: 10.1
NVIDIA driver version: 430.64
NVCC version: 7.5.17
CTranslate2 version: 1.16.1
GPU model: RTX 2080 Ti with 11 GB VRAM
OS version: Linux 16.04
I have 7 GPUs available with no tasks running on them. I tried each one, and they all produce the same error. Running the old OpenNMT-tf version on them works without any problem.
Has anyone else experienced the same problem?
The code I used to run it is as follows: