Great work @guillaumekln on bringing us CTranslate2 for model inference optimization.
I have been using OpenNMT to train MT models, came across CTranslate2 recently, and gave it a try for inference on CPU.
As a first step, I successfully converted my model to an int16-quantized CTranslate2 model, following a discussion on this forum. However, I am not able to use the converted model for inference.
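For reference, the conversion command I ran was roughly the following (model.pt is a placeholder for my OpenNMT-py checkpoint):

ct2-opennmt-py-converter --model_path model.pt --output_dir ned2eng123_ctranslate2 --quantization int16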
I tried running inference with the Docker image, following the README file:
docker run --rm opennmt/ctranslate2:latest-ubuntu18-gpu --model /data/ned2eng123_ctranslate2/
Docker containers can't access files from the host unless they are mounted with the --volume option.
The easiest approach is to copy testfile.txt into //c/Users/sprakash/Documents/ned2eng123_ctranslate2 and reference it as /data/testfile.txt on the command line.
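For example, something along these lines should work on Docker for Windows, assuming the converted model files sit directly in that folder (if your version of the client does not take a --src option, you can instead pipe the file through stdin with docker run -i):

docker run --rm -v //c/Users/sprakash/Documents/ned2eng123_ctranslate2:/data opennmt/ctranslate2:latest-ubuntu18-gpu --model /data --src /data/testfile.txt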
Yes, it's working now. Thanks a lot.
But the output file seems to give strange results, like "_2" for "Hello world" in testfile.txt (the input). I am using the pre-trained English-German model from OpenNMT for inference.
Sorry for the delayed response. I did try with SentencePiece-tokenized input text and mounted it along with the converted model in the Docker image. In the output, I am getting an UNK token for each corresponding token in the source file.
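For completeness, the tokenization step looked roughly like this, with sentencepiece.model standing in for the SentencePiece model file distributed with the pretrained checkpoint:

spm_encode --model=sentencepiece.model < testfile.txt > testfile.tok.txt

Could the all-UNK output mean that the SentencePiece model I used does not match the vocabulary the translation model was trained with?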