CTranslate2 C++ API returns strange results when initializing 2 models


Following the CTranslate2 documentation, I tried to use the C++ API to load converted models and translate tokens. However, it seems that when loading two models at the same time, the second translation becomes erroneous.

Namely, if we define

#include <iostream>
#include <ctranslate2/translator.h>

// Load the English-German model and translate one tokenized sentence.
void f() {
    ctranslate2::Translator translator("ende_ctranslate2/", ctranslate2::Device::CPU);
    ctranslate2::TranslationResult result = translator.translate({"▁H", "ello", "▁world", "!"});

    for (const auto& token : result.output())
        std::cout << token << ' ';
    std::cout << std::endl;
}

and then

// Load the smaller Arabic-English transliteration model and translate one word.
void g() {
    ctranslate2::Translator translator2("tests/data/models/v2/aren-transliteration", ctranslate2::Device::CPU);
    ctranslate2::TranslationResult result = translator2.translate({"آ", "ت", "ز", "م", "و", "ن"});

    for (const auto& token : result.output())
        std::cout << token << ' ';
    std::cout << std::endl;
}

Calling f and then g (from a main function), I got

▁Hallo ▁Welt ! 
m u s t 

as output, whereas calling g and then f, I got

a t z m o n 
▁Hallo !

It is strange that the first translation could impact the second one.

Could you please give some insight on this?

Thank you!


That’s interesting. I can reproduce the issue: more specifically, it happens when loading 2 models with different dimensions (here, aren-transliteration is a smaller Transformer model). Thanks for testing and reporting this case!

I have identified the issue and will push a fix soon.

Should be fixed with:

Hi Guillaume,

Thank you for your investigation and the quick fix!

By the way, the Python API does not seem to suffer from this issue (tested in the Python shell of the latest ubuntu-18 Docker image). Is that because the underlying wrapper uses TranslatorPool instead of Translator?

Yes. The issue occurred when running 2 models in the same thread, whereas TranslatorPool creates a new thread for each model.
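
For anyone on a version without the fix, the explanation above suggests a possible workaround: run each model in its own thread. The following is only a minimal sketch under stated assumptions. run_in_own_thread is a hypothetical helper, not part of CTranslate2, and cached_size_for is a toy stand-in for "state kept per thread and sized by the first model"; whether the actual bug involved thread-local state is an assumption, since the thread only says it depends on the models sharing a thread and having different dimensions.

```cpp
#include <cstddef>
#include <functional>
#include <thread>

// Hypothetical helper (not a CTranslate2 API): runs `task` on a freshly
// created thread and waits for it to finish, so the task does not share
// per-thread state with the caller or with other tasks.
void run_in_own_thread(const std::function<void()>& task) {
    std::thread worker([&task] { task(); });
    worker.join();  // join() synchronizes with the worker before returning
}

// Toy stand-in for per-thread state (an assumption for illustration only):
// the first call on a given thread fixes the cached size, and later calls
// on that same thread reuse it even if a different dimension is requested.
std::size_t cached_size_for(std::size_t model_dim) {
    thread_local std::size_t cached = 0;
    if (cached == 0)
        cached = model_dim;
    return cached;
}
```

With the functions from the question, the calls in main would become run_in_own_thread(f); followed by run_in_own_thread(g);. In the toy example, two calls to cached_size_for with different dimensions on the same thread return the first dimension both times, while wrapping each call in run_in_own_thread gives each one its own value.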