Model.bin file from Opus-MT conversion not cross-platform?

Hi, the conversion script works perfectly on Windows 11 and Ubuntu 20.04 but does not appear to be cross-platform, i.e. I need to do a separate conversion for each platform. Is that the case?
It’s no big deal, but the CTranslate2 models converted from OpenNMT-tf have been cross-platform in my experience.

Hi,

What do you mean by “not cross-platform”? What’s the issue?

Well, I generate the model.bin file on my Windows laptop and it translates perfectly. If I copy that model with the *.spm files to my Linux machine, it just gives me garbage. I need to do a separate conversion on the Linux machine.

Maybe the CTranslate2 version on the Linux machine is older than the version on the Windows machine?

No, both are 2.14 and both resulting models translate perfectly when converted on the respective machines. It’s no big deal, just curious :slight_smile:

I think I can reproduce this issue. The model converted on Linux works on both Linux and Windows, but the model converted on Windows produces incorrect results on Linux.

Thanks, Guillaume. Now I know I’ll start the conversions on Linux…

The issue is that Python on Windows silently converts the newline character \n to \r\n when writing files in text mode. So the vocabulary files generated on Windows are not correctly loaded on other platforms.
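
For anyone curious, here is a rough sketch of that behavior. It is not the converter's actual code: the function names and file names are made up for illustration, and the loader that splits only on \n is an assumption about how a reader on another platform might go wrong. Passing newline="\r\n" to open() forces the same translation Windows applies by default, so the effect can be reproduced on any OS.

```python
import os
import tempfile

def write_vocab(path, tokens, newline=None):
    # Text mode: every "\n" written is translated according to `newline`.
    # newline=None uses the platform default ("\r\n" on Windows).
    with open(path, "w", encoding="utf-8", newline=newline) as f:
        for token in tokens:
            f.write(token + "\n")

def load_vocab_splitting_on_lf(path):
    # Hypothetical loader that treats only LF as the line separator,
    # so a CR left in the file stays attached to the token.
    with open(path, "rb") as f:
        data = f.read().decode("utf-8")
    return data.split("\n")[:-1]  # drop the empty piece after the final LF

tokens = ["<unk>", "hello", "world"]
tmpdir = tempfile.mkdtemp()

# Simulate the Windows default on any platform.
windows_path = os.path.join(tmpdir, "vocab_windows.txt")
write_vocab(windows_path, tokens, newline="\r\n")

# Force LF regardless of platform.
fixed_path = os.path.join(tmpdir, "vocab_fixed.txt")
write_vocab(fixed_path, tokens, newline="\n")

print(load_vocab_splitting_on_lf(windows_path))  # tokens carry a trailing CR
print(load_vocab_splitting_on_lf(fixed_path))    # tokens match the originals
```

With the trailing \r attached, none of the entries match the tokens the model expects, which would explain garbage output rather than an outright error.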

I updated the converter to always use \n as the newline character:
