CTranslate2 2.0 release

A few days ago we released a new major version of CTranslate2. It contains a collection of small breaking changes that make it easier to add new features and improvements. Here are the key points:

  • Update the Python wheels to CUDA 11 and reduce their size:
    • Linux: 77MB → 22MB
    • macOS: 35MB → 5MB
  • Support conversion of Transformer models trained with Fairseq
  • Support conversion of more Transformer variants: Post-Norm, GELU activations
  • Automatically detect the model specification when converting OpenNMT-py models
  • Add new decoding options to improve the beam search output in some cases
  • Improve the C++ asynchronous translation API and add a wrapper to buffer and batch incoming inputs
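
For the conversion-related items above, the command-line usage looks roughly like the following. The checkpoint and directory names are placeholders; note that with automatic specification detection, an explicit model specification option is no longer needed for OpenNMT-py models:

```shell
# Upgrade to the new CUDA 11 wheels
pip install --upgrade ctranslate2

# Convert an OpenNMT-py checkpoint (specification auto-detected in 2.0)
ct2-opennmt-py-converter --model_path averaged-10-epoch.pt \
    --output_dir ende_ctranslate2

# Convert a Fairseq Transformer model (newly supported in 2.0)
ct2-fairseq-converter --model_path checkpoint_best.pt --data_dir data-bin \
    --output_dir fairseq_ctranslate2
```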

See more details in the latest release notes:

In most cases upgrading should be very easy. Please let me know if you have any issues or questions!


@guillaumekln Thanks and congratulations! Would you therefore advise me to stop my Windows building and start again with this major version seeing there are breaking changes? Regards, Terence

You probably don’t need to start again. The breaking changes are quite small, and I don’t think any of them would actually impact your Windows build. So you can update whenever you want.


Nice! Looks like upgrading Argos Translate should be pretty straightforward.


This CTranslate2 version requires CUDA 11 to work. If we are using the server from OpenNMT-py, I can’t use this newer version: OpenNMT-py requires torch==1.6, which is only available for CUDA<=10.2.

Is there any solution, apart from using CTranslate2’s previous version?


In practice OpenNMT-py should work with newer PyTorch versions. So using a more recent PyTorch version compiled with CUDA 11 is one solution.

Another solution is to recompile the CTranslate2 package with an older CUDA version. We are doing that internally since we also can’t update to CUDA 11 for other reasons.
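
As a rough sketch, a source build pinned to an older CUDA toolkit might look like the following. The CMake flags are the ones used by the CTranslate2 build; the CUDA toolkit path is an assumption for your system:

```shell
git clone --recursive https://github.com/OpenNMT/CTranslate2.git
cd CTranslate2 && mkdir build && cd build

# Point the build at an older toolkit instead of CUDA 11
cmake .. -DWITH_CUDA=ON -DWITH_MKL=ON \
    -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-10.2
make -j4 && sudo make install

# Then build the Python package against this library
cd ../python && pip install .
```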

Installing a newer PyTorch version (1.9) fails because OpenNMT-py says it requires 1.6.

To install OpenNMT-py I use pip:

pip install OpenNMT-py
pip install ctranslate2

You can upgrade PyTorch after installing OpenNMT-py. You will get a warning as OpenNMT-py says it requires 1.6, but the translation server should work fine.
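
Concretely, the sequence might look like this. The `+cu111` build tag and version number below are assumptions; pick whatever CUDA 11 build of PyTorch matches your driver:

```shell
pip install OpenNMT-py
pip install ctranslate2

# Upgrade PyTorch to a CUDA 11 build afterwards. pip prints a dependency
# warning about the torch==1.6.0 pin, which can be ignored for the server.
pip install torch==1.8.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html
```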

There is an open issue for OpenNMT-py to relax its PyTorch requirement:

Hi @guillaumekln ,

Trying a fresh build on a Mac M1, so I updated to the most recent CTranslate2 version. Where is the max_batch_size option? 🙂

NVM, found it, sorry for the noise.

To train a model, PyTorch > 1.6 does not work. Can’t we train a model with CUDA 11?

pkg_resources.DistributionNotFound: The 'torch==1.6.0' distribution was not found and is required by OpenNMT-py

I don’t think PyTorch 1.6 was released with CUDA 11. Alternatively, you can run the training in a Docker image such as pytorch/pytorch:1.6.0-cuda10.1-cudnn7-runtime so that you don’t need to install multiple CUDA versions on the host system.
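
For reference, running the training inside that image might look like the following. The volume mount, config file name, and `--gpus` flag are assumptions (the latter requires the NVIDIA Container Toolkit on the host):

```shell
docker run --gpus all --rm -it \
    -v $PWD:/workspace \
    pytorch/pytorch:1.6.0-cuda10.1-cudnn7-runtime \
    bash -c "pip install OpenNMT-py && onmt_train -config /workspace/config.yaml"
```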

Note that this is not directly related to CTranslate2, so I suggest that you open another topic or post on the GitHub issue referenced above.

4 posts were split to a new topic: Convert M2M model to CTranslate2

  • Improve the C++ asynchronous translation API and add a wrapper to buffer and batch incoming inputs

Is there any sample available as an example for this?

It looks like you have the same question in Async translation for ctranslate2 model. Let’s continue the discussion there.