
Proper configuration for server

Hi,
I’ve been digging around in the code for a while, but it is not clear to me which arguments are necessary. I guess “model” and “ct2_model” are not both required at the same time…
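For reference, here is roughly the conf.json I am experimenting with (a sketch only; I am not sure these keys are exactly right, and the paths are placeholders):

```json
{
    "models_root": "./available_models",
    "models": [
        {
            "id": 100,
            "model": "model.pt",
            "ct2_model": "ct2_model_dir",
            "timeout": 600,
            "opt": {
                "gpu": 0,
                "beam_size": 5
            }
        }
    ]
}
```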
Thanks

Hi there,

You can refer to the PR in which it was introduced.

Thanks!

Does it have ensembling options like the PyTorch models do?

“model” -> “models”
“ct2_model” -> “ct2_models”??

Hmm, I don’t think ensemble decoding is supported in CTranslate2. Not sure if it’s intended to be supported at some point, @guillaumekln?

We don’t have plans to add ensemble decoding in CTranslate2.

Ok. Thanks!

Hi,

How can I load the CTranslate2 model on CPU only (like the device option in the CLI)? Are there any options in the server for inter_threads and intra_threads?

EDIT:
I’ve found that setting the gpu option to -1 makes the server run the translations on CPU.
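For example, in the model entry of conf.json (an excerpt of the sketch above):

```json
"opt": {
    "gpu": -1
}
```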

Furthermore, the inter_threads and intra_threads options are hardcoded in the server, both to 1. Changing these values (e.g. to 4) doesn’t show the same behaviour as translate_file from the Python API.
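For comparison, this is the Python API behaviour I am referring to (a sketch; the model directory and file paths are placeholders):

```python
import ctranslate2

# With inter_threads=4, translate_file can process up to 4 batches in parallel.
translator = ctranslate2.Translator("ct2_model_dir", device="cpu",
                                    inter_threads=4, intra_threads=1)
translator.translate_file("input.tok.txt", "output.tok.txt", max_batch_size=32)
```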

As stated in the CTranslate2 repository:

For CPU translations, the parameter inter_threads controls the number of batches a Translator instance can process in parallel. The translate_file method automatically takes advantage of this parallelization.
However, extra work may be needed when using the translate_batch method because multiple translations should be started concurrently from Python. If you are using a multithreaded HTTP server, this may already be the case. For other cases, you could use a ThreadPoolExecutor to submit multiple translations.
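In code, the ThreadPoolExecutor approach the docs describe would look something like this (a minimal sketch, assuming a CPU model in a placeholder ct2_model_dir and pre-tokenized input):

```python
import concurrent.futures
import ctranslate2

# A Translator with inter_threads=4 can run up to 4 translations in parallel.
translator = ctranslate2.Translator("ct2_model_dir", device="cpu",
                                    inter_threads=4, intra_threads=1)

batches = [
    [["▁Hello", "▁world"]],
    [["▁How", "▁are", "▁you"]],
]

# translate_batch translates a single batch; to benefit from inter_threads,
# several calls have to be started concurrently, e.g. from a thread pool.
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    futures = [executor.submit(translator.translate_batch, batch)
               for batch in batches]
    for future in futures:
        print(future.result())
```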

translate_batch is used in the server, but it is not started concurrently, so inter_threads can effectively only be 1.

How did you verify it is not parallelized? Did you send parallel translation requests to the server?
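For example, with something like this (a sketch, assuming the server runs on localhost:5000 with the default /translator/translate endpoint and a model with id 100):

```python
import concurrent.futures
import requests

URL = "http://localhost:5000/translator/translate"

def send(text):
    # The server expects a JSON list of {"src", "id"} objects.
    return requests.post(URL, json=[{"src": text, "id": 100}]).json()

texts = ["Hello world."] * 8

# Fire the requests concurrently; if inter_threads > 1 is honoured,
# total wall-clock time should be well below 8x a single request.
with concurrent.futures.ThreadPoolExecutor(max_workers=8) as executor:
    results = list(executor.map(send, texts))

print(results)
```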