I’ve been digging around in the code integration for a while, but it is not clear to me which arguments are required. I guess “model” and “ct2_model” are not required at the same time…
You can refer to the PR in which it was introduced.
Does it have ensembling options like PyTorch models do?
“model” -> “models”
“ct2_model” -> “ct2_models”??
Hmm I don’t think ensemble decoding is supported in CTranslate2. Not sure if it’s intended to be supported at some point @guillaumekln?
We don’t have plans to add ensemble decoding in CTranslate2.
How can I load the CTranslate2 model on CPU only (the device option in the CLI)? Are there any options in the server for this?
I’ve seen that setting the gpu option to -1 executes the translations on the CPU.
The inter_threads and intra_threads options are hardcoded, both to 1. Changing these values to 4 doesn’t show the same behaviour as with the Python API.
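For reference, here is a minimal sketch of how those settings map onto the CTranslate2 Python API. The helper name and default values are mine, not the server’s actual code; the keyword arguments (device, device_index, inter_threads, intra_threads) are the ones accepted by ctranslate2.Translator.

```python
def ct2_translator_opts(gpu=-1, inter_threads=1, intra_threads=4):
    """Build keyword arguments for ctranslate2.Translator.

    gpu == -1 selects CPU-only execution (mirroring the server's gpu
    option); inter_threads/intra_threads are the values the server
    currently hardcodes to 1.
    """
    opts = {"inter_threads": inter_threads, "intra_threads": intra_threads}
    if gpu < 0:
        opts["device"] = "cpu"
    else:
        opts["device"] = "cuda"
        opts["device_index"] = gpu
    return opts

# In real use (assuming the ctranslate2 package is installed):
#   import ctranslate2
#   translator = ctranslate2.Translator(model_dir, **ct2_translator_opts())
```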
As stated in the CTranslate2 repository:
For CPU translations, the parameter inter_threads controls the number of batches a Translator instance can process in parallel. The translate_file method automatically takes advantage of this parallelization. However, extra work may be needed when using the translate_batch method because multiple translations should be started concurrently from Python. If you are using a multithreaded HTTP server, this may already be the case. For other cases, you could use a ThreadPoolExecutor to submit multiple translations.
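To illustrate that last point, here is a minimal sketch of submitting batches concurrently with a ThreadPoolExecutor. The translate_batch function below is a dummy stand-in, not the CTranslate2 API; in a real setup it would be a call to a ctranslate2.Translator created with inter_threads > 1.

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder standing in for translator.translate_batch; a real setup
# would call a ctranslate2.Translator built with inter_threads > 1.
def translate_batch(batch):
    return [tokens[::-1] for tokens in batch]  # dummy "translation"

batches = [[["hello", "world"]], [["good", "morning"]]]

# Submitting the batches from multiple Python threads is what lets
# CTranslate2 process them in parallel when inter_threads > 1.
with ThreadPoolExecutor(max_workers=2) as executor:
    results = list(executor.map(translate_batch, batches))
```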
translate_batch is used in the server, but it is not parallelized, therefore inter_threads can only be 1.
How did you verify it is not parallelized? Did you send parallel translation requests to the server?