Multiple GPUs for translation

Can I use multiple GPUs for the translation server? If yes, how do I do that?

If you use OpenNMT-py, add the options below:

--gpuid, -gpuid

Deprecated, see world_size and gpu_ranks.

Default: []

--gpu_ranks, -gpu_ranks

List of ranks of each process.

Default: []

--world_size, -world_size

Total number of distributed processes.

Default: 1
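As a sketch of how these options fit together in a training config, assuming a two-GPU machine (the data and save paths below are placeholders, not from this thread):

```yaml
# Sketch of an OpenNMT-py training config using two GPUs.
# Paths are hypothetical placeholders.
save_model: run/model
data:
  corpus_1:
    path_src: data/src-train.txt
    path_tgt: data/tgt-train.txt
# One process per GPU: world_size is the total process count,
# gpu_ranks lists the rank assigned to each process.
world_size: 2
gpu_ranks: [0, 1]
```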

If you use OpenNMT-tf, add the option below:

--num_gpus 4

And set


Hi park,
is this applicable for translation as well? I know that I can use multiple GPUs while training, but what I am asking about here is translation only.

Multi-GPU can be applied to all tasks.

I tried adding it to conf.json for my translation server like this, but it was accepting only one GPU argument, not an array:

{
  "models_root": "./available_models",
  "models": [
    {
      "id": 1,
      "model": "",
      "timeout": 600,
      "on_timeout": "to_cpu",
      "load": true,
      "opt": {
        "beam_size": 5,
Can you tell me what I am missing?

That’s not correct. Only training can make use of multiple GPUs.

For a translation server, the recommended approach is to start a server on each GPU and have an external system that does load balancing. This external system is not part of OpenNMT.
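To illustrate the idea, here is a minimal sketch of the load-balancing piece only, assuming (hypothetically) that one translation server was started per GPU on ports 5000 and 5001. The class name, ports, and URLs are illustrative, not part of OpenNMT:

```python
from itertools import cycle

class RoundRobinBalancer:
    """Hand out backend server URLs in round-robin order.

    Hypothetical sketch: each URL would point at a separate
    translation server process pinned to its own GPU.
    """

    def __init__(self, backends):
        self._backends = cycle(backends)

    def next_backend(self):
        # Return the next backend URL in rotation.
        return next(self._backends)

balancer = RoundRobinBalancer([
    "http://localhost:5000",  # server assumed to run on GPU 0
    "http://localhost:5001",  # server assumed to run on GPU 1
])
# Four consecutive requests alternate between the two servers.
picks = [balancer.next_backend() for _ in range(4)]
```

In practice the balancing would sit in a reverse proxy (nginx, HAProxy, etc.) rather than application code, but the rotation logic is the same.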


Okay, thanks for your reply.

In case anyone is searching for this error message: I had a similar issue. I copied gpu_ranks: [0, 1] and world_size: 2 from the docs on a machine with only one GPU and got the error:

RuntimeError: cuda runtime error (101) : invalid device ordinal at /pytorch/torch/csrc/cuda/Module.cpp:59

Changing to gpu_ranks: [0] and world_size: 1 fixed the issue for me.
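The error above comes from requesting a GPU rank that does not exist on the machine. A small sanity-check sketch of that constraint, assuming the visible device count is obtained elsewhere (e.g. from torch.cuda.device_count()); the helper name is hypothetical:

```python
def check_gpu_config(world_size, gpu_ranks, device_count):
    """Raise ValueError if the distributed config cannot fit the machine.

    Hypothetical helper: mirrors the condition under which CUDA raises
    'invalid device ordinal', i.e. a rank >= the number of visible devices.
    """
    if len(gpu_ranks) != world_size:
        raise ValueError(
            f"world_size={world_size} but {len(gpu_ranks)} gpu_ranks given"
        )
    bad = [r for r in gpu_ranks if r < 0 or r >= device_count]
    if bad:
        raise ValueError(
            f"gpu_ranks {bad} invalid on a machine with "
            f"{device_count} visible device(s)"
        )

# With one visible GPU, the two-GPU config from the docs fails...
try:
    check_gpu_config(world_size=2, gpu_ranks=[0, 1], device_count=1)
    config_rejected = False
except ValueError:
    config_rejected = True

# ...while the corrected single-GPU config passes.
check_gpu_config(world_size=1, gpu_ranks=[0], device_count=1)
```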