To TRAIN on multiple GPUs I can use CUDA like this:
CUDA_VISIBLE_DEVICES=0,1,2,3 onmt_train -data . . . . -world_size 4 -gpu_ranks 0 1 2 3 &
but TRANSLATE doesn’t seem to have the -world_size or -gpu_ranks options, as far as I can tell.
Q1. How EXACTLY do I run translate on a GPU (even a single GPU) if onmt_translate doesn’t have the -world_size/-gpu_ranks options?
Q2. Is it true, as other posts on this forum suggest, that translate only works with one GPU at a time?
Q3. Is translate on a CPU really expected to be this slow? I am seeing about 600 ms per sentence.