I am working on multi-node training with OpenNMT. I have two nodes, each with 4 GPUs. To train OpenNMT across both nodes (8 GPUs total), do I just set the world size to 8 and gpu_ranks to [0, 1, 2, 3, 4, 5, 6, 7]? Is there anything else I need to change for two-node training? For the communication layer I am using Open MPI. Thanks!
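For concreteness, here is a sketch of the two launch commands as I currently understand them (flag names taken from OpenNMT-py's train.py; the master IP, port, and data paths below are placeholders I made up):

```shell
# On node 0 (acting as master; 10.0.0.1 is a placeholder IP):
python train.py -data data/demo -save_model demo-model \
    -world_size 8 -gpu_ranks 0 1 2 3 \
    -master_ip 10.0.0.1 -master_port 10000

# On node 1, pointing at the same master:
python train.py -data data/demo -save_model demo-model \
    -world_size 8 -gpu_ranks 4 5 6 7 \
    -master_ip 10.0.0.1 -master_port 10000
```

Is splitting gpu_ranks across the two nodes like this correct, or should each node instead list all eight ranks?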