To work around the brnn training problem, I tried training with only one GPU, and everything worked fine this time.
However, I found that training on a single GPU is actually a little faster than on two GPUs.
To confirm this, I trained a new model with all default parameters again (the same commands as the quick start, changing only the data). Now I am fairly sure that, when training a seq2seq model on my data, two GPUs are slower than one.
Could somebody tell me whether this is normal? Why does it happen? Or did I do something wrong?
With the default settings, the trained model is fairly small and the sequences are not very long, which means one iteration is relatively cheap to compute. On the other hand, transferring data from one GPU to another is not cheap, especially on consumer-grade cards.
To gain a speedup from multiple GPUs, you want to reduce the data transfer time (better hardware or GPU libraries) and increase the computation per iteration (a larger model, a larger batch size, etc.) so that the transfer cost is amortized.
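If it helps to see the effect concretely, here is a minimal sketch (assuming PyTorch and two visible CUDA devices; the layer sizes and batch size are made up for illustration, not taken from any toolkit's defaults) that times a cheap single-GPU iteration against a GPU-to-GPU copy of the model's parameters, a rough stand-in for the per-iteration gradient exchange in data-parallel training:

```python
import time

import torch

# Sketch only: assumes two visible CUDA devices; sizes are arbitrary.
device0, device1 = torch.device("cuda:0"), torch.device("cuda:1")

# A deliberately small model, so each iteration is cheap to compute.
model = torch.nn.Sequential(
    torch.nn.Linear(500, 500),
    torch.nn.ReLU(),
    torch.nn.Linear(500, 500),
).to(device0)

batch = torch.randn(64, 500, device=device0)


def timed(fn, iters=50):
    """Average wall-clock time per call, with both GPUs synchronized."""
    torch.cuda.synchronize(device0)
    torch.cuda.synchronize(device1)
    start = time.time()
    for _ in range(iters):
        fn()
    torch.cuda.synchronize(device0)
    torch.cuda.synchronize(device1)
    return (time.time() - start) / iters


def step():
    """One training-style iteration (forward + backward) on a single GPU."""
    model.zero_grad()
    model(batch).sum().backward()


params = [p.detach() for p in model.parameters()]


def transfer():
    """Copy the parameters to the other GPU and back."""
    for p in params:
        p.to(device1).to(device0)


print(f"compute per iteration:  {timed(step) * 1e3:.2f} ms")
print(f"transfer per iteration: {timed(transfer) * 1e3:.2f} ms")
```

When the transfer time is in the same range as the compute time, a second GPU has little to gain. Making the model or batch larger grows the compute time while the transfer time stays roughly fixed, which is why a bigger workload is what makes the extra card pay off.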