Hi, I want to use OpenNMT to do some prediction. However, I found that the beam search module accounts for most of the time spent. Are there any suggestions other than reducing the beam size and batch size?
Increasing the batch size is actually a way to decrease the average translation time per sentence (at the cost of higher memory usage).
As for the beam search, it is costly mostly because we have to forward -beam_size times more sequences through the network. The only trick I can think of right now is to not use beam search if translation speed is critical.
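To make the cost concrete, here is a rough back-of-the-envelope sketch in Python. It is not OpenNMT code: the helper name and the numbers are made up for illustration, and it assumes every sentence in the batch keeps beam_size live hypotheses at each decoding step.

```python
def decoder_rows_per_step(batch_size, beam_size):
    """Hypothetical helper: sequences forwarded through the decoder per step,
    assuming each sentence keeps beam_size live hypotheses."""
    return batch_size * beam_size


if __name__ == "__main__":
    # Illustrative numbers only; the batch size of 30 is an arbitrary choice.
    for beam in (1, 5, 10):
        print(f"beam_size={beam}: {decoder_rows_per_step(30, beam)} sequences per step")
```

Under that assumption the decoder work per step grows linearly with the beam size, which is why a smaller beam is the most direct lever.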
Are you running on GPU?
I am running on CPU. Is there any other method that I can use instead of beam search?
You can use greedy search (a beam size of 1). Translation speed will go up, but accuracy will go down.
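For intuition about the trade-off, here is a minimal toy sketch in Python. It is not how OpenNMT implements decoding: the probability table and the beam_search function are invented for illustration, and the "model" only looks at the previous token. With a beam of 1 the search is greedy and commits to the locally best first token; with a beam of 2 it finds a sequence with higher overall probability.

```python
import math

# Toy next-token probabilities, invented for illustration.
# The "model" conditions only on the previous token.
PROBS = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.55, "y": 0.45},
    "b": {"x": 0.9, "y": 0.1},
    "x": {"</s>": 1.0},
    "y": {"</s>": 1.0},
}


def beam_search(beam_size, max_len=5):
    """Return (tokens, log_prob) of the best finished hypothesis."""
    beams = [(["<s>"], 0.0)]
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = []
        for tokens, score in beams:
            for tok, p in PROBS[tokens[-1]].items():
                candidates.append((tokens + [tok], score + math.log(p)))
        # Keep only the beam_size highest-scoring candidates.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            if tokens[-1] == "</s>":
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    return max(finished, key=lambda c: c[1])


if __name__ == "__main__":
    for size in (1, 2):
        tokens, score = beam_search(size)
        print(size, " ".join(tokens), round(math.exp(score), 2))
```

In this toy case greedy search picks "a" first because 0.6 > 0.4 and ends with total probability 0.33, while a beam of 2 also keeps "b" alive and ends with 0.36. The wider beam also expands twice as many hypotheses per step, which is the cost discussed above.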
Thanks, I will try that.