Increasing the speed of translation

I am currently hosting a trained model online and trying to translate stuff with the model on-the-go. I have lowered the beam_size, but I realized a lot of time is wasted on initializing the model over and over again. Does anyone know:

  1. How else can I increase the speed of translation?
  2. How can I configure the model or translate.lua to stay active and “listen” for calls for translation so that the model need not spend time being terminated and reinitialized subsequently?



Are you translating on CPU or GPU?

You probably want to use one of the provided translation servers.

I’m using GPU to translate. I’ve tried using the translation servers and it works perfectly, thank you.


What kind of client/server setup are you using? Is it a single-sentence task or a multi-sentence task?

The REST API server is pretty fast for sentence-by-sentence tasks, but I'm not sure what happens with a huge model.
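For reference, calling a REST translation server from a client is just an HTTP POST with a JSON body. Here is a minimal Python sketch; the URL, port, and payload shape (`{"src": ...}` per sentence) are assumptions based on a typical OpenNMT REST server setup, so check your own server's startup options and documentation for the actual values.

```python
import json
import urllib.request

# Hypothetical endpoint: adjust host, port, and path to match
# how your translation server was actually launched.
SERVER_URL = "http://localhost:7784/translator/translate"

def build_payload(sentences):
    """Build the JSON request body: one {"src": ...} object per sentence."""
    return json.dumps([{"src": s} for s in sentences]).encode("utf-8")

def translate(sentences, url=SERVER_URL):
    """POST the sentences to the server and return the parsed JSON response."""
    req = urllib.request.Request(
        url,
        data=build_payload(sentences),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

if __name__ == "__main__":
    # Requires a running translation server at SERVER_URL.
    print(translate(["Hello world ."]))
```

Because the model stays loaded in the server process, each request only pays for decoding, not for reloading the checkpoint.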

Have a nice day
miguel canals