Speeding up the training time

zlindaz2411 · March 1, 2021, 1:04am

I’m using OpenNMT-tf to train on a corpus of 2 million sentences, with the default parameters, on Google Colab. However, training is really slow: it takes more than 3 hours to finish just 5000 steps. Is there any way to speed up the training?

guillaumekln · March 1, 2021, 8:42am

Steps per hour is not a good metric to measure training speed as it depends on the batch size and gradient accumulation. What are the reported “source tokens/s” and “target tokens/s”?
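To illustrate why the step rate alone can mislead, here is a minimal sketch. The function name and all numbers are hypothetical and chosen only for illustration; it assumes token-based batches of a fixed size.

```python
def tokens_per_second(steps_per_second, tokens_per_batch, accum_steps=1):
    """Tokens processed per second for a given step rate and effective batch size."""
    return steps_per_second * tokens_per_batch * accum_steps

# Hypothetical run A: small batches, no gradient accumulation, many steps.
run_a = tokens_per_second(steps_per_second=0.8, tokens_per_batch=2048)

# Hypothetical run B: same batch size, gradients accumulated over 8 batches,
# so each step covers 8 times as many tokens and the step rate is 8 times lower.
run_b = tokens_per_second(steps_per_second=0.1, tokens_per_batch=2048, accum_steps=8)

print(f"run A: {run_a:.0f} tokens/s, run B: {run_b:.0f} tokens/s")
```

Both hypothetical runs process the same number of tokens per second even though run A completes eight times as many steps per hour, which is why the tokens/s figures are the more useful number to compare.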

zlindaz2411 · March 1, 2021, 9:41am

I’m completely new to OpenNMT, so I’m not sure whether these are the source and target token rates you mean:
Step = 100 ; steps/s = 0.09, source words/s = 1608, target words/s = 1724 ; Learning rate = 0.000012 ; Loss = 10.085119
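For reference, a log line in this format can be read programmatically. The sketch below is only an illustration: the helper name `parse_log_line` is hypothetical, and the regular expression assumes the exact `name = number` layout of the line above. Dividing the words/s figures by steps/s gives a rough estimate of how many words each step covers, which is approximate here because the logged steps/s value is rounded.

```python
import re

LOG_LINE = ("Step = 100 ; steps/s = 0.09, source words/s = 1608, "
            "target words/s = 1724 ; Learning rate = 0.000012 ; Loss = 10.085119")

def parse_log_line(line):
    """Extract 'name = number' pairs from a training log line into a dict."""
    pairs = re.findall(r"([A-Za-z][A-Za-z/ ]*?)\s*=\s*([0-9.]+)", line)
    return {name.strip().lower(): float(value) for name, value in pairs}

stats = parse_log_line(LOG_LINE)

# Words per step = words per second / steps per second (rough estimate).
source_words_per_step = stats["source words/s"] / stats["steps/s"]
target_words_per_step = stats["target words/s"] / stats["steps/s"]

print(f"source words/s: {stats['source words/s']:.0f}")
print(f"target words/s: {stats['target words/s']:.0f}")
print(f"approx. source words per step: {source_words_per_step:.0f}")
print(f"approx. target words per step: {target_words_per_step:.0f}")
```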

My source and target vocabularies each contain 50000 entries.

guillaumekln · March 1, 2021, 10:01am

Can you post the full training logs?

Note that the default GPU on Google Colab is not very powerful, but it’s always possible to make the training faster as long as you can accept a decrease in translation quality.

zlindaz2411 · March 1, 2021, 10:08am

I don’t have the full training logs. Yes, I can accept a decrease in translation quality.

guillaumekln · March 1, 2021, 1:05pm

Can you at least say which model type you used for the training? To make the training faster, you can just select or define a smaller model.
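As a rough illustration of what "smaller" can mean, the sketch below estimates the number of weights in an encoder-decoder Transformer. It is a back-of-the-envelope count, not anything OpenNMT-tf reports: the function name is hypothetical, biases and layer normalization are ignored, embeddings are assumed not to be shared, and the vocabulary size of 50000 is taken from the earlier post. The first configuration uses the standard "base" Transformer sizes; the second is an arbitrary smaller example.

```python
def transformer_params(num_layers, num_units, ffn_inner_dim, vocab_size=50000):
    """Rough weight count for an encoder-decoder Transformer (biases and layer norms ignored)."""
    embeddings = 2 * vocab_size * num_units        # source + target embeddings
    output_layer = vocab_size * num_units          # projection to the target vocabulary
    attention = 4 * num_units * num_units          # query, key, value and output projections
    ffn = 2 * num_units * ffn_inner_dim            # two feed-forward matrices
    encoder = num_layers * (attention + ffn)       # self-attention + feed-forward
    decoder = num_layers * (2 * attention + ffn)   # self-attention + cross-attention + feed-forward
    return embeddings + output_layer + encoder + decoder

# Standard "base" Transformer sizes.
base = transformer_params(num_layers=6, num_units=512, ffn_inner_dim=2048)

# Hypothetical smaller configuration, for illustration only.
small = transformer_params(num_layers=3, num_units=256, ffn_inner_dim=1024)

print(f"base:  {base / 1e6:.1f}M weights")
print(f"small: {small / 1e6:.1f}M weights")
```

The parameter count is only a proxy for training speed, but fewer layers and smaller dimensions mean less computation per token, at the cost of translation quality.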

zlindaz2411 · March 1, 2021, 1:59pm

I’m using the Transformer model.