Can I finish training a 12-layer Transformer within 24 hours on Google Colab, with a dataset containing 2.5M segments?
I have not tried training at that scale on Google Colab, but in general I do not think so. Besides, the free version of Google Colab limits sessions to about 12 hours anyhow. You can still use
train_from to continue from where you stopped, although I do not think it is a practical solution.
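For reference, resuming in OpenNMT-py is a matter of pointing train_from at the last saved checkpoint. A minimal sketch of the relevant config lines (the paths and step counts here are placeholders, not from your setup):

```yaml
# Resuming an interrupted run (sketch; checkpoint path and steps are placeholders).
save_model: run/model
save_checkpoint_steps: 5000
train_from: run/model_step_10000.pt  # last checkpoint written before the session ended
train_steps: 200000                  # the overall target, not additional steps
```

You would then relaunch the usual way, e.g. `onmt_train -config config.yaml`, and training continues from the checkpoint's step counter.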
As for 12 layers on 2.5M segments: unless you are following a specific paper, this seems like too many layers for this amount of data.
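For comparison, the standard Transformer-base recipe from "Attention Is All You Need" uses 6 encoder and 6 decoder layers, and that is the usual starting point at this data size. In OpenNMT-py config terms it looks roughly like this (a sketch; option names can vary slightly between versions, so check the docs for yours):

```yaml
# Transformer-base hyperparameters (sketch; verify option names for your version).
encoder_type: transformer
decoder_type: transformer
enc_layers: 6
dec_layers: 6
heads: 8
word_vec_size: 512
rnn_size: 512        # called hidden_size in more recent releases
transformer_ff: 2048
position_encoding: true
```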
It is clear you are new to OpenNMT, and this is okay; however, it makes sense to start with a simple project and make sure you master it before you move to a bigger one.
Finally, if you need help, you should first try things yourself and then be specific about the problem you have, explaining: 1) I did so; 2) I expected so; but 3) I got so. Otherwise, it is difficult for people to "predict" how to help you, even if they want to.
All the best,