I have been doing domain adaptation with OpenNMT for some months.
I just realized that I am using the same decay and warmup configuration for both the out-of-domain model and the in-domain model, so I would like to apply the best approach for the in-domain model.
What would you consider the right warmup strategy for in-domain training?
How exactly are you starting the training on the in-domain data?
Assuming you are training Transformer models with the automatic learning rate decay strategy (NoamDecay), it is generally enough to just continue the training on the new data without changing anything else.
If you continue training a pretrained model, it will resume from the state in which it stopped. This means that you will most likely be training the model with a very small learning rate. I would suggest adding a warmup period.
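To make the "very small learning rate" point concrete, here is a minimal sketch of the Noam schedule as given in the original Transformer paper, written in plain Python rather than with any OpenNMT API. The values `model_dim=512`, `warmup_steps=8000`, and `scale=2.0` are assumptions chosen for illustration, and `rewarmed_lr` is a hypothetical helper showing one possible way to add a short warmup after resuming, not something OpenNMT provides under that name.

```python
def noam_lr(step, model_dim=512, warmup_steps=8000, scale=2.0):
    """Noam schedule: linear warmup, then inverse square root decay.

    All default values are illustrative assumptions.
    """
    step = max(step, 1)  # avoid 0 ** -0.5 at the very first step
    return scale * model_dim ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


def rewarmed_lr(step, resume_step, extra_warmup=1000, **schedule_args):
    """Hypothetical helper: ramp linearly from 0 up to the scheduled
    learning rate over `extra_warmup` steps after resuming at `resume_step`."""
    ramp = min(1.0, max(0.0, (step - resume_step) / extra_warmup))
    return ramp * noam_lr(step, **schedule_args)


if __name__ == "__main__":
    peak = noam_lr(8000)        # end of the original warmup
    resumed = noam_lr(200000)   # assumed step at which in-domain training resumes
    print(f"peak lr:    {peak:.6f}")
    print(f"resumed lr: {resumed:.6f}")
    print(f"ratio:      {resumed / peak:.2f}")
    # Learning rate halfway through the extra warmup after resuming.
    print(f"rewarmed:   {rewarmed_lr(200500, resume_step=200000):.6f}")
```

Under these assumed values, the learning rate at the resumed step is a fraction sqrt(8000 / 200000) of the peak, which is why continuing from the old step count may already be gentle enough, and why any extra warmup only needs to be short.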