OpenNMT

Retraining model with new datasets

We have a model trained on a base dataset; the language pair doesn't matter.
When we retrain the model on a new dataset (for example, medical sentence pairs), BLEU on the base dataset's validation set drops dramatically after the first iteration. Where can I find docs, a white paper, or anything else that describes the right way to retrain, ideally without needing the base datasets?

You are looking for domain adaptation techniques.

See this tutorial by @ymoslem:


@guillaumekln thanks for the answer.

Incremental Training runs into “catastrophic forgetting”; that is the case that prompted this topic.
Ensemble Decoding is not implemented yet.
Combining Training is more useful, but not in our case (we don't have the base datasets).
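
For reference, the idea behind Ensemble Decoding can be sketched in plain Python: at each decoding step, the next-token distributions of the general model and the domain model are interpolated, so neither model's knowledge is overwritten. This is only a toy illustration and not an OpenNMT or CTranslate2 API. The function names, the `domain_weight` parameter, and the example probabilities are all hypothetical, and it assumes both models share one vocabulary.

```python
def ensemble_step(general_probs, domain_probs, domain_weight=0.5):
    """Linearly interpolate two next-token distributions.

    Both arguments map token -> probability over the SAME vocabulary
    (an assumption of this sketch). domain_weight is a hypothetical
    knob: 0.0 uses only the general model, 1.0 only the domain model.
    """
    if general_probs.keys() != domain_probs.keys():
        raise ValueError("both models must share one vocabulary")
    if not 0.0 <= domain_weight <= 1.0:
        raise ValueError("domain_weight must be between 0 and 1")
    return {
        token: (1.0 - domain_weight) * general_probs[token]
        + domain_weight * domain_probs[token]
        for token in general_probs
    }


def pick_token(probs):
    """Greedy choice: return the token with the highest probability."""
    return max(probs, key=probs.get)


# Made-up distributions for a single decoding step.
general = {"cold": 0.6, "rhinitis": 0.1, "</s>": 0.3}
domain = {"cold": 0.2, "rhinitis": 0.7, "</s>": 0.1}

combined = ensemble_step(general, domain, domain_weight=0.6)
next_token = pick_token(combined)
```

A real decoder would repeat this at every step inside beam search and would need both models loaded at inference time, which is the main cost of the approach.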

What other type of domain adaptation can you recommend when using CTranslate2 without the base datasets? Maybe someone has come across this.