Domain Adaptation can be useful when you want to train a Machine Translation model, but you have only too limited data.
There are several approaches of Domain Adaptation including:
- Incremental Training / Re-training: So you have a big pre-trained model trained on a big corpus, and you continue training it with the new data from the small corpus.
- Ensemble Decoding (of two models): You have two models and you use both models during translation.
- Combining Training Data: You merge the two corpora and train one model on the whole combined data.
- Data Weighting: You give higher weights for specialized segments over generic segments.
In this tutorial, I explained how to apply these techniques and the best practices:
If you have questions or suggestions, please let me know.