I’d like to train a translation model for a specific domain. I have a large collection of out-of-domain parallel data from various sources (~20M pairs), but only a small amount of noisy in-domain data (~100k). I understand that a common practice in domain adaptation for NMT is “fine-tuning” an out-of-domain model on the in-domain data. However, since my in-domain data is of low quality, is there a good alternative for adapting my NMT model?