I would like to consult you about a specific use case of domain adaptation. Our standard approach for domain adaptation is to finetune generic models, which works quite well. However, we would like our engines to further differentiate between two specific subdomains. Creating additional finetuned models for those subdomains would not be very efficient, so we are considering other approaches, such as using tags in the data to make the model aware of the subdomain when there is one. In that case, I assume we would then need to inject those custom tags at inference time when the subdomain is known.
Has anyone tried this approach with OpenNMT-tf? Would adding a custom tag to the start of the examples work, or should we consider something more aggressive? And, in the cases where no tags are added at inference time, would the quality be impacted (given that the model has been partially trained with tags)?
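For reference, this is roughly what we have in mind: a preprocessing step that prepends a subdomain tag as the first source token, applied both at training time and at inference time, and leaving the sentence untouched when no subdomain is known. The tag names, the subdomain keys, and the use of the ｟…｠ placeholder delimiters (so the tag survives BPE as a single protected token) are just our assumptions, not a tested recipe:

```python
# Hypothetical sketch of subdomain tagging; tag names and subdomain
# keys are placeholders, not part of any existing pipeline.
from typing import Optional

# Map each known subdomain to a single reserved token. The ｟…｠ delimiters
# are meant to keep the tag protected from BPE/tokenisation (assumption).
SUBDOMAIN_TAGS = {
    "subdomain_a": "｟sub_a｠",
    "subdomain_b": "｟sub_b｠",
}

def tag_source(line: str, subdomain: Optional[str]) -> str:
    """Prepend the subdomain tag when known; return the line unchanged otherwise."""
    tag = SUBDOMAIN_TAGS.get(subdomain) if subdomain else None
    return f"{tag} {line}" if tag else line
```

At inference, the same function would run on the source side before translation, with `subdomain=None` when the subdomain is unknown, so tagged and untagged inputs go through the same code path.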
At the moment, we are successfully using a TransformerBig model with aggressive pretokenisation and BPE. In-domain volumes vary from 400k to 800k sentence pairs. For decoding, we use CTranslate2 engines.
I guess this requires experimentation above all, but any hint on how to start or where to look will be greatly appreciated!