OpenNMT Forum

Is there any way to fine-tune pretrained language models on OpenNMT?

Right now the state of the art for NLP comes from fine-tuning huge pretrained language models like
BERT, RoBERTa, XLNet, and ERNIE 2.0.
Is there any way to fine-tune a custom model by loading the pretrained models?

Hey lockder!
What do you mean by fine-tuning a language model? OpenNMT is a seq2seq framework, so fine-tuning isn't as straightforward as for other NLP tasks, and the discussion about BERT dozed off. You could use the contextual embeddings as an additional input feature, or use the language model for ranking and data augmentation.
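To make the ranking idea concrete, here is a minimal sketch of reranking an n-best list with a language model score. It is not an OpenNMT feature: `rerank`, `toy_lm_score`, and `lm_weight` are hypothetical names, and the toy scorer only stands in for a real pretrained LM that would return a log-probability.

```python
from typing import Callable, List, Tuple


def rerank(
    nbest: List[Tuple[str, float]],
    lm_score: Callable[[str], float],
    lm_weight: float = 0.5,
) -> List[Tuple[str, float]]:
    """Sort hypotheses by translation score plus a weighted LM score.

    nbest holds (hypothesis, translation-model log-prob) pairs.
    """
    combined = [(hyp, tm + lm_weight * lm_score(hyp)) for hyp, tm in nbest]
    return sorted(combined, key=lambda pair: pair[1], reverse=True)


def toy_lm_score(sentence: str) -> float:
    # Assumption: a stand-in for a real LM. It only penalizes
    # immediately repeated tokens, a common NMT failure mode.
    tokens = sentence.split()
    repeats = sum(1 for a, b in zip(tokens, tokens[1:]) if a == b)
    return -2.0 * repeats


nbest = [("the the cat sat", -1.0), ("the cat sat", -1.2)]
best = rerank(nbest, toy_lm_score)[0][0]
print(best)
```

With a real pretrained LM you would swap `toy_lm_score` for a function returning the sentence log-probability and tune `lm_weight` on a dev set.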

There is also MASS pretraining for seq2seq models. We could try to fine-tune those pretrained models with OpenNMT-py.

Thanks for the links.
I mean that right now, as you can read in the papers, BERT (or any other contextual pretrained model) can also be used for text classification or NER, so you have to retrain the model and fine-tune it for your current objective. This means loading the model and then training it again for your current objective.

Right now Google has published a better pretrained contextual language model.

I think that ALBERT is only trained for English. PyTorch released the pretrained XLM-R last week, which was initially tested with (unsupervised) NMT. But I don't know if the model definition is usable in OpenNMT.

There is a general paper on pretraining for seq2seq tasks. In most NMT cases, using the pretrained encoder gives good results. To initialize the decoder, we could try the copied monolingual data.

In which language pairs and domains are you interested?

Well, right now I'm doing English and Spanish, but I'm focused on text classification and named entity recognition. Maybe by replicating the model we can load the pretrained weights directly? I'm not sure that's a solution.
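A rough sketch of what "replicate the model and load the weights" could look like: keep only the pretrained parameters whose name and shape match the replicated model, and leave the rest (for example a new classification head) freshly initialized. Everything here is an assumption for illustration: `match_pretrained`, the `encoder.` prefix, and the parameter names and shapes are made up, not taken from OpenNMT or any real checkpoint.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def match_pretrained(
    pretrained_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
    prefix: str = "",
) -> Tuple[Dict[str, str], List[str]]:
    """Map pretrained parameter names to model parameter names.

    A parameter is loadable only if prefix + name exists in the model
    with exactly the same shape; everything else is reported as skipped.
    """
    loadable: Dict[str, str] = {}
    skipped: List[str] = []
    for name, shape in pretrained_shapes.items():
        target = prefix + name
        if model_shapes.get(target) == shape:
            loadable[name] = target
        else:
            skipped.append(name)
    return loadable, skipped


# Hypothetical shapes: a pretrained LM and a replicated model with a new head.
pretrained = {
    "embeddings.weight": (30000, 768),
    "layer.0.attn.weight": (768, 768),
    "pooler.weight": (768, 768),
}
model = {
    "encoder.embeddings.weight": (30000, 768),
    "encoder.layer.0.attn.weight": (768, 768),
    "classifier.weight": (5, 768),
}

loadable, skipped = match_pretrained(pretrained, model, prefix="encoder.")
print(loadable)
print(skipped)
```

In PyTorch the shapes can be read from the tensors in each `state_dict()`, and the renamed tensors can then be passed to `load_state_dict(..., strict=False)` so that unmatched parameters such as the classifier keep their initial values.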