Choosing between OpenNMT, OpenNMT-py, and OpenNMT-tf for Domain Adaptation

Hi!

I need to do domain adaptation (incremental training): train first on a larger dataset, and then continue on a smaller, more domain-specific dataset.

I was wondering which is the better (more complete) framework for this task: OpenNMT, OpenNMT-py, or OpenNMT-tf?

From what I have seen, OpenNMT appears to be no longer under development (the GitHub page says so). OpenNMT-py does not support some options, such as the source / target vocabularies needed for domain-specific adaptation (I read that in a forum post here). So it appears that one should go with OpenNMT-tf for this task.

Am I correct? Can someone kindly shed more light on this?

Thanks!

Hi,

  • If you need to inject new domain-specific tokens into the vocabulary, then you should use OpenNMT-tf, which supports vocabulary update and replacement.
  • If you just need to continue training on new data, possibly with other optimization settings, all versions can do that.
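
To give a rough idea of what "vocabulary update and replacement" means for a trained model, here is a small conceptual sketch in plain Python. It is not OpenNMT-tf's actual implementation or API: the function names (merge_vocab, remap_embeddings), the list-of-lists embedding table, and the uniform random initialization are all assumptions made for illustration. The idea is that tokens shared between the old and new vocabularies keep their trained embedding rows, while new domain tokens get freshly initialized rows.

```python
import random


def merge_vocab(old_vocab, new_tokens):
    """Update mode (hypothetical helper): keep every old token and
    append the new domain tokens that are not already present."""
    seen = set(old_vocab)
    merged = list(old_vocab)
    for tok in new_tokens:
        if tok not in seen:
            seen.add(tok)
            merged.append(tok)
    return merged


def remap_embeddings(old_vocab, old_embeddings, new_vocab, seed=0):
    """Build an embedding table aligned with new_vocab (hypothetical helper).

    Rows of tokens found in old_vocab are copied; rows of unseen tokens
    are initialized with small random values (an assumed init scheme).
    """
    rng = random.Random(seed)
    dim = len(old_embeddings[0])
    old_index = {tok: i for i, tok in enumerate(old_vocab)}
    rows = []
    for tok in new_vocab:
        if tok in old_index:
            rows.append(list(old_embeddings[old_index[tok]]))
        else:
            rows.append([rng.uniform(-0.1, 0.1) for _ in range(dim)])
    return rows


# Toy data: a 2-token vocabulary with 2-dimensional embeddings.
old_vocab = ["the", "cat"]
old_embeddings = [[1.0, 2.0], [3.0, 4.0]]

# Update: old tokens stay in place, "stent" is appended.
updated_vocab = merge_vocab(old_vocab, ["cat", "stent"])
updated_rows = remap_embeddings(old_vocab, old_embeddings, updated_vocab)

# Replacement: the new vocabulary is used as-is; only "cat" is reused.
replaced_vocab = ["stent", "cat"]
replaced_rows = remap_embeddings(old_vocab, old_embeddings, replaced_vocab)
```

With plain continued training (the second bullet), none of this remapping is needed, since the vocabulary and the embedding table stay the same.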

Thank you for the insight - I am now trying to set up OpenNMT-tf.