Choosing between OpenNMT, OpenNMT-py, and OpenNMT-tf for Domain Adaptation

add · February 11, 2019, 2:31pm

Hi!

I need to do domain adaptation (incremental training): train first on a larger dataset, and then continue on a smaller, more domain-specific dataset.

I was wondering which framework is better (more complete) for this task: OpenNMT, OpenNMT-py, or OpenNMT-tf?

From what I have seen, OpenNMT appears to be no longer under development (the GitHub page says so). OpenNMT-py does not support some options, such as the source/target vocabulary handling needed for domain-specific adaptation (I read that in a forum post here). So it appears that one should go with OpenNMT-tf for this task.

Am I correct? Could someone kindly shed more light on this?

Thanks!

guillaumekln · February 11, 2019, 2:48pm

Hi,

If you need to inject new domain-specific tokens into the vocabulary, then you should use OpenNMT-tf, which supports vocabulary update and replacement.
If you just need to continue training on new data, possibly with other optimization settings, all versions can do that.
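
To make the difference between vocabulary "update" (merge) and "replacement" concrete, here is a rough Python sketch of what such an operation typically does to an embedding table. This is only an illustration of the idea, not OpenNMT-tf's actual code: the function `update_vocab`, its arguments, and the random initialization range are all hypothetical choices made for this example.

```python
import random


def update_vocab(old_vocab, old_embeddings, new_tokens, mode="merge", seed=0):
    """Hypothetical helper: build a new vocabulary and embedding table.

    Rows for tokens that already existed are copied from the old table,
    so what the model learned about them is kept. Rows for unseen tokens
    are randomly initialized (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    dim = len(old_embeddings[0])
    index = {token: i for i, token in enumerate(old_vocab)}

    if mode == "merge":
        # Keep every old token and append the new ones not seen before.
        vocab = list(old_vocab)
        seen = set(index)
        for token in new_tokens:
            if token not in seen:
                seen.add(token)
                vocab.append(token)
    elif mode == "replace":
        # Use only the new vocabulary; old-only tokens are dropped.
        vocab = list(new_tokens)
    else:
        raise ValueError("mode must be 'merge' or 'replace'")

    embeddings = []
    for token in vocab:
        if token in index:
            # Known token: reuse the trained embedding row.
            embeddings.append(list(old_embeddings[index[token]]))
        else:
            # New domain-specific token: start from a small random vector.
            embeddings.append([rng.uniform(-0.1, 0.1) for _ in range(dim)])
    return vocab, embeddings


# Toy example: a 2-token general vocabulary plus one new domain term.
general_vocab = ["the", "cat"]
general_emb = [[1.0, 2.0], [3.0, 4.0]]
domain_tokens = ["cat", "stent"]

merged_vocab, merged_emb = update_vocab(general_vocab, general_emb, domain_tokens, "merge")
replaced_vocab, replaced_emb = update_vocab(general_vocab, general_emb, domain_tokens, "replace")
```

In both modes the row for "cat" is carried over unchanged, which is why training can continue from the existing checkpoint; only the new token "stent" has to be learned from the domain data.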

add · February 12, 2019, 5:30am

Thank you for the insight - I am now trying to set up OpenNMT-tf.