Is it possible to use a vocabulary from a dictionary separate from the training data, and then train with separate training data?
Hey @leokonst
Yes, both OpenNMT-py and OpenNMT-tf can be given pre-existing vocabs to build from.
For OpenNMT-py you can have a look at the -src_vocab / -tgt_vocab flags.
For OpenNMT-tf you set source_vocabulary and target_vocabulary in the config.
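If your dictionary is just a list of words, something like the rough sketch below could turn it into a vocabulary file to pass to those options. It assumes the plain-text format with one token per line; the function name `write_vocab` and the file name are made up for illustration, and depending on the toolkit and version you may also need special tokens in the file, so check the docs for your setup.

```python
from pathlib import Path


def write_vocab(words, path):
    """Write unique, non-empty tokens to `path`, one per line, in first-seen order."""
    seen = set()
    tokens = []
    for word in words:
        token = word.strip()
        # Skip blanks and duplicates so every line is a distinct token.
        if token and token not in seen:
            seen.add(token)
            tokens.append(token)
    # Assumed format: UTF-8 text, one token per line.
    Path(path).write_text("\n".join(tokens) + "\n", encoding="utf-8")
    return tokens
```

Calling `write_vocab(my_dictionary_words, "src-vocab.txt")` (a hypothetical file name) gives you a file you can point the vocabulary options at. Words in the training data that are not in the file will be treated as unknown tokens.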
Small tip if you’re new to NMT: subword methods (SentencePiece/BPE) usually work better than word-level vocabs.
Definitely. That’s a tip worth following 