OpenNMT Forum

Combining BPE and factored model

Is there a way to combine BPE (byte pair encoding) with a factored model, for instance with a word|lemma|pos combination?

Hello @vincent,

You can use features with a BPE model. In that case, you repeat the “factors” (what we call features) on each subtoken.

For instance, the sentence:

How│WRB can│MD I│PRP encode│VB an│DT audio│JJ file│NN ?│?

becomes, after BPE and case markup:

how│C│WRB can│L│MD i│C│PRP en■│L│VB code│L│VB an│L│DT audio│L│JJ file│L│NN ?│N│?
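
If it helps, here is a rough Python sketch of that transformation. It is not the OpenNMT tokenizer itself: `toy_segment` is a hypothetical stand-in for a trained BPE model, `case_feature` and `expand` are names made up for illustration, and I am assuming that non-initial subtokens always get the `L` case value. The `│` separator and `■` joiner are copied from the example above.

```python
SEP = "│"      # feature separator, as written in the example above
JOINER = "■"   # marks a subtoken that attaches to the following one


def case_feature(word):
    """Return C (capitalized), L (lowercase) or N (no letters at all)."""
    if not any(ch.isalpha() for ch in word):
        return "N"
    return "C" if word[0].isupper() else "L"


def toy_segment(word):
    """Hypothetical stand-in for a trained BPE model (one hard-coded split)."""
    splits = {"encode": ["en", "code"]}
    return splits.get(word, [word])


def expand(line):
    """Lowercase each word, split it into subtokens, and repeat its features."""
    out = []
    for token in line.split():
        word, *features = token.split(SEP)
        pieces = toy_segment(word.lower())
        for i, piece in enumerate(pieces):
            if i < len(pieces) - 1:
                piece += JOINER
            # Assumption: only the first subtoken carries the word's case.
            case = case_feature(word) if i == 0 else "L"
            out.append(SEP.join([piece, case, *features]))
    return " ".join(out)


if __name__ == "__main__":
    print(expand("How│WRB can│MD I│PRP encode│VB an│DT audio│JJ file│NN ?│?"))
```

The point is only that every subtoken carries the same number of features, so `en■` and `code` both inherit `VB` from `encode`.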

Thank you. I’ll try it as soon as my GPUs are free :)

Since this thread is from 2017, I was wondering if factored NMT is still supported in the current version of OpenNMT (as of April 2021)?