Is there a way to combine BPE (byte pair encoding) with a factored model, for instance with a word|lemma|pos combination?
Hello @vincent,
You can use features with a BPE model. In that case, you repeat the “factors” (what we call features) for each subtoken.
For instance, the sentence:
How│WRB can│MD I│PRP encode│VB an│DT audio│JJ file│NN ?│?
becomes the following after BPE and case markup (note that the POS tag of “encode” is repeated on both of its subtokens):
how│C│WRB can│L│MD i│C│PRP en■│L│VB code│L│VB an│L│DT audio│L│JJ file│L│NN ?│N│?
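If it helps, here is a rough Python sketch of that transformation, not the actual OpenNMT tokenizer code. `toy_segment` is a hypothetical stand-in for a real BPE model (it only knows how to split “encode”), and the case codes other than C, L and N (which appear in the example above) are my assumption. The separator and joiner characters are copied from the example as displayed here.

```python
SEP = "│"     # feature separator, as displayed in the example above
JOINER = "■"  # subword joiner marker, as displayed in the example above


def case_feature(text):
    """Return a case code for a (sub)token.

    C, L and N match the example above; U and M are assumed codes
    for all-uppercase and mixed-case tokens.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return "N"  # no letters at all, e.g. punctuation
    if all(ch.islower() for ch in letters):
        return "L"
    if letters[0].isupper() and all(ch.islower() for ch in letters[1:]):
        return "C"  # capitalized, including a single uppercase letter
    if all(ch.isupper() for ch in letters):
        return "U"
    return "M"


def toy_segment(word):
    """Hypothetical BPE stand-in: splits only the word 'encode'."""
    table = {"encode": ["en" + JOINER, "code"]}
    return table.get(word, [word])


def annotate(line, segment):
    """Lowercase and segment each word, repeating its features per subtoken."""
    out = []
    for token in line.split():
        word, *feats = token.split(SEP)
        pos = 0
        # Assumes lowercasing does not change the word's length.
        for piece in segment(word.lower()):
            n = len(piece.replace(JOINER, ""))
            original = word[pos:pos + n]  # same span with original casing
            pos += n
            out.append(SEP.join([piece, case_feature(original), *feats]))
    return " ".join(out)


source = "How│WRB can│MD I│PRP encode│VB an│DT audio│JJ file│NN ?│?"
print(annotate(source, toy_segment))
```

The point is just that the word-level features (here the POS tag) are copied onto every subtoken produced from the word, while the case feature is computed per subtoken.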
Thank you, I’ll try it as soon as my GPUs are free.
Since this thread is from 2017, I was wondering if factored NMT is still supported in the current version of OpenNMT (as of April 2021)?
Thanks.