Is it possible to do both: POS tagging (as a feature) and BPE/SentencePiece?
If yes, how do we do it, and does it yield positive results?
My understanding is that the feature file needs to have the same number of tokens as the training file. Yet the words in the training file are broken down into smaller tokens by BPE/SentencePiece. So at first glance, it seems impossible to do both at the same time… unless we repeat the POS tag on each piece of the word?
I’m not even sure that would be useful… but if anyone has insight into whether it’s doable and how to do it, it would be really appreciated.
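
To make the "repeat the POS tag" idea concrete, here is a rough sketch of what I have in mind. It assumes the sentence has already been segmented SentencePiece-style, where a leading `▁` marks the first piece of each word, and that there is one POS tag per original word. `expand_tags` is just a hypothetical helper name for illustration, not part of any toolkit, and I don't know whether this is the recommended approach.

```python
def expand_tags(pieces, word_tags, marker="\u2581"):
    """Repeat each word-level tag over all subword pieces of that word.

    pieces:    subword tokens, where `marker` (the SentencePiece "▁")
               prefixes the first piece of every word (assumption).
    word_tags: one tag per original, unsegmented word.
    """
    expanded = []
    word_idx = -1
    for piece in pieces:
        # A piece starting with the marker opens a new word; the very
        # first piece always does, even if the marker is missing.
        if piece.startswith(marker) or word_idx < 0:
            word_idx += 1
        if word_idx >= len(word_tags):
            raise ValueError("more words than tags")
        expanded.append(word_tags[word_idx])
    if word_idx + 1 != len(word_tags):
        raise ValueError("fewer words than tags")
    return expanded


pieces = ["▁The", "▁un", "believ", "able", "▁story"]
tags = ["DET", "ADJ", "NOUN"]
print(expand_tags(pieces, tags))
# ['DET', 'ADJ', 'ADJ', 'ADJ', 'NOUN']
```

The output has exactly one tag per subword piece, so the feature file and the segmented training file would line up token for token. For BPE output that marks continuations with a suffix such as `@@` instead, the word-boundary test would need to be adapted.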