What would be the optimal way to combine BPE with sequence tagging? I suppose that if we split the input side into smaller parts, we have to copy the target tags over to the BPE parts. Or should we rather use Beginning-Inside-Outside (BIO) tags, as is often done in chunking or NER?
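To make the two options concrete, here is a rough sketch of what I have in mind. It assumes one plain tag per word (POS tags, for example) rather than tags that are already in BIO form. `toy_segment` is a hypothetical stand-in for a real BPE model: it just cuts words into fixed-size chunks and marks non-final pieces with `@@`. The function names `copy_tags` and `bio_tags` are mine as well.

```python
from typing import Callable, List, Tuple

Segmenter = Callable[[str], List[str]]


def toy_segment(word: str, size: int = 3) -> List[str]:
    """Hypothetical stand-in for a BPE model (assumption, not a real BPE).

    Cuts the word into chunks of `size` characters and appends '@@' to
    every piece except the last, to mark that the word continues.
    """
    pieces = [word[i:i + size] for i in range(0, len(word), size)]
    return [p + "@@" for p in pieces[:-1]] + pieces[-1:]


def copy_tags(words: List[str], tags: List[str],
              segment: Segmenter = toy_segment) -> List[Tuple[str, str]]:
    """Option 1: every subword piece inherits the tag of its word."""
    out = []
    for word, tag in zip(words, tags):
        for piece in segment(word):
            out.append((piece, tag))
    return out


def bio_tags(words: List[str], tags: List[str],
             segment: Segmenter = toy_segment) -> List[Tuple[str, str]]:
    """Option 2: first piece of a word gets B-<tag>, later pieces I-<tag>.

    Words tagged 'O' keep 'O' on all of their pieces.
    """
    out = []
    for word, tag in zip(words, tags):
        for i, piece in enumerate(segment(word)):
            if tag == "O":
                out.append((piece, "O"))
            else:
                prefix = "B" if i == 0 else "I"
                out.append((piece, f"{prefix}-{tag}"))
    return out


if __name__ == "__main__":
    words = ["unbelievable", "news"]
    tags = ["ADJ", "NOUN"]
    print(copy_tags(words, tags))
    print(bio_tags(words, tags))
```

With copied tags, the word boundaries are only recoverable from the `@@` markers on the source side, whereas the BIO variant also encodes them in the tag sequence, at the cost of roughly doubling the tag vocabulary.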