Once the parallel data has been extracted for the subword model (rsennrich/subword-nmt: Unsupervised Word Segmentation for Neural Machine Translation and Text Generation), how do I add it to the configuration file and build the vocabulary?
Next, how do I translate the test data? Should the test data also be segmented into subwords before translation, and how do I restore the original words after translation? Is this the right command?
sed -r 's/(@@ )|(@@ ?$)//g'
Please confirm whether this BPE step is correct.
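To make the question concrete, here is a minimal Python sketch of what I understand the sed expression to do. It assumes the segmented output marks word-internal splits with "@@", as in the sed pattern above; the function name `restore_bpe` is my own and is not part of subword-nmt.

```python
import re

# Same pattern as the sed expression: "@@ " inside a line,
# or a trailing "@@" optionally followed by one space at the end of the line.
BPE_JOINER = re.compile(r"(@@ )|(@@ ?$)")


def restore_bpe(line: str) -> str:
    """Merge subword pieces back into whole words for one line of output."""
    # Drop the newline first so "$" anchors at the true end of the line,
    # which is how sed sees each line.
    return BPE_JOINER.sub("", line.rstrip("\n"))


if __name__ == "__main__":
    print(restore_bpe("the new@@ est mod@@ el"))
```

If this matches the intended post-processing, each translated line would be passed through `restore_bpe` before scoring against the unsegmented reference.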