In general, how should I use BPE when training a Transformer?
My current plan is:
1. Learn a BPE model with an external tool.
2. Tokenize the training data with that BPE model.
3. Run onmt-build-vocab on the tokenized data to build the vocabulary.
4. Pass this vocabulary to onmt-main to train the model.
Is there anything wrong with this approach?
Or do you have an example of how to use BPE with a custom dataset?
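
To make steps 1 and 2 concrete, here is a minimal sketch of what I mean by "learn a BPE model" and "tokenize with it". It is a toy pure-Python version for illustration only, not the code of any particular BPE tool; the function names (learn_bpe, apply_bpe, merge_pair) and the "</w>" end-of-word marker are my own assumptions.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def learn_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a list of whitespace-split sentences."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = Counter()
    for line in corpus:
        for word in line.split():
            vocab[tuple(word) + ("</w>",)] += 1

    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Most frequent pair wins; ties are broken deterministically by the pair itself.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[tuple(merge_pair(list(symbols), best))] += freq
        vocab = new_vocab
    return merges


def apply_bpe(line, merges):
    """Tokenize one sentence by replaying the learned merges in order."""
    tokens = []
    for word in line.split():
        symbols = list(word) + ["</w>"]
        for pair in merges:
            symbols = merge_pair(symbols, pair)
        tokens.extend(symbols)
    return tokens


if __name__ == "__main__":
    train = ["low lower lowest", "new newer newest"]
    merges = learn_bpe(train, num_merges=10)
    print(apply_bpe("lower newest", merges))
```

With a real tool, I would write the tokenized output to files, one sentence per line with tokens separated by spaces, and use those files as the input for steps 3 and 4. I assume the same BPE model then has to be applied to the input text at inference time.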