onmt-build-vocab --size 50000 in OpenNMT-tf produces a vocabulary of size 50,000 including the three additional tokens <blank>, <s>, and </s> (50,000 lines in the output), while
onmt_preprocess --src_vocab_size 50000 in OpenNMT-py seems to produce a vocabulary of size 50,000 excluding additional tokens (special tokens such as <blank> are added on top of the 50,000 entries).
If we want the same vocabulary in both versions, should we use
onmt-build-vocab --size 50003, instead of
onmt-build-vocab --size 50000 in OpenNMT-tf?
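
To make the arithmetic behind the question concrete, here is a minimal sketch. It assumes, as observed above, that --size in OpenNMT-tf counts the three special tokens <blank>, <s>, and </s>, while the OpenNMT-py size does not. The function names are hypothetical helpers for illustration and are not part of either toolkit.

```python
# Assumption: these are the special tokens counted inside --size in OpenNMT-tf.
TF_SPECIAL_TOKENS = ("<blank>", "<s>", "</s>")


def tf_size_for_regular_words(num_regular_words, special_tokens=TF_SPECIAL_TOKENS):
    """Return the --size value needed to keep num_regular_words ordinary tokens,
    assuming --size also counts the special tokens."""
    return num_regular_words + len(special_tokens)


def count_regular_entries(vocab_lines, special_tokens=TF_SPECIAL_TOKENS):
    """Count the vocabulary lines that are not special tokens."""
    specials = set(special_tokens)
    return sum(1 for line in vocab_lines if line.strip() not in specials)


if __name__ == "__main__":
    # 50,000 ordinary tokens plus 3 special tokens.
    print(tf_size_for_regular_words(50000))
```

Running count_regular_entries over the lines of each generated vocabulary file is one way to check whether the two vocabularies really contain the same number of ordinary tokens.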