OpenNMT Forum

Add multi-threading in tokenization

It would be nice to have a multi-threading option in tokenize.lua and detokenize.lua for faster tokenization of large documents.

Done. Tokenization now has the option:

  • -nparallel: Number of parallel threads used to run the tokenization

With 3 parallel workers on my laptop, the speed-up is up to 2x.

The same option is available in detokenization, but the speed-up there is small since detokenization is not very CPU intensive.
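
For readers wondering what an option like -nparallel does in principle, here is a minimal sketch of the general idea in Python: split the input lines into contiguous chunks, tokenize each chunk in a separate worker, and join the results in the original order. This is not the tokenize.lua implementation. The function names (`tokenize_line`, `split_chunks`, `tokenize_parallel`), the `executor_cls` parameter, and the toy regex tokenizer are all made up for illustration.

```python
import re
from concurrent.futures import ProcessPoolExecutor

# Toy rule (an assumption, not OpenNMT's tokenizer): a token is either a run
# of word characters or a single punctuation character.
_TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize_line(line):
    """Separate words from punctuation and join the tokens with spaces."""
    return " ".join(_TOKEN_RE.findall(line))


def tokenize_chunk(lines):
    """Tokenize a list of lines sequentially; this is the unit of work per worker."""
    return [tokenize_line(line) for line in lines]


def split_chunks(lines, nparallel):
    """Split lines into at most nparallel contiguous chunks of similar size."""
    size = max(1, -(-len(lines) // nparallel))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def tokenize_parallel(lines, nparallel=1, executor_cls=ProcessPoolExecutor):
    """Tokenize lines with nparallel workers, keeping the input order."""
    if nparallel <= 1 or len(lines) < 2:
        return tokenize_chunk(lines)
    chunks = split_chunks(lines, nparallel)
    with executor_cls(max_workers=nparallel) as pool:
        # Executor.map yields results in the order the chunks were submitted,
        # so the output lines line up with the input lines.
        results = pool.map(tokenize_chunk, chunks)
        return [tokens for chunk in results for tokens in chunk]
```

In a sketch like this, reading the input, splitting it, and collecting the output still happen sequentially, and each worker has start-up and data-transfer costs. That is one plausible reason why 3 workers give less than a 3x speed-up, and why a cheap step such as detokenization gains little.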