It would be nice to have a multi-threading option in tokenize.lua and detokenize.lua for faster tokenization of large documents.
Done in tokenization with the option:
- -nparallel: Number of parallel threads to run the tokenization
With 3 parallel workers on my laptop, the speed-up is up to 2x.
The same option is available in detokenization, but the speed-up there is small since detokenization is not very CPU-intensive.
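
For readers curious about the general idea, here is a minimal sketch of chunked parallel tokenization that keeps output lines in input order. It is not the code in tokenize.lua. The names `tokenize_line`, `tokenize_chunk`, `parallel_tokenize` and the `chunk_size` parameter are hypothetical, and the whitespace tokenizer is only a stand-in for the real one.

```python
from concurrent.futures import ThreadPoolExecutor


def tokenize_line(line):
    # Hypothetical stand-in for a real tokenizer: normalize whitespace only.
    return " ".join(line.split())


def tokenize_chunk(lines):
    # Each worker handles one chunk of consecutive lines.
    return [tokenize_line(line) for line in lines]


def parallel_tokenize(lines, nparallel=3, chunk_size=1000):
    # Cut the input into chunks of consecutive lines (chunk_size is an assumption).
    chunks = [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]
    with ThreadPoolExecutor(max_workers=nparallel) as pool:
        # map() yields results in submission order, so output stays aligned with input.
        results = pool.map(tokenize_chunk, chunks)
        return [tok for chunk in results for tok in chunk]
```

Note that in CPython, threads do not speed up CPU-bound work because of the global interpreter lock, so this only illustrates the structure (split, process, reassemble in order). A process pool would be needed for a real gain. The speed-up also stays below the worker count because reading, chunking, and writing remain sequential, which would be consistent with the roughly 2x seen with 3 workers.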