tokenize.lua needs tools.utils.unicode but can't find it

Hi all!

I just succeeded in installing OpenNMT and ran the small test from the guide.

Now, I want to try with my own data.

tokenize.lua failed with this error:

/home/lm-dev8/torch/install/bin/luajit: /home/lm-dev8/torch/install/share/lua/5.1/trepl/init.lua:389: module 'tools.utils.unicode' not found:No LuaRocks module found for tools.utils.unicode
no field package.preload['tools.utils.unicode']
no file '/home/lm-dev8/.luarocks/share/lua/5.1/tools/utils/unicode.lua'

I tried Lua51 and LuaJIT.

I tried installing a few packages with ‘luarocks install’, but didn’t find the right one.

Searching Google and this forum with a few keywords didn’t turn up a solution (or I don’t know how to phrase the question).

What should I install?

Thanks for your help.

Best regards,



You should invoke the tokenization script from the OpenNMT main directory like this:

th tools/tokenize.lua OPTIONS < file > file.tok

We don’t support invoking scripts from an arbitrary directory yet.
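
For context, a likely explanation (an assumption on my part, not something confirmed in this thread): Lua resolves require 'tools.utils.unicode' by turning the dots into directory separators and trying each entry of its search path, and the standard Lua 5.1 search path includes "./?.lua", which is relative to the current directory. The sketch below only mimics that name-to-path mapping so you can check where you are; module_to_path and check_module are hypothetical helpers, not part of OpenNMT or Torch.

```shell
#!/bin/sh
# Illustration only: module_to_path and check_module are hypothetical helpers,
# not part of OpenNMT or Torch.

# Convert a dotted Lua module name into the relative file path that a
# "./?.lua" search-path entry would try.
module_to_path() {
  printf './%s.lua\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Report whether that file exists relative to the current directory.
check_module() {
  path=$(module_to_path "$1")
  if [ -f "$path" ]; then
    echo "found: $path"
  else
    echo "missing: $path (current directory: $(pwd))"
  fi
}

check_module tools.utils.unicode
```

Run from the OpenNMT main directory, this should report the file as found; run from anywhere else, it should report it as missing, which would match the error above.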

OK!

Many thanks!


I needed to do ‘luarocks install bit32’ to get tokenizer.lua working.

