Out of memory when running learn_bpe.lua

dbenito · July 6, 2017, 2:55am

When I try running learn_bpe.lua on a set of 4.8M sentences, I consistently get one of the following errors:

/home/dbenito/src/torch/install/bin/luajit: not enough memory

or

PANIC: unprotected error in call to Lua API (not enough memory)

Using learn_bpe.py works fine, but I’m interested in testing the ‘both’ (prefix + suffix) setting for bpe_mode, which is why I need the Lua script. Has anybody run into the same issue?

emartinezVic · July 6, 2017, 7:09am

It seems the problem is related to your Lua installation.
Are you using LuaJIT or Lua 5.2?
Can you try reinstalling Torch with Lua 5.2? That may solve your problem.

dbenito · July 6, 2017, 10:22pm

Eva,

Thanks! Reinstalling Torch using Lua 5.2 instead of LuaJIT appears to have solved the issue. It might be worth including that in the installation instructions, since they currently simply refer people to the standard Torch installation guide, which defaults to using LuaJIT.

That said, everything I’ve read about LuaJIT indicates that it is considerably faster than the normal Lua interpreter. Will switching to Lua 5.2 make everything slower? I’m guessing that since the brunt of the work is done on the GPU, it won’t really matter much…

dbenito · July 7, 2017, 2:02am

With Lua 5.2 instead of LuaJIT, learn_bpe.lua is painfully slow. I’m going to see if I can change it to use tds.Vec instead of a built-in table, since that should work around the 2GB limit in LuaJIT. If I get it to work, I’ll send a PR; if not, I’ll just open an issue.
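
For readers unfamiliar with tds, here is a minimal sketch of the idea, not the actual learn_bpe.lua change. It relies on the fact that tds containers are allocated in C, outside the heap that LuaJIT manages, so they are not counted against its memory limit. The helpers new_list, append and collect_tokens are hypothetical names made up for this illustration, and the snippet falls back to a plain table when tds is not installed so that it still runs.

```lua
-- Sketch only: store tokens in a tds.Vec when tds is available.
-- Without tds, fall back to a plain Lua table so the snippet still runs.
local has_tds, tds = pcall(require, 'tds')

-- Hypothetical helper: create an empty growable list.
local function new_list()
  if has_tds then
    return tds.Vec()
  end
  return {}
end

-- Hypothetical helper: append one value to a list made by new_list().
local function append(list, value)
  if has_tds then
    list:insert(value)
  else
    table.insert(list, value)
  end
end

-- Split each sentence on whitespace and collect every token in one list.
local function collect_tokens(sentences)
  local tokens = new_list()
  for _, sentence in ipairs(sentences) do
    for token in sentence:gmatch('%S+') do
      append(tokens, token)
    end
  end
  return tokens
end

local tokens = collect_tokens({ 'low lower lowest', 'new newer' })
print(#tokens, tokens[1], tokens[#tokens])
```

Note that tds containers only hold numbers, strings, booleans, tensors and other tds containers, so a nested plain table would have to become a nested tds.Vec or tds.Hash.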

guillaumekln · July 7, 2017, 7:12am

This is actually covered in the dedicated documentation:

Use Lua 5.2 if you encounter any memory issue while using learn_bpe.lua (e.g. -size is too big). Otherwise, stay with Lua 5.1 for better efficiency.

http://opennmt.net/OpenNMT/tools/tokenization/#bpe

This is correct. It’s mostly the plain Lua scripts that will be slower. However, you can easily install both Lua versions in parallel and switch as needed.

That would be great.

dbenito · July 7, 2017, 1:16pm

In general, do you recommend using Lua 5.1 rather than LuaJIT? If so, then that should really be clearly stated in the Installation section of the docs (which should also be updated to reflect that luarocks install bit32 is also required).

guillaumekln · July 7, 2017, 1:20pm

Sorry for the confusion: by Lua 5.1 we actually meant LuaJIT (which is based on Lua 5.1).

Also, the documentation covers the bit32 package:

For LuaJIT users, tokenization tools require the bit32 package.

http://opennmt.net/OpenNMT/tools/tokenization/

dbenito · July 7, 2017, 2:49pm

Switching from using tables to tds.Hash and tds.Vec solves the 2GB issue in LuaJIT, but the resulting code is just as slow as with Lua 5.2 - roughly an order of magnitude slower than LuaJIT (when it can fit everything in under 2GB).

It seems as though iterating through a tds.Hash (with either pairs or ipairs) is not particularly fast. Any ideas about how to make this faster?

guillaumekln · July 7, 2017, 9:01pm

Interesting. Do you have a branch on GitHub with these changes?

dbenito · July 8, 2017, 4:29pm

Not yet - I’ve been playing around with the code a lot to try to speed it up, and it’s currently quite messy (and not quite functional). I’ll clean it up and push it to a branch on my GitHub fork, and will let you know once it’s available.

Update: I think I may have found some middle ground, changing only some tables to either tds.Vec or tds.Hash (depending on how they’re addressed). For smaller corpora where the original learn_bpe.lua worked OK under LuaJIT, my updated version is only marginally slower, and for larger corpora where LuaJIT crashed, it seems to work OK now.

The key was to keep stats as a table, since the code iterates through it over and over (and iterating over a tds.Hash appears to be very slow).
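
Roughly, the split looks like the sketch below. This is an illustration under assumptions, not the code from my branch: the names vocab, count_pairs and best_pair are made up for the example, the input is a toy vocabulary, and the snippet falls back to a plain table when tds is not installed. The large, rarely scanned data goes into a tds.Hash, while the pair counts that are scanned on every merge step stay in an ordinary Lua table.

```lua
-- Sketch only: hybrid storage with tds for bulk data and a table for hot data.
local has_tds, tds = pcall(require, 'tds')

-- Bulk data (word -> frequency): a tds.Hash when available, else a table.
local vocab = has_tds and tds.Hash() or {}
vocab['l o w'] = 5
vocab['l o w e r'] = 2
vocab['n e w'] = 6

-- Hot data (pair -> count) is kept as a plain Lua table, because it is
-- scanned again on every merge iteration.
local function count_pairs(words)
  local stats = {}
  for word, freq in pairs(words) do
    local prev
    for symbol in word:gmatch('%S+') do
      if prev then
        local pair = prev .. ' ' .. symbol
        stats[pair] = (stats[pair] or 0) + freq
      end
      prev = symbol
    end
  end
  return stats
end

-- Scan stats for the most frequent pair. Ties are broken by string order
-- so the result does not depend on table iteration order.
local function best_pair(stats)
  local best, best_count
  for pair, count in pairs(stats) do
    if not best or count > best_count
       or (count == best_count and pair < best) then
      best, best_count = pair, count
    end
  end
  return best, best_count
end

local stats = count_pairs(vocab)
local pair, count = best_pair(stats)
print(pair, count)
```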

I’m going to run some more tests and, if all goes well, will push the updated version to a branch on my fork of the repo and will submit a PR.

dbenito · July 9, 2017, 8:24am

Done! I’ve tested with various sets of input data and sizes, and confirmed that it always produces the same results as the old version under Lua 5.1 - except that it runs about 10 times faster (and pretty much as fast as the old version under LuaJIT). I’ve submitted a pull request too.

jean.senellart · July 11, 2017, 6:42am

Thanks Daniel for the PR - the difference in speed is amazing and worth investigating further.