Limiting number of CPU cores / CPU usage in training

nvr-rug · June 8, 2017, 9:31am

In our project, the models we use should be (re)trainable on CPUs. They have to run on a server platform consisting of 64 CPUs. However, this server is also used by a lot of other members of the project for different tasks, so we can’t just train models on 48 CPUs, for example.

The problem is that whenever I train a model, it uses 40-50 CPUs, regardless of batch size. Even with batch size 1, CPU usage reached 4800% (about 48 cores). Is there a way to limit this within OpenNMT?

I looked into other (Linux) options (cpulimit, cset), but they either do not work or are not exactly what I want. Ideally, I would like some kind of flag that limits CPU usage to X cores. Any help is greatly appreciated.

guillaumekln · June 8, 2017, 9:40am

OMP_NUM_THREADS=4 th train.lua [...]
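
For anyone reusing this: OMP_NUM_THREADS is the standard OpenMP environment variable that caps the number of threads an OpenMP-enabled program starts, and prefixing the command sets it for that one run only. Below is a minimal sketch of wrapping that in a reusable shell function. The name run_with_threads is hypothetical and not part of OpenNMT or Torch, and the sketch assumes the program being launched actually honors OMP_NUM_THREADS.

```shell
# Hypothetical helper (not part of OpenNMT or Torch): run any command with a
# capped OpenMP thread count, without changing the calling shell's environment.
run_with_threads() {
    threads="$1"   # first argument: maximum number of OpenMP threads
    shift          # the remaining arguments are the command to run
    env OMP_NUM_THREADS="$threads" "$@"
}

# Example usage (commented out because it needs Torch installed):
# run_with_threads 4 th train.lua [...]
```

Because the variable is passed through env, it applies only to the launched command, so other jobs started from the same shell keep their own settings.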

nvr-rug · June 8, 2017, 9:46am

Thanks for your super quick reply. This solved the problem!