In our project, the models we use need to be (re)trainable on CPUs. They have to run on a shared server with 64 CPUs. However, this server is also used by many other project members for different tasks, so we can't just let training take over 48 CPUs, for example.
My problem is that whenever I train a model, it uses 40-50 CPUs, regardless of batch size. Even with batch size 1, CPU usage reached 4800% (that is, about 48 cores fully loaded). Is there a way to limit this within OpenNMT?
I looked into other (Linux) options such as cpulimit and cset, but they either do not work or are not exactly what I want. Ideally, I would like some kind of flag that limits CPU usage to X cores. Any help is greatly appreciated.
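
To make "limit CPU usage to X cores" concrete, here is a rough sketch of the behaviour I am after, written with the Python standard library only. It is not an OpenNMT option: `limit_cpu_cores` is a hypothetical helper name, and I am assuming the training code runs on Linux in a Python process whose numerical backend uses OpenMP and therefore honours `OMP_NUM_THREADS`. I have not confirmed that this holds for OpenNMT.

```python
import os


def limit_cpu_cores(max_cores):
    """Hypothetical helper: restrict the calling process to at most max_cores cores.

    Should be called at the very start of the program, before any numerical
    library is imported or starts its worker threads.
    """
    if max_cores < 1:
        raise ValueError("max_cores must be at least 1")

    # Cap OpenMP thread pools (assumption: the backend reads this variable).
    os.environ["OMP_NUM_THREADS"] = str(max_cores)

    # CPU affinity is only exposed on some platforms (Linux in particular).
    if not hasattr(os, "sched_setaffinity"):
        return None

    # Pick the first max_cores cores from those we are currently allowed to use.
    allowed = sorted(os.sched_getaffinity(0))
    chosen = set(allowed[:max_cores])

    # Pin the caller; threads and child processes created later inherit this.
    os.sched_setaffinity(0, chosen)
    return chosen


if __name__ == "__main__":
    cores = limit_cpu_cores(8)
    print("Pinned to cores:", cores)
```

Something with this effect, exposed as a training flag, is what I am looking for.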