There were discussions about improving the -gpuid option, as it can be confusing, especially when using multiple GPUs. See for example:
What policy should we adopt?
My opinion is to convert this option to a boolean flag: simply -gpu, for example.
- When you have a single GPU (the most frequent use case), -gpuid already acts as a boolean.
- When you have multiple GPUs, you generally want to use CUDA_VISIBLE_DEVICES so that memory is not allocated on every GPU. Then you usually set -gpuid 1 and it also acts as a boolean.
Should we go this way? Obviously it is a breaking change, but we still allow that under semantic versioning.
Hmm, I’m not sure I like this.
Do we need to break -gpuid? It only allocates a very small amount of memory on the other GPUs. My group has been fine using -gpuid 2, etc. I know CUDA_VISIBLE_DEVICES is better, but do we need to break the current option?
Or should we drop the use of CUDA_VISIBLE_DEVICES and support a list of comma-separated identifiers in -gpuid, as I proposed in the issue?
That being said, not using CUDA_VISIBLE_DEVICES also spams the output of nvidia-smi on multi-GPU servers. Not so nice…
@jean.senellart What do you think?
the amount is not so small - it is about 250MB, which means that with 8 processes on 8 GPUs we are wasting 2GB of memory, which is too high.
I was about to say that we can hide CUDA_VISIBLE_DEVICES by using torch.setenv before loading cutorch, but there is something new, probably connected to the automatic use of THC_ALLOCATOR in Torch: the default memory grab no longer happens on recent versions of Torch. We see a single ~297MB allocation, only on the GPU we use, and only at the first memory allocation. @guillaumekln, can you double check?
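For reference, the torch.setenv idea mentioned above would look roughly like this. This is an untested sketch: torch.setenv is the call named in the comment, and the device id is an arbitrary example.

```lua
-- Sketch: hide the other GPUs from CUDA before cutorch is loaded,
-- so only the selected device gets initialized.
require('torch')
torch.setenv('CUDA_VISIBLE_DEVICES', '1')
require('cutorch')
-- From here on, cutorch only sees (and allocates on) the selected GPU.
```

The key constraint is ordering: the environment variable must be set before cutorch initializes the CUDA context, otherwise it has no effect.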
In that case, let us drop CUDA_VISIBLE_DEVICES completely, but also -nparallel, and extend the -gpuid syntax to -gpuid ID1,ID2,...,IDn. There is a little bit of work to do in Parallel.lua though.
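The parsing side of that extension is simple. A minimal sketch (the function name parseGpuid is illustrative, not the actual Parallel.lua API):

```lua
-- Split a "-gpuid ID1,ID2,...,IDn" value into a list of numeric
-- device identifiers, e.g. "1,2,4" -> {1, 2, 4}.
local function parseGpuid(str)
  local ids = {}
  for id in string.gmatch(str, '[^,]+') do
    table.insert(ids, tonumber(id))
  end
  return ids
end
```

Parallel.lua would then iterate over this list to spawn one worker per device.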
Indeed, recent Torch versions no longer allocate memory on every GPU, with or without THC_CACHING_ALLOCATOR enabled. Here was my simple test:
require('cutorch')
local t = torch.Tensor(1):cuda()
So this is good news, and we can now support a list of comma-separated identifiers with -gpuid.