I am running OpenNMT-py with batch_size set to 2000. Training runs fine, but the validation step fails with RuntimeError: “CUDA out of memory” when valid_batch_size is also 2000. It only works if I reduce valid_batch_size to 30. Why does the validation step run out of memory even with a valid_batch_size of ~50? Is it holding onto the batch from training during the validation step? If so, why?
Any suggestions would be much appreciated.