OpenNMT Forum

Illegal memory access error during validation

Hi, I'm trying to run the onmt_train command on a GPU:
onmt_train -data data/demo -save_model demo-model --train_steps 10000 --valid_steps 1000

Training runs fine, but the validation step throws the error below. Please help me solve it. Thanks.

THCudaCheck FAIL file=/pytorch/aten/src/THC/THCReduceAll.cuh line=327 error=700 : an illegal memory access was encountered
Traceback (most recent call last):
  File "/var/nvidia/soumya/anaconda3/envs/openNMT/bin/onmt_train", line 8, in <module>
    sys.exit(main())
  File "/var/nvidia/soumya/anaconda3/envs/openNMT/lib/python3.7/site-packages/onmt/bin/train.py", line 206, in main
    train(opt)
  File "/var/nvidia/soumya/anaconda3/envs/openNMT/lib/python3.7/site-packages/onmt/bin/train.py", line 88, in train
    single_main(opt, 0)
  File "/var/nvidia/soumya/anaconda3/envs/openNMT/lib/python3.7/site-packages/onmt/train_single.py", line 143, in main
    valid_steps=opt.valid_steps)
  File "/var/nvidia/soumya/anaconda3/envs/openNMT/lib/python3.7/site-packages/onmt/trainer.py", line 280, in train
    valid_iter, moving_average=self.moving_average)
  File "/var/nvidia/soumya/anaconda3/envs/openNMT/lib/python3.7/site-packages/onmt/trainer.py", line 342, in validate
    _, batch_stats = self.valid_loss(batch, outputs, attns)
  File "/var/nvidia/soumya/anaconda3/envs/openNMT/lib/python3.7/site-packages/onmt/utils/loss.py", line 162, in __call__
    loss, stats = self._compute_loss(batch, **shard_state)
  File "/var/nvidia/soumya/anaconda3/envs/openNMT/lib/python3.7/site-packages/onmt/utils/loss.py", line 299, in _compute_loss
    stats = self._stats(loss.clone(), scores, gtruth)
  File "/var/nvidia/soumya/anaconda3/envs/openNMT/lib/python3.7/site-packages/onmt/utils/loss.py", line 183, in _stats
    num_correct = pred.eq(target).masked_select(non_padding).sum().item()
RuntimeError: cuda runtime error (700) : an illegal memory access was encountered at /pytorch/aten/src/THC/THCReduceAll.cuh:327

Can anyone suggest a fix for this? Thanks.

What PyTorch version are you using?
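
If it helps, here is a small sketch for collecting the version details, run inside the same conda environment that onmt_train uses. The function name describe_environment is just a hypothetical helper for this post, not part of OpenNMT or PyTorch; the torch attributes it reads are standard ones.

```python
import platform


def describe_environment():
    """Collect version details that are useful when reporting a CUDA error."""
    info = {"python": platform.python_version()}
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in the active environment.
        info["torch"] = None
        return info

    info["torch"] = torch.__version__
    # CUDA version PyTorch was built against; None for CPU-only builds.
    info["cuda_build"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        # Name of the first visible GPU (device index 0 is an assumption).
        info["gpu"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Please post the output. Also note that CUDA kernels run asynchronously, so the Python line shown in the traceback is not necessarily where the fault happened. Rerunning with the environment variable CUDA_LAUNCH_BLOCKING=1 set usually gives a more accurate location.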