Newbie here, trying to set up the default OpenNMT-py EN-DE model on my PC.
I fixed several errors/problems on my own with some research and finally got to testing the files on my GPU (the run was supposed to finish tomorrow by ~11:00 AM), but I can't seem to fix this one.
A few minutes ago I got several errors:
[2020-02-02 23:58:11,951 INFO] number of examples: 403
Traceback (most recent call last):
  File "C:\Users\?\OpenNMT-py\onmt\trainer.py", line 377, in _gradient_accumulation
    trunc_size=trunc_size)
  File "C:\Users\?\OpenNMT-py\onmt\utils\loss.py", line 165, in __call__
    for shard in shards(shard_state, shard_size):
  File "C:\Users\?\OpenNMT-py\onmt\utils\loss.py", line 381, in shards
    torch.autograd.backward(inputs, grads)
  File "C:\Users\?\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\autograd\__init__.py", line 99, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: CUDA error: unspecified launch failure
[2020-02-02 23:58:14,929 INFO] At step 32541, we removed a batch - accum 0
Traceback (most recent call last):
  File "train.py", line 6, in <module>
    main()
  File "C:\Users\?\OpenNMT-py\onmt\bin\train.py", line 204, in main
    train(opt)
  File "C:\Users\?\OpenNMT-py\onmt\bin\train.py", line 88, in train
    single_main(opt, 0)
  File "C:\Users\?\OpenNMT-py\onmt\train_single.py", line 143, in main
    valid_steps=opt.valid_steps)
  File "C:\Users\?\OpenNMT-py\onmt\trainer.py", line 244, in train
    report_stats)
  File "C:\Users\?\OpenNMT-py\onmt\trainer.py", line 399, in _gradient_accumulation
    self.optim.step()
  File "C:\Users\?\OpenNMT-py\onmt\utils\optimizers.py", line 359, in step
    clip_grad_norm_(group['params'], self._max_grad_norm)
  File "C:\Users\?\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\utils\clip_grad.py", line 32, in clip_grad_norm_
    param_norm = p.grad.data.norm(norm_type)
  File "C:\Users\?\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\tensor.py", line 339, in norm
    return torch.norm(self, p, dim, keepdim, dtype=dtype)
  File "C:\Users\?\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\functional.py", line 747, in norm
    return torch._C._VariableFunctions.norm(input, p)
RuntimeError: CUDA error: unspecified launch failure
C:\Users\?\OpenNMT-py>.git
'.git' is not recognized as an internal or external command,
operable program or batch file.
I trained the model after deleting some data from the files, because the full dataset was too large and I just wanted to test with fewer sentences (to see if I could get it to work, and later use other data for an EN-ES language pair). I trained it using:
onmt_train -data data/demo -save_model demo-model
I then went on to run the translation and it worked. Obviously, the results are really bad, but it did work.
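For anyone wanting to reproduce the smaller dataset: cutting the data down only works if the source and target files stay aligned line by line. Below is a minimal sketch of one way to do that, not necessarily how it was done here. The function name `trim_parallel` and the file names in the usage comment are hypothetical.

```python
from itertools import islice


def trim_parallel(src_in, tgt_in, src_out, tgt_out, max_lines):
    """Copy the first `max_lines` sentence pairs from two parallel files.

    Lines are read in lockstep so that line i of the source output still
    corresponds to line i of the target output. Returns the number of
    pairs written.
    """
    written = 0
    with open(src_in, encoding="utf-8") as fs, \
         open(tgt_in, encoding="utf-8") as ft, \
         open(src_out, "w", encoding="utf-8") as out_s, \
         open(tgt_out, "w", encoding="utf-8") as out_t:
        # zip stops at the shorter file; islice caps the number of pairs.
        for src_line, tgt_line in islice(zip(fs, ft), max_lines):
            out_s.write(src_line)
            out_t.write(tgt_line)
            written += 1
    return written


# Hypothetical usage (file names are placeholders):
# trim_parallel("src-train.txt", "tgt-train.txt",
#               "src-train.small.txt", "tgt-train.small.txt", 10000)
```

Trimming both files with the same line count keeps every remaining source sentence paired with its translation; deleting lines from only one file by hand would silently misalign the corpus.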
Can anyone help? Thank you in advance.