An error was encountered while running the pretrained model

I am using the pretrained En-De model with the following command:

python translate.py -model wmt_ende_sp/model/averaged-10-epoch.pt -src data/test.en -tgt data/test.de -verbose

and I got the following error:

Traceback (most recent call last):
  File "D:\anaconda\envs\opennmt\Scripts\onmt_translate-script.py", line 33, in <module>
    sys.exit(load_entry_point('OpenNMT-py', 'console_scripts', 'onmt_translate')())
  File "f:\desktop\opennmt-py\opennmt-py-master\onmt\bin\translate.py", line 60, in main
    translate(opt)
  File "f:\desktop\opennmt-py\opennmt-py-master\onmt\bin\translate.py", line 23, in translate
    translator = build_translator(opt, logger=logger,
  File "f:\desktop\opennmt-py\opennmt-py-master\onmt\translate\translator.py", line 31, in build_translator
    vocabs, model, model_opt = load_test_model(opt)
  File "f:\desktop\opennmt-py\opennmt-py-master\onmt\model_builder.py", line 90, in load_test_model
    checkpoint = torch.load(model_path,
  File "D:\anaconda\envs\opennmt\lib\site-packages\torch\serialization.py", line 712, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "D:\anaconda\envs\opennmt\lib\site-packages\torch\serialization.py", line 1049, in _load
    result = unpickler.load()
  File "D:\anaconda\envs\opennmt\lib\site-packages\torch\serialization.py", line 1042, in find_class
    return super().find_class(mod_name, name)
ModuleNotFoundError: No module named 'onmt.inputters.text_dataset'

In addition, I would like to know how to use the SentencePiece model, as I have not used it before.
I hope you can help me.

You probably need to convert the checkpoint to OpenNMT-py v3, as I don’t think that tutorial’s checkpoints have been updated/converted.

There’s a script in the OpenNMT-py repo which can do that, or you can downgrade OpenNMT-py to v2 if the conversion is too much grief.
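For context on why the error looks like a missing module: `torch.load` ultimately runs an unpickler, and a v2 checkpoint stores references to classes under `onmt.inputters.text_dataset`, a module path that no longer exists in v3. A minimal stdlib-only sketch of the same failure mode (the module path is copied from the traceback above; no OpenNMT install is assumed):

```python
import pickle

# Hand-assembled pickle stream: PROTO 2 (\x80\x02), then a GLOBAL opcode ('c')
# referencing TextDataset under onmt.inputters.text_dataset, then STOP ('.').
# Unpickling must import that module to resolve the class -- exactly what
# torch.load() does when it restores a v2 checkpoint under a v3 install.
stream = b"\x80\x02conmt.inputters.text_dataset\nTextDataset\n."

try:
    pickle.loads(stream)
except ModuleNotFoundError as exc:
    print("unpickling failed:", exc)
```

Converting the checkpoint (or downgrading) works because it makes the module paths stored in the pickle match the installed package again.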


This is the right answer.
Also, I will release a new EN-DE model in early 2023.

Thank you very much, I will try it.

I just released a new v3 model.

Scores are much higher than in the paper replication because the model was trained on more data.

However, no back-translation has been used for this model.

You need to use the pyonmttok package with the BPE model provided on the same page.

https://opennmt.net/Models-py/


I installed v3.0 on Kaggle and it worked fine for training. But when I try to translate with a resulting model using onmt_translate, I get ModuleNotFoundError: No module named 'onmt.inputters.text_dataset'

I suppose the script mentioned is tools/convertv2_v3.py, but when I run it in Kaggle I get an indication that the path is wrong. I tried wgetting it from GitHub, which worked, but then I ran into No module named 'pyonmttok', so I guess that’s not the solution.

It also puzzles me that I created my models with version 3.0 and want to translate them with the same version, and yet it doesn’t work. So I must be doing something wrong. Can anyone help? Thanks.

I am not sure what you are trying to do.

This message relates to v2.

If you use v3 for both training and inference, you should not see this message.

Are you trying to run inference with an old model?
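As a side note: the No module named 'pyonmttok' you hit when running tools/convertv2_v3.py is just a missing dependency of the script, not a sign the script itself is wrong. A sketch of the fix, assuming pip is available in the Kaggle environment (check the script’s --help for its actual flags):

```shell
# pyonmttok is required by the conversion script; install it first
pip install pyonmttok

# then run the converter from the OpenNMT-py repo root
# (see its --help output for the exact flags)
python tools/convertv2_v3.py --help
```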

That is also what I don’t understand. I will rerun everything to make sure.

Why don’t you use @ymoslem’s tutorial?

Yes, thanks. I started from that tutorial and also from Learning Resources - DataLitMT, but I wanted to run things in Kaggle, as Colab temporarily kicked me off the GPU.

The rerun works fine; I must have mistakenly used old models. Thanks for the quick response!