OpenNMT Forum

onmt_preprocess: command not found

I get an error when I run the following commands in order on Colab:
!git clone https://github.com/OpenNMT/OpenNMT-py.git
cd OpenNMT-py
!python setup.py install
! pip install -r requirements.opt.txt
!wget -O data/im2text.tgz http://lstm.seas.harvard.edu/latex/im2text_small.tgz; tar zxf data/im2text.tgz -C data/
!onmt_preprocess -data_type img \
    -src_dir data/im2text/images/ \
    -train_src data/im2text/src-train.txt \
    -train_tgt data/im2text/tgt-train.txt -valid_src data/im2text/src-val.txt \
    -valid_tgt data/im2text/tgt-val.txt -save_data data/im2text/demo \
    -tgt_seq_length 150 \
    -tgt_words_min_frequency 2 \
    -shard_size 500 \
    -image_channel_size 1
/bin/bash: onmt_preprocess: command not found
I really need your help! This problem has bothered me for a long time.

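You need to use v1.2.0. The onmt_preprocess command was removed in OpenNMT-py 2.0, so installing from the current master branch does not provide it.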
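A quick way to check whether an install actually put the command-line entry points on the PATH is to look them up from Python. This is only a sketch: the helper name missing_commands is made up for illustration, and the two command names are simply the ones used in this thread.

```python
import shutil


def missing_commands(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


# Command names taken from this thread; an empty list means both were found.
print(missing_commands(["onmt_preprocess", "onmt_train"]))
```

If onmt_preprocess is still listed after installing, the installed version most likely does not ship that command.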

Thanks for your guidance. I have solved the previous problem, but when I run
!onmt_train -model_type img \
    -data data/im2text/demo \
    -save_model demo-model \
    -gpu_ranks 0 \
    -batch_size 20 \
    -max_grad_norm 20 \
    -learning_rate 0.1 \
    -word_vec_size 80 \
    -encoder_type brnn \
    -image_channel_size 1

I ran into this problem:
File "/usr/local/lib/python3.7/dist-packages/OpenNMT_py-1.2.0-py3.7.egg/onmt/inputters/inputter.py", line 197, in patch_fields
    maybe_cid_field = dvocab.get('corpus_id', None)
AttributeError: 'list' object has no attribute 'get'
I am a novice. Thank you so much for guiding me so patiently!

I think dvocab = torch.load(opt.data + '.vocab.pt') should return a dict,
and I do not know what "corpus_id" specifically means.

corpus_id is a field added to track which corpus each example comes from.
The issue here is probably that your vocab is indeed not a dict. You could try loading it manually in a Python shell to inspect it.
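For the manual inspection, something along these lines could work. It is a sketch, not part of OpenNMT-py: describe_vocab and load_and_describe are hypothetical helper names, and the path data/im2text/demo.vocab.pt is an assumption based on the -save_data value used earlier in the thread.

```python
def describe_vocab(obj):
    """Summarise the top-level shape of a loaded vocab object."""
    if isinstance(obj, dict):
        return "dict with keys: " + ", ".join(sorted(str(k) for k in obj))
    if isinstance(obj, (list, tuple)):
        # Collect the first element of every 2-tuple, e.g. ('tgt', <Vocab>).
        names = [item[0] for item in obj
                 if isinstance(item, tuple) and len(item) == 2]
        return "list of pairs with names: " + ", ".join(str(n) for n in names)
    return "unexpected type: " + type(obj).__name__


def load_and_describe(path):
    """Load a .vocab.pt file with PyTorch and describe it.

    Assumes PyTorch is installed and the file is trusted, since
    torch.load unpickles its contents.
    """
    import torch  # imported lazily so describe_vocab works without PyTorch
    return describe_vocab(torch.load(path))
```

Calling load_and_describe("data/im2text/demo.vocab.pt") in the same environment used for training should show at a glance whether the file holds a dict or a list of pairs.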

Can you tell me the correct format of the dict dvocab?
Maybe I can generate it manually.
Its current form is [('tgt', <torchtext.vocab.Vocab object at 0x7fa7a9d9e4d0>)]
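The reported shape, a list holding one (name, vocab) pair, is enough to reproduce the AttributeError without OpenNMT-py at all. The sketch below uses a plain string as a stand-in for the real torchtext Vocab object, so it only illustrates the list versus dict difference and says nothing about what the training code expects inside the dict.

```python
# Stand-in for the reported object: a list holding one (name, vocab) pair.
# A plain string replaces the real torchtext Vocab, which is not needed here.
reported = [("tgt", "<Vocab placeholder>")]

try:
    reported.get("corpus_id", None)
    message = None
except AttributeError as err:
    message = str(err)

print(message)

# A dict supports .get and returns the default when the key is absent.
as_dict = dict(reported)
print(as_dict.get("corpus_id", None))
```

Converting the list to a dict only makes the .get call succeed. Whether the result is a valid vocab for onmt_train is a separate question, so re-running the preprocessing step with the same installed version may be the safer route.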

Looks like you have some sort of mismatch somewhere.
I just ran the demo commands from here in a clean virtualenv (with OpenNMT-py==1.2.0 freshly installed) and it all went fine.