OpenNMT Forum

Data Preparation toy_en_de.yaml

I am following the quickstart tutorial on Google Colab, but when I try to build the vocab from the data using the YAML config file, I get the error below:
Corpus corpus_1's weight should be given. We default it to 1 for you.
Traceback (most recent call last):
  File "/usr/local/bin/onmt_build_vocab", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/onmt/bin/build_vocab.py", line 63, in main
    build_vocab_main(opts)
  File "/usr/local/lib/python3.6/dist-packages/onmt/bin/build_vocab.py", line 23, in build_vocab_main
    ArgumentParser.validate_prepare_opts(opts, build_vocab_only=True)
  File "/usr/local/lib/python3.6/dist-packages/onmt/utils/parse.py", line 130, in validate_prepare_opts
    cls._validate_vocab_opts(opt, build_vocab_only=build_vocab_only)
  File "/usr/local/lib/python3.6/dist-packages/onmt/utils/parse.py", line 98, in _validate_vocab_opts
    cls._validate_file(opt.src_vocab, info='src vocab')
  File "/usr/local/lib/python3.6/dist-packages/onmt/utils/parse.py", line 18, in _validate_file
    raise IOError(f"Please check path of your {info} file!")
OSError: Please check path of your src vocab file!

How can I fix this?

There is probably an issue with one of the paths in your config. It is hard to say more without more details.
Anyway, I just tested it and it seems OK: https://colab.research.google.com/drive/1pdEK_LufBQChnFq-EX27wVC7rmgeuHf2?usp=sharing
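
One way to narrow it down, as a rough sketch: list the paths from your config and check which ones exist relative to the notebook's working directory. The helper name missing_paths and the example paths are assumptions for illustration (they are guessed from the quickstart layout), so substitute the paths from your own toy_en_de.yaml.

```python
import os
from pathlib import Path


def missing_paths(paths, base="."):
    """Return the entries of `paths` that do not exist relative to `base`."""
    base = Path(base)
    return [p for p in paths if not (base / p).exists()]


# Hypothetical paths, assumed from the quickstart layout; replace with yours.
candidate_paths = [
    "toy-ende/src-train.txt",
    "toy-ende/tgt-train.txt",
    "toy-ende/run",
]

# Relative paths in the config are resolved from here.
print("Working directory:", os.getcwd())
for p in missing_paths(candidate_paths):
    print("Missing:", p)
```

If anything is reported as missing, either the data was not downloaded into that location or the command is being run from a different directory than the config expects.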

You must be using Windows. You should create a "run" folder and create two blank files in it, namely example.vocab.src and example.vocab.tgt.
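
A minimal sketch of that workaround in Python, which works the same on Windows and on Colab. The function name create_blank_vocab_files is made up for illustration, and the location of the "run" folder is an assumption: it has to match the src_vocab and tgt_vocab paths in your YAML config, so adjust run_dir if yours sits inside another folder.

```python
from pathlib import Path


def create_blank_vocab_files(run_dir="run", prefix="example"):
    """Create `run_dir` if needed and make empty <prefix>.vocab.src/.tgt files in it."""
    run = Path(run_dir)
    run.mkdir(parents=True, exist_ok=True)
    created = []
    for suffix in ("vocab.src", "vocab.tgt"):
        vocab_file = run / f"{prefix}.{suffix}"
        # touch() creates an empty file and leaves an existing one untouched
        vocab_file.touch(exist_ok=True)
        created.append(vocab_file)
    return created


for vocab_file in create_blank_vocab_files():
    print("Ready:", vocab_file)
```

After this, rerun onmt_build_vocab with the same config; the path check on the src vocab file should no longer fail if the folder location matches the config.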