Here is my .yml file:
# toy_en_de.yaml

# Where the samples will be written
save_data: run/example
# Where the vocab(s) will be written
src_vocab: run/example.vocab.src
tgt_vocab: run/example.vocab.tgt
# Prevent overwriting existing files in the folder
overwrite: False

# Corpus opts:
data:
    corpus_1:
        path_src: src-train.txt
        path_tgt: tgt-train.txt
    valid:
        path_src: src-val.txt
        path_tgt: tgt-val.txt
…
My working directory is: C:\Users\Dhar_7\OpenNMT-py\toy-ende>
I ran the following command: onmt_build_vocab -config toy_en_de.yml -n_sample 10000
I got the following error: UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 0: character maps to <undefined>
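
For reference, the same kind of failure can be reproduced outside OpenNMT-py with a few lines of plain Python. This is only a sketch of what I think is happening: it assumes the 'charmap' codec in the message is the Windows default code page cp1252, and the helper name `can_encode` is made up for illustration, not part of any library.

```python
# Minimal reproduction, independent of OpenNMT-py.
# Assumption: the 'charmap' codec in the traceback is cp1252,
# the usual default code page on Western-locale Windows.
REPLACEMENT_CHAR = "\ufffd"  # the character named in the error message


def can_encode(text: str, encoding: str) -> bool:
    """Return True if `text` can be encoded with `encoding` (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# cp1252 has no mapping for U+FFFD, so encoding fails; UTF-8 can encode it.
print("cp1252:", can_encode(REPLACEMENT_CHAR, "cp1252"))
print("utf-8: ", can_encode(REPLACEMENT_CHAR, "utf-8"))
```

If that assumption holds, the error would come from text being written with the platform default encoding rather than UTF-8. Python's UTF-8 mode (the `PYTHONUTF8=1` environment variable, Python 3.7+) changes that default, but I have not confirmed that it resolves this for onmt_build_vocab.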