OpenNMT Forum

'charmap' codec can't encode character '\ufffd' in position 0: character maps to <undefined>

Here is my .yml file:

# toy_en_de.yaml
# Where the samples will be written
save_data: run/example

# Where the vocab(s) will be written
src_vocab: run/example.vocab.src
tgt_vocab: run/example.vocab.tgt

# Prevent overwriting existing files in the folder
overwrite: False

# Corpus opts:
data:
    corpus_1:
        path_src: src-train.txt
        path_tgt: tgt-train.txt
    valid:
        path_src: src-val.txt
        path_tgt: tgt-val.txt

My working directory is: C:\Users\Dhar_7\OpenNMT-py\toy-ende>

I ran the following command: onmt_build_vocab -config toy_en_de.yml -n_sample 10000

I got the following error: UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 0: character maps to <undefined>
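
For reference, the same message can be reproduced outside OpenNMT-py. This is a minimal sketch, assuming the failing write used a Windows code page such as cp1252 (the post does not say which codec was active, so cp1252 is an assumption). U+FFFD is the Unicode replacement character; a charmap codec has no slot for it, while UTF-8 does. The helper name try_encode is hypothetical.

```python
# U+FFFD is the Unicode replacement character named in the error message.
REPLACEMENT_CHAR = "\ufffd"


def try_encode(text: str, encoding: str) -> str:
    """Return 'ok' if text can be encoded, otherwise the error message."""
    try:
        text.encode(encoding)
        return "ok"
    except UnicodeEncodeError as exc:
        return str(exc)


# cp1252 is an assumed stand-in for the Windows default code page.
cp1252_result = try_encode(REPLACEMENT_CHAR, "cp1252")
utf8_result = try_encode(REPLACEMENT_CHAR, "utf-8")

print("cp1252:", cp1252_result)
print("utf-8: ", utf8_result)
```

If this is what happens here, the character was probably produced earlier, when some input was decoded with replacement of invalid bytes, and the error only surfaces when the text is written back out with a non-UTF-8 codec. That is a guess from the message alone, not something the traceback above confirms.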

Note: I created the YAML file in Notepad++ and set the encoding to UTF-8.
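
Since Notepad++ offers both "UTF-8" and "UTF-8-BOM", one way to double-check what was actually saved is to inspect the raw bytes. The sketch below is only a diagnostic aid: inspect_encoding is a hypothetical helper, and the demo runs on a throwaway file rather than the real toy_en_de.yml or corpus files. Whether a BOM or an invalid byte is the cause of this error is not established by the post.

```python
import codecs
import tempfile
from pathlib import Path


def inspect_encoding(path):
    """Report whether a file starts with a UTF-8 BOM and decodes as UTF-8."""
    raw = Path(path).read_bytes()
    has_bom = raw.startswith(codecs.BOM_UTF8)
    try:
        raw.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {"has_bom": has_bom, "valid_utf8": valid_utf8}


# Demo on a temporary file that deliberately starts with a BOM.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.yaml"
    sample.write_bytes(codecs.BOM_UTF8 + b"save_data: run/example\n")
    report = inspect_encoding(sample)

print(report)
```

The same function could be pointed at the config file and at src-train.txt, tgt-train.txt, src-val.txt and tgt-val.txt to see whether any of them contains bytes that are not valid UTF-8.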

Duplicate of: