OpenNMT Forum

UnicodeEncodeError in the tutorial

Hi, thanks for clicking.

I was following the OpenNMT-py tutorial on GitHub, but I ran into a problem. In step 1, I ran onmt_build_vocab -config toy_en_de.yaml -n_sample 10000, but something went wrong with tgt. The error is:
UnicodeEncodeError: 'cp949' codec can't encode character '\ufffd' in position 0: illegal multibyte sequence
How can I fix this?

I think there is an encoding mismatch between your data (probably utf-8) and the default encoding of your system (cp949 it seems).
There probably are some implicit operations that should be done explicitly in utf-8.
Can you post the whole trace to properly identify where this is triggered?
Thanks.
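
For illustration, here is a minimal sketch of that kind of mismatch in Python. It only assumes that the data files are UTF-8 and that the system default is cp949, as the error message suggests. The helper name can_encode and the sample file are made up for the example and are not part of OpenNMT-py.

```python
import locale
import os
import tempfile


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding` (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# The encoding Python falls back to when open() is called without encoding=...
print("system default encoding:", locale.getpreferredencoding(False))

# U+FFFD is the character named in the error message above.
sample = "\ufffd"
print("cp949 can encode it:", can_encode(sample, "cp949"))
print("utf-8 can encode it:", can_encode(sample, "utf-8"))

# Passing the encoding explicitly avoids depending on the system default.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(sample + " Guten Tag\n")
with open(path, encoding="utf-8") as f:
    roundtrip = f.read()
print("round trip ok:", roundtrip == sample + " Guten Tag\n")
```

If the failing call cannot be edited, starting Python in UTF-8 mode (setting the environment variable PYTHONUTF8=1 before running the command, Python 3.7 or later) makes open() default to UTF-8. Whether that is enough here depends on where the error is actually raised, which the full trace would show.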

Thanks for your answer. I solved that error by following your suggestion, and got through step 1.
But I ran into another problem in step 2.
The error is:
AttributeError: module 'torch._C' has no attribute '_cuda_setDevice'
I think this is a GPU problem.
How can I solve it?

I don't know much about using PyTorch with CUDA on Windows. You can probably start by trying this in a Python shell:

import torch
torch.cuda.is_available()

and see what you get.
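
A slightly longer version of the same check, as a rough sketch: it also prints the CUDA version PyTorch was built with. One common cause of this kind of AttributeError is a CPU-only PyTorch build, in which case torch.version.cuda is None, but that is only a guess for this setup. The function name cuda_report is made up for the example.

```python
def cuda_report():
    """Collect basic facts about the local PyTorch install (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment at all.
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None means this PyTorch build was compiled without CUDA support.
        "built_with_cuda": torch.version.cuda,
        # False means no usable GPU (CPU-only build, missing driver, ...).
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }


report = cuda_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

If built_with_cuda comes back as None, the installed package is a CPU-only build and would need to be replaced with a CUDA-enabled one before training on GPU.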