OpenNMT Forum

Dynamic dictionary in the preprocessing step

Hi,
I want to train a model with -copy_attn and -copy_attn_force.
According to this discussion, I should also enable -share_vocab and -dynamic_dict in the preprocessing step.

However, enabling -dynamic_dict makes preprocess.py fail with a KeyError:

Traceback (most recent call last):
  File "preprocess.py", line 158, in <module>
    main(opt)
  File "preprocess.py", line 136, in main
    'train', fields, src_reader, tgt_reader, opt)
  File "preprocess.py", line 63, in build_save_dataset
    filter_pred=filter_pred
  File "/zai/OpenNMT-APE/onmt/inputters/dataset_base.py", line 126, in __init__
    ex_dict, src_field.base_field, tgt_field.base_field)
  File "/zai/OpenNMT-APE/onmt/inputters/dataset_base.py", line 57, in _dynamic_dict
    [unk_idx] + [src_ex_vocab.stoi[w] for w in tgt] + [unk_idx])
  File "/zai/OpenNMT-APE/onmt/inputters/dataset_base.py", line 57, in <listcomp>
    [unk_idx] + [src_ex_vocab.stoi[w] for w in tgt] + [unk_idx])
KeyError: 'va'

Does the dynamic_dict flag require any other files apart from train and dev? Any additional vocabulary files?

@pltrdy is this error familiar?

I think there’s a problem with torchtext here. Please upgrade it and try again. Vocab.stoi should not raise a KeyError: for a token that is not in the vocabulary, it should return the index of the unk token.
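
To illustrate the expected behaviour: a mapping that falls back to the unk index never raises on an unseen token, while a plain dict does. The sketch below uses only the standard library, not torchtext itself. build_stoi and its with_default parameter are made up for this illustration, and the sentences are toy data.

```python
from collections import Counter, defaultdict

UNK, PAD = "<unk>", "<pad>"


def build_stoi(tokens, with_default=True):
    """Hypothetical helper: map tokens to indices, specials first.

    With with_default=True the mapping falls back to the <unk> index
    for unseen tokens; otherwise it is a plain dict that raises KeyError.
    """
    itos = [UNK, PAD] + [w for w, _ in Counter(tokens).most_common()]
    mapping = {w: i for i, w in enumerate(itos)}
    if not with_default:
        return mapping
    unk_idx = mapping[UNK]
    stoi = defaultdict(lambda: unk_idx)
    stoi.update(mapping)
    return stoi


src = "das ist ein test".split()
tgt = "das ist va".split()  # 'va' does not occur in the source sentence

# Fallback mapping: the unseen token resolves to the <unk> index.
with_fallback = build_stoi(src)
ids = [with_fallback[w] for w in tgt]

# Plain dict: the same lookup fails, as in the traceback above.
without_fallback = build_stoi(src, with_default=False)
missing = None
try:
    [without_fallback[w] for w in tgt]
except KeyError as err:
    missing = err.args[0]

print(ids, missing)
```

If the lookup in your environment behaves like the second case, the per-example vocabulary is not falling back to unk for missing tokens, which matches the KeyError you are seeing.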

Thank you. I am using torchtext 0.4.0 and PyTorch 1.0.0, because other dependencies in the project require these versions.

Is there an alternative to upgrading?

Then you should open an issue on the torchtext repo.
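
In the meantime, one possible stopgap when the versions are pinned is to make the lookup itself defensive. The sketch below mirrors the expression from line 57 of dataset_base.py in the traceback, but uses .get with an explicit default instead of indexing. align_to_source is a hypothetical helper, the vocabulary is toy data, and this has not been tested against OpenNMT itself.

```python
def align_to_source(tgt_tokens, src_stoi, unk_idx=0):
    """Hypothetical helper: index each target token in the per-example
    source vocabulary, mapping unseen tokens to unk_idx.

    The result is padded with unk_idx at both ends, like the expression
    shown in the traceback.
    """
    # .get never raises, whether src_stoi is a dict or a defaultdict.
    return [unk_idx] + [src_stoi.get(w, unk_idx) for w in tgt_tokens] + [unk_idx]


# Toy per-example source vocabulary (an assumption for illustration).
src_stoi = {"<unk>": 0, "<pad>": 1, "das": 2, "ist": 3}
mask = align_to_source(["das", "va", "ist"], src_stoi)
print(mask)
```

This only hides the symptom in one place. If stoi raises elsewhere in the pipeline, the underlying version mismatch still needs to be resolved.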