OpenNMT Forum

Getting the vocabulary after preprocessing

How does one get the vocabulary of a tensor after preprocessing? I get files for the train and validation datasets and a vocab file. This vocab file is a dict of fields and TextMultiField objects. I cannot seem to find any vocabulary mapping in these files.

Hi Rajashan,

It is covered here

But I don’t get torchtext.vocab.Vocab objects, I get onmt.inputters.text_dataset.TextMultiField objects, which don’t seem to be vocabs?

Maybe there are more experienced forum members who can help you with it. This approach always worked fine for me when I opened the ‘’ file this way.
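
For what it's worth, here is a minimal sketch of how the vocab could be reached through a TextMultiField. It assumes the vocab file is a pickled dict of fields, and that each TextMultiField exposes its word-level torchtext field as `base_field` with the usual `vocab.itos` / `vocab.stoi` attributes, which is how I understand the legacy OpenNMT-py layout; please check this against your installed version. The helper names `extract_vocabs` and `load_vocabs` and the toy objects are made up for illustration and are not part of OpenNMT.

```python
from types import SimpleNamespace


def extract_vocabs(fields):
    """Return {name: vocab} for every entry in `fields` that carries a vocab."""
    vocabs = {}
    for name, field in fields.items():
        # Assumption: a TextMultiField exposes its word-level field as
        # `base_field`; a plain torchtext field carries `vocab` directly.
        base = getattr(field, "base_field", field)
        vocab = getattr(base, "vocab", None)
        if vocab is not None:
            vocabs[name] = vocab
    return vocabs


def load_vocabs(path):
    """Load a pickled fields dict with torch and pull the vocabs out of it."""
    import torch  # imported here so the toy example below runs without torch

    # Recent PyTorch releases may need weights_only=False to unpickle
    # arbitrary objects; only do that for files you trust.
    return extract_vocabs(torch.load(path))


# Toy stand-ins that mimic the assumed structure (not real OpenNMT objects).
toy_vocab = SimpleNamespace(
    itos=["<unk>", "<blank>", "hello"],
    stoi={"<unk>": 0, "<blank>": 1, "hello": 2},
)
toy_fields = {
    "src": SimpleNamespace(base_field=SimpleNamespace(vocab=toy_vocab)),
    "indices": SimpleNamespace(),  # an entry without any vocab
}

vocabs = extract_vocabs(toy_fields)
print(vocabs["src"].itos)           # index -> token list
print(vocabs["src"].stoi["hello"])  # token -> index lookup
```

With a real vocab file you would call `load_vocabs` on its path instead of building the toy dict; entries that have no vocab are simply skipped.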