Hi,
I’m using version 2.0 of OpenNMT-py. Now, when I try to run onmt_build_vocab, I get the error below:
ModuleNotFoundError: No module named ‘subword_nmt’.
It would be helpful if any of you could help me figure this out. Also, how do I specify the number of merge operations when using OpenNMT-py?
Hi there,
Looks like you want to use the "bpe" transform. You should install the subword-nmt module for this.
This may not be very clear in the docs indeed.
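If it helps, here is a minimal sketch for checking whether the module is visible to the Python environment that runs onmt_build_vocab. The helper name has_module is made up for illustration. The package is published on PyPI as subword-nmt, while the import name is subword_nmt.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module called `name` can be imported."""
    # find_spec returns None when no such top-level module is found.
    return importlib.util.find_spec(name) is not None


if not has_module("subword_nmt"):
    # Install into the same environment that OpenNMT-py runs in.
    print("subword_nmt is missing; try: pip install subword-nmt")
```

If this prints the message inside the environment where OpenNMT-py is installed, installing the package there should make the ModuleNotFoundError go away.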
Also, how to mention the number of merge operations while using OpenNMT-py
You need to prepare your BPE model prior to using OpenNMT-py, for instance with either subword-nmt or the OpenNMT Tokenizer.
@francoishernandez so you are saying I need to apply BPE prior to using OpenNMT-py and also use the “bpe” transform… What does it do then?
Thanks
Please do your research. There are two aspects in the BPE method.
- BPE merge operations need to be learned.
- The “bpe” transform will then apply these operations on your data.
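To make the two steps concrete, here is a toy sketch in plain Python. It is not the subword-nmt or OpenNMT-py implementation, and the names toy_learn_bpe, toy_apply_bpe and merge_pair are made up for illustration. The num_merges argument plays the role of the number of merge operations, which with the subword-nmt command line is the -s option of learn-bpe. The learned codes file is then what the “bpe” transform reads (through the src_subword_model and tgt_subword_model options, if I recall the config names correctly).

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` with one merged symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def toy_learn_bpe(words, num_merges):
    """Step 1: learn up to `num_merges` merge operations from training words."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = Counter(tuple(word) + ("</w>",) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties are broken by the pair itself
        # (a toy rule chosen only to keep the result deterministic).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def toy_apply_bpe(word, merges):
    """Step 2: segment a word by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


merges = toy_learn_bpe(["low", "low", "lower", "lowest"], num_merges=2)
print(merges)                          # [('o', 'w'), ('l', 'ow')]
print(toy_apply_bpe("lowly", merges))  # ['low', 'l', 'y', '</w>']
```

Learning happens once, on the training corpus, and produces the list of merges. Applying only replays that list on new text, which is the part the “bpe” transform does on the fly.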
Some useful refs: