When I use OpenNMT-tf for a Chinese-to-Chinese task, the model generates a lot of unk tokens and the performance is poor. How can I deal with this? Is there an unk_replace option?
What type of model are you training?
If you are specifically looking for a copy mechanism, it is implemented in OpenNMT-py under the -copy_attn flag.
Hello,
Thank you in advance.
The model I am using is nmt_medium, and my project is based on TensorFlow.
Oh, before your edit it was not clear that the issue concerns unknown words. Do you use some kind of subword tokenization, like SentencePiece or BPE?
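To illustrate why I am asking: with a fixed word-level vocabulary, any word not in the vocabulary becomes unk, whereas splitting rare words into smaller units lets the model still see something meaningful. Below is a toy sketch of that idea in plain Python. It is not SentencePiece or BPE, only a character-level fallback, and the vocabularies and function names are made up for the example.

```python
UNK = "<unk>"

def encode(tokens, vocab):
    """Map every out-of-vocabulary token to the unknown symbol."""
    return [tok if tok in vocab else UNK for tok in tokens]

def split_unknown_to_chars(tokens, vocab):
    """Keep known words whole; break unknown words into single characters."""
    pieces = []
    for tok in tokens:
        if tok in vocab:
            pieces.append(tok)
        else:
            pieces.extend(tok)  # iterating a str yields its characters
    return pieces

# Hypothetical vocabularies, for illustration only.
word_vocab = {"我", "喜欢", "北京"}
char_vocab = word_vocab | {"上", "海"}

sentence = ["我", "喜欢", "上海"]  # "上海" is not in word_vocab

# Word-level lookup: the unseen word collapses to <unk>.
word_level = encode(sentence, word_vocab)

# Character fallback: the unseen word is covered by its characters.
char_fallback = encode(split_unknown_to_chars(sentence, word_vocab), char_vocab)

print(word_level)
print(char_fallback)
```

A real subword model learns its units from the training data instead of falling back to single characters, but the effect on unk is the same in spirit: far fewer tokens fall outside the vocabulary.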
I left the tokenization as it was; I only added a pre-trained word2vec embedding (Chinese).
Hi, could you tell me when the copy mechanism will be introduced into OpenNMT-tf? Thank you very much.
Hello,
I don’t know of anyone working on this, or willing to work on it, so I can’t give any ETA for now.