I am currently working with a set of morphologically rich languages, each having a vocabulary size of about 250,000 to 300,000. I keep getting a warning similar to:
“UserWarning: Converting sparse IndexedSlices to a dense Tensor with 146618880 elements. This may consume a large amount of memory.”
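For reference, here is a minimal sketch of how I understand the warning arises (this is not my exact model; the dimensions below are just illustrative values I picked so that vocab_size * emb_dim matches the element count in the warning, 286,365 × 512 = 146,618,880):

```python
import tensorflow as tf

# Illustrative embedding table sized to match the warning's count.
vocab_size, emb_dim = 286_365, 512
embeddings = tf.Variable(tf.random.normal([vocab_size, emb_dim]))

with tf.GradientTape() as tape:
    ids = tf.constant([1, 42, 7])
    loss = tf.reduce_sum(tf.nn.embedding_lookup(embeddings, ids))

# The gradient w.r.t. the embedding table is a sparse IndexedSlices
# (only the looked-up rows are populated).
grad = tape.gradient(loss, embeddings)

# Anything that densifies it (here an explicit convert_to_tensor, but
# some optimizer/clipping code paths do this implicitly) emits the
# "Converting sparse IndexedSlices to a dense Tensor ..." warning.
dense_grad = tf.convert_to_tensor(grad)
```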
When I truncate the vocabulary to 50–60% of its original size, keeping only the most frequent words, I end up with too many unk tokens in the resulting output.
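The truncation I tried is essentially the following (a hypothetical sketch; the function name and token format are placeholders for my actual preprocessing):

```python
from collections import Counter

def truncate_vocab(tokens, keep_fraction=0.5):
    """Keep only the top keep_fraction of word types by frequency;
    map everything else to <unk>."""
    counts = Counter(tokens)
    k = int(len(counts) * keep_fraction)
    kept = {tok for tok, _ in counts.most_common(k)}
    return ["<unk>" if tok not in kept else tok for tok in tokens]

# With morphologically rich languages the frequency tail is long, so
# even keeping 50-60% of types leaves many running tokens as <unk>.
corpus = open("train.txt").read().split()  # placeholder corpus path
truncated = truncate_vocab(corpus, keep_fraction=0.6)
print(sum(t == "<unk>" for t in truncated) / len(truncated))  # unk rate
```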
How do I deal with this?
Are there any options to make OpenNMT training more memory-efficient?