I have a problem with the translation of dictionary entries (single words in general). In the middle of training (20-30k steps) the model translates them correctly, but by the end (100k steps) I get bad results.
I’ve heard that it can help to wrap dictionary entries, or just single words, in tags or some other kind of noise, to show the model that these words should be translated in a certain way in any noisy context (I think this could also help with idioms and proverbs). But so far I can’t find any research on this.
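To make the idea concrete, here is a rough sketch of the kind of augmentation I mean. It is only an illustration: the tag tokens `<term>` / `</term>`, the function names, and the choice to copy the same noise token to both source and target are my own assumptions, not something taken from a paper or a specific toolkit.

```python
import random

# Hypothetical tag tokens (an assumption): any reserved symbols that do not
# occur in the normal training text would serve the same purpose.
OPEN_TAG, CLOSE_TAG = "<term>", "</term>"


def wrap_with_tags(src, tgt):
    """Wrap both sides of a dictionary entry in the same tag pair."""
    return f"{OPEN_TAG} {src} {CLOSE_TAG}", f"{OPEN_TAG} {tgt} {CLOSE_TAG}"


def augment(pairs, noise_tokens, copies=3, seed=0):
    """Return the bare entries plus `copies` tagged, noisy variants of each.

    Assumption: the noise token is copied unchanged to the target side, so the
    only thing the model has to translate is the tagged entry itself.
    """
    rng = random.Random(seed)
    out = []
    for src, tgt in pairs:
        out.append((src, tgt))  # keep the original bare entry
        w_src, w_tgt = wrap_with_tags(src, tgt)
        for _ in range(copies):
            noise = rng.choice(noise_tokens)
            if rng.random() < 0.5:
                # noise placed before the tagged entry
                out.append((f"{noise} {w_src}", f"{noise} {w_tgt}"))
            else:
                # noise placed after the tagged entry
                out.append((f"{w_src} {noise}", f"{w_tgt} {noise}"))
    return out
```

For example, `augment([("cat", "Katze")], ["###", "..."], copies=2)` would give the bare pair plus two tagged variants with a noise token on one side. These would then be mixed into the regular training data.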
Maybe you have some ideas? Or maybe you know how to solve this?