In some specific domains, certain vocabulary is prone to translation errors even when the model has been trained on the domain corpus. How can I set a weight for specific vocabulary to improve translation quality?
Suggestion: duplicate the sentences containing the specific vocab N times in the training set.
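To illustrate the duplication idea, here is a minimal sketch in Python. It assumes the training set is held in memory as (source, target) pairs and that a plain case-insensitive substring match is enough to detect the vocabulary. The function name `oversample`, the `n_copies` parameter, and the sample sentences are all hypothetical and not part of any toolkit.

```python
def oversample(pairs, terms, n_copies=5):
    """Return a new list of (source, target) pairs in which every pair
    whose source contains one of `terms` appears `n_copies` times in total.
    Matching is a naive case-insensitive substring test (an assumption)."""
    lowered = [t.lower() for t in terms]
    out = []
    for src, tgt in pairs:
        hit = any(t in src.lower() for t in lowered)
        # Pairs without the vocabulary are kept once, unchanged.
        out.extend([(src, tgt)] * (n_copies if hit else 1))
    return out


# Hypothetical toy data, for illustration only.
pairs = [
    ("the stent was placed", "le stent a été posé"),
    ("the weather is nice", "il fait beau"),
]
boosted = oversample(pairs, ["stent"], n_copies=3)
```

The boosted list would then be written back out as the training files. Substring matching can over-match (a short term inside a longer word), so a tokenized match may be preferable on real data.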
How many copies of those sentences would be needed in your view to override the translations contained in a baseline?
The ideal would be to run some tests and see how the results evolve. To give those sentences more weight relative to the others, try something like N between 5 and 10. Then run a BLEU evaluation on both the whole test set and the sentences containing the specific vocab, and compare the scores with the previous values.
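For the two-way evaluation, one possible way to isolate the specific-vocab sentences from a test set is sketched below in Python. It assumes parallel lists of source sentences, system outputs, and references, and again uses a naive substring match. The name `split_by_terms` and the sample data are hypothetical. The BLEU scoring itself is left to whatever tool is already in use.

```python
def split_by_terms(sources, hypotheses, references, terms):
    """Return (hypotheses, references) restricted to the test sentences
    whose source contains at least one of `terms` (case-insensitive)."""
    lowered = [t.lower() for t in terms]
    keep = [
        i for i, src in enumerate(sources)
        if any(t in src.lower() for t in lowered)
    ]
    return [hypotheses[i] for i in keep], [references[i] for i in keep]


# Hypothetical toy test set, for illustration only.
sources = ["the stent was placed", "the weather is nice"]
hypotheses = ["le stent a été placé", "il fait beau"]
references = ["le stent a été posé", "il fait beau"]

vocab_hyps, vocab_refs = split_by_terms(sources, hypotheses, references, ["stent"])
# Score (hypotheses, references) for the whole set and
# (vocab_hyps, vocab_refs) for the subset, before and after duplication.
```

Comparing both scores helps check that the gain on the specific vocab does not come at the cost of the overall quality.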
Suggestion 2:
Thank you very much, this is very useful to me.