Hi there,
I’m building translation models from English and Hebrew into Yiddish. Would it be better to train two separate models (EN-YI and HE-YI) or one combined model?
Yiddish uses the Hebrew alphabet, and up to 20% of Yiddish words have their roots in Hebrew. On the other hand, Yiddish is fundamentally a Germanic language, and its sentence structure and most of its vocabulary are much closer to English than to Hebrew. That’s why I thought that combining the two would have a “the whole is greater than the sum of its parts” effect. Does that make sense?
Assuming I go the combined-model route, is there anything special I need to do with the corpus? Can I just merge the two parallel corpora (EN-YI and HE-YI) into one, given that the source languages use different alphabets (so there is no room for confusion)?
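To make the question concrete, here is a minimal sketch of what I mean by "just combining" the corpora. The function name and the toy sentence pairs are made up for illustration, and it assumes each corpus is already loaded as a list of (source, Yiddish) pairs. It adds no language tag, on the assumption that the target side is always Yiddish:

```python
import random


def combine_parallel_corpora(en_yi, he_yi, seed=0):
    """Merge two lists of (source, target) pairs into one shuffled corpus.

    Hypothetical helper: no source-language tag is added, on the
    assumption that the target is always Yiddish and the two source
    scripts are distinct.
    """
    combined = list(en_yi) + list(he_yi)
    # Shuffle pairs together so source and target lines stay aligned
    # and the two source languages are interleaved during training.
    random.Random(seed).shuffle(combined)
    sources = [src for src, _ in combined]
    targets = [tgt for _, tgt in combined]
    return sources, targets


# Toy data for illustration only.
en_yi = [("Good morning", "גוט מאָרגן"), ("Thank you", "אַ דאַנק")]
he_yi = [("בוקר טוב", "גוט מאָרגן"), ("תודה", "אַ דאַנק")]

sources, targets = combine_parallel_corpora(en_yi, he_yi)
```

The resulting `sources` and `targets` lists would then be written out as the two sides of a single training set. If one corpus is much larger than the other, I assume some balancing would be needed on top of this, which the sketch does not attempt.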
Thank you very much!