A tool like Moses brings several interesting things:
- a large vocabulary
- the ability to do some kinds of contextual disambiguation on words or groups of words
- the ability to provide the sub-sentence alignment of its translation
Here is a French sentence from the COOKING domain:
Les pâtisseries orientales nous font voyager en quelques bouchées, au Maghreb et Moyen-Orient.
For the demonstration, here is the ONMT translation obtained with an off-domain COMPUTING model:
Orientales pâtisseries are making us travel in a few bouchées , in the Maghreb and in the Middle East .
Here is the Moses translation obtained with the same kind of off-domain COMPUTING model:
Pastries eastern us travel in a few jammed, the Maghreb and the Middle East.
Here is the alignment provided by Moses, where I kept only the longest word of each group of words on both sides (of course, it's a naive test; it would have been much more effective to properly remove stop words and punctuation):
AL : les pâtisseries => pastries / pâtisseries => pastries
AL : orientales => eastern / orientales => eastern
AL : nous font => us / nous => us
AL : voyager => travel / voyager => travel
AL : en quelques => in a few / quelques => few
AL : bouchées => jammed / bouchées => jammed
AL : , au Maghreb et => , the Maghreb and / Maghreb => Maghreb
AL : Moyen-Orient . => the Middle East . / Moyen-Orient => Middle
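The "keep the longest word" reduction above can be sketched in a few lines of Python. This is only an illustration, not the script I actually ran: the function names `longest_word` and `reduce_alignment` are hypothetical, and the input is assumed to be one "source => target" string per phrase pair, which is not the raw Moses output format.

```python
def longest_word(phrase):
    """Return the longest whitespace-separated token (the first one wins on ties)."""
    return max(phrase.split(), key=len)


def reduce_alignment(groups):
    """Reduce each 'source => target' phrase pair to one word on each side.

    Assumption: each group is a plain string with a single '=>' separator.
    """
    pairs = []
    for group in groups:
        src, tgt = group.split("=>", 1)
        pairs.append((longest_word(src), longest_word(tgt)))
    return pairs


# Phrase pairs copied from the alignment listed above.
groups = [
    "les pâtisseries => pastries",
    "orientales => eastern",
    "nous font => us",
    "voyager => travel",
    "en quelques => in a few",
    "bouchées => jammed",
    ", au Maghreb et => , the Maghreb and",
    "Moyen-Orient . => the Middle East .",
]

for src, tgt in reduce_alignment(groups):
    print(f"{src} => {tgt}")
```

Taking the longest token is a crude stand-in for stop-word and punctuation removal: it happens to work on this sentence, but it picks "Middle" alone for "Moyen-Orient" and would fail whenever the content word is shorter than a function word.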
This gives me the following possible finalizations of the ONMT translation (replacements for words it left untranslated):
Finalize : Orientales => eastern
Finalize : pâtisseries => pastries
Finalize : bouchées => jammed
Finalize : Maghreb => Maghreb
I finally get this NMT/SMT mixed translation:
Eastern pastries are making us travel in a few jammed, in the Maghreb and in the Middle East.
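The replacement step can be sketched as follows. Again this is a minimal illustration under assumptions: `finalize` and `word_map` are hypothetical names, and an ONMT output token is treated as untranslated whenever it matches, case-insensitively, a source word of the reduced Moses alignment. Detokenization is left out.

```python
def finalize(nmt_tokens, word_map):
    """Replace NMT output tokens that are still source words by their SMT translation.

    Assumption: a token is 'untranslated' if it equals a source word of the
    alignment, ignoring case. The initial capital of the token is carried over.
    """
    lookup = {src.lower(): tgt for src, tgt in word_map.items()}
    out = []
    for tok in nmt_tokens:
        repl = lookup.get(tok.lower())
        if repl is None:
            out.append(tok)  # not a known source word: keep the NMT token
        elif tok[0].isupper():
            out.append(repl[0].upper() + repl[1:])
        else:
            out.append(repl)
    return out


# Reduced alignment (longest source word => longest target word), from the listing above.
word_map = {
    "pâtisseries": "pastries",
    "orientales": "eastern",
    "nous": "us",
    "voyager": "travel",
    "quelques": "few",
    "bouchées": "jammed",
    "Maghreb": "Maghreb",
    "Moyen-Orient": "Middle",
}

nmt = ("Orientales pâtisseries are making us travel in a few bouchées , "
       "in the Maghreb and in the Middle East .")
print(" ".join(finalize(nmt.split(), word_map)))
```

On this sentence only four tokens match a source word (Orientales, pâtisseries, bouchées, Maghreb), which are the four finalizations listed above. A word that is legitimately identical in both languages, like Maghreb here, is simply replaced by itself.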
PS: the COMPUTING models were obtained by mixing in-domain sentences with 2M of Europarl.
PS: of course, the final result is still not perfect.