It would be nice to have something like the XML input feature in Moses to specify required translations for parts of the source sentence. That way we could take fuzzy matches from a TM and use NMT to translate only the unmatched part, as in Koehn & Senellart (AMTA 2010).
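For anyone who has not used the Moses feature, here is a minimal sketch of what that markup looks like when built from TM matches. The `mark_up` helper, the `tm` tag name, and the span format are made up for illustration. Moses reads the required translation from the `translation` attribute when run with `-xml-input exclusive`. The escaping here is plain XML escaping, which is an assumption and may not match the Moses preprocessing scripts exactly.

```python
from xml.sax.saxutils import escape, quoteattr


def mark_up(tokens, spans):
    """Wrap TM-matched spans in Moses-style XML markup.

    tokens: list of source tokens.
    spans:  list of (start, end, translation) tuples, end exclusive,
            non-overlapping (hypothetical format, for illustration only).
    """
    out = []
    pos = 0
    for start, end, translation in sorted(spans):
        if start < pos or end > len(tokens) or start >= end:
            raise ValueError("invalid or overlapping span")
        # Unmatched tokens are left for the decoder to translate.
        out.extend(escape(t) for t in tokens[pos:start])
        segment = escape(" ".join(tokens[start:end]))
        # The tag name is arbitrary; the 'translation' attribute carries
        # the required target text.
        out.append(f"<tm translation={quoteattr(translation)}>{segment}</tm>")
        pos = end
    out.extend(escape(t) for t in tokens[pos:])
    return " ".join(out)


if __name__ == "__main__":
    source = "das ist ein kleines haus".split()
    print(mark_up(source, [(2, 5, "a small house")]))
```

Only the words outside the tags are left open, which is the part we would want NMT to handle.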
I’ve been experimenting with sending “pretranslations” through NMT. That seems to work OK, and they generally pass through “unscathed” IF they are untagged. I’ve found that if you tag them, NMT starts to do strange things. I guess we would need to include some tagged sentences in the training material, as others have mentioned.
Hi @vincent, this will soon be possible with the lexical beam search implementation that is coming in 0.9. It is not exactly equivalent to the Moses XML tags, because you cannot force a specific part of the sentence to be translated in a strict way (there is no strict source-target alignment), but it should work pretty well in most cases.
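To make the difference from Moses concrete, here is a toy sketch of one common way to do lexically constrained beam search: hypotheses are grouped by how many required target words they already contain, and a hypothesis may only finish once all of them are present. This is not the OpenNMT 0.9 implementation. `toy_model` is a hypothetical stand-in for a decoder, and constraints are single tokens for simplicity. Note that the constraint is only "this word must appear in the output"; nothing ties it to a source span.

```python
import math

EOS = "</s>"


def toy_model(prefix):
    """Hypothetical stand-in for an NMT decoder: next-token log-probs."""
    table = {
        (): {"the": 0.6, "a": 0.3, "house": 0.05, "home": 0.05},
        ("the",): {"house": 0.7, "home": 0.2, EOS: 0.1},
        ("a",): {"house": 0.6, "home": 0.3, EOS: 0.1},
    }
    probs = table.get(tuple(prefix), {EOS: 0.9, "house": 0.05, "home": 0.05})
    return {tok: math.log(p) for tok, p in probs.items()}


def constrained_beam_search(model, constraints, beam_size=2, max_len=6):
    """Return the best token list containing every token in `constraints`."""
    # A hypothesis is (score, tokens, set of constraints already produced).
    beams = [(0.0, [], frozenset())]
    finished = []
    for _ in range(max_len):
        banks = {}  # number of met constraints -> candidate hypotheses
        for score, tokens, met in beams:
            for tok, logprob in model(tokens).items():
                new_met = met | {tok} if tok in constraints else met
                cand = (score + logprob, tokens + [tok], new_met)
                if tok == EOS:
                    # Only accept an ending once every constraint is met.
                    if len(new_met) == len(constraints):
                        finished.append(cand)
                    continue
                banks.setdefault(len(new_met), []).append(cand)
        # Keep the top hypotheses per bank, so partially constrained
        # hypotheses are not crowded out by higher-scoring free ones.
        beams = []
        for bank in banks.values():
            bank.sort(key=lambda c: c[0], reverse=True)
            beams.extend(bank[:beam_size])
        if not beams:
            break
    if not finished:
        return None
    best = max(finished, key=lambda c: c[0])
    return best[1][:-1]  # drop the end-of-sentence token


if __name__ == "__main__":
    print(constrained_beam_search(toy_model, set()))
    print(constrained_beam_search(toy_model, {"home"}))
```

Without constraints the toy model prefers "house"; with "home" required, the search returns an output containing it, even though it scores lower.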
Thanks for the interesting article. Any idea on when this will be available – what will be the approx. release date of 0.9?
Just done!
Any plans to include pointer networks? If you could somehow mark the relevant spans on the input string, as in Moses, this seems like a way to copy material from the input straight to the output.
I found that workaround of mine only worked for me in v0.7. In v0.9 it drives the model crazy and produces unhelpful output. I’ve had much more success with specializing models via retraining/incremental training.