Incorporating translation dictionary during decoding

raymondhs · January 29, 2018, 9:03am

Hi all, is there any way in OpenNMT to incorporate a translation dictionary during decoding? This is useful e.g. for handling entity names. If an input token is found in this dictionary, its corresponding translation should always be used. I’m aware of the “-phrase_table” option in OpenNMT, but I thought it applies only to the OOV case. In Moses, we can handle this by using XML markup, for reference: http://www.statmt.org/moses/?n=Advanced.Hybrid#ntoc1. Just wondering if OpenNMT has support for similar features.

Thanks!

jean.senellart · January 30, 2018, 12:41pm

Hi @raymondhs, getting the exact equivalent of the Moses XML tags is hard for many reasons, but there are multiple ways to do something equivalent with NMT. I will post a tutorial this week on this topic on the forum. Stay tuned!

tel34 · January 30, 2018, 8:57pm

Hi Raymond, We apply a custom dictionary to the source text before it is piped into the OpenNMT server. An example would be when we want to force, say the Dutch word “verbinding”, to be translated as “compound” rather than “connection”. This seems to work even if the forced translation is OOV in the source & target, and even works if the custom dictionary inserts a made-up verb form where an English sentence needs to be reordered, e.g. ik heb het systeem geboggld -> I have boggled the system. This approach works before rather than during decoding but it does allow us to “force” the translation of specialist terms which are out of domain in terms of our model.
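A minimal sketch of what such a pre-processing step could look like, assuming whole-word, case-insensitive matching on the raw source text. The function name `apply_custom_dictionary` and the matching rules are illustrative assumptions, not the actual middleware described above.

```python
import re

def apply_custom_dictionary(text, dictionary):
    """Replace whole-word source terms with their forced translations.

    Hypothetical helper: matching is case-insensitive and the replacement
    is inserted exactly as written in the dictionary.
    """
    if not dictionary:
        return text
    lowered = {key.lower(): value for key, value in dictionary.items()}
    # Try longer terms first so multi-word entries win over shorter ones.
    terms = sorted(lowered, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda match: lowered[match.group(0).lower()], text)

custom = {"verbinding": "compound"}
print(apply_custom_dictionary("De verbinding is stabiel.", custom))
```

The word boundaries keep inflected forms such as "verbindingen" untouched, so each form that should be forced needs its own entry under these assumptions.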

raymondhs · January 31, 2018, 1:48am

Thanks, look forward to the tutorial!

raymondhs · January 31, 2018, 1:56am

What exactly do you mean by “apply a custom dictionary to the source text”? Does it involve retraining the model?

tel34 · January 31, 2018, 8:56am

No, it doesn’t. We created middleware to do various pre-processing and post-processing on NMT input and output. The process is:
client -> middleware (pre-processing input, inc. custom dictionary) -> ONMT server -> middleware (post-processing output) -> client.
The model is not retrained. I am quite surprised that these OOVs (in relation to the model) are handled correctly (including being re-ordered), but the fact is they are.
Terence
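The flow above could be sketched roughly as follows. This is only an illustration under stated assumptions: `make_middleware`, `apply_dictionary`, and `stub_server` are hypothetical names, and the translation call is a stand-in function rather than a real OpenNMT server request.

```python
from typing import Callable, Dict

def apply_dictionary(text: str, dictionary: Dict[str, str]) -> str:
    # Naive substitution on whitespace-separated tokens (assumption:
    # the input is already tokenized and lowercased).
    return " ".join(dictionary.get(token, token) for token in text.split())

def make_middleware(
    dictionary: Dict[str, str],
    translate: Callable[[str], str],
    postprocess: Callable[[str], str] = str.strip,
) -> Callable[[str], str]:
    """Chain pre-processing, translation, and post-processing."""
    def handle(text: str) -> str:
        prepared = apply_dictionary(text, dictionary)  # pre-processing
        raw_output = translate(prepared)               # call to the NMT server
        return postprocess(raw_output)                 # post-processing
    return handle

def stub_server(text: str) -> str:
    # Stand-in for the translation server: echoes the input with padding
    # so that the post-processing step has something to clean up.
    return " " + text + " "

handle = make_middleware({"verbinding": "compound"}, stub_server)
print(handle("de verbinding is stabiel"))
```

Passing the translation step in as a function keeps the sketch independent of any particular server API; a real deployment would replace `stub_server` with a call to the running model.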

raymondhs · January 31, 2018, 10:32am

Hi Terence, thanks for the explanation. If I understand correctly, the input to the OpenNMT server may then contain target-language tokens, because the custom dictionary has already been applied? It is interesting that the model still works well after this preprocessing, since it was not trained this way.

tel34 · January 31, 2018, 12:03pm

Yes, that is the case. The predictions mostly contain well-formed English sentences. I have not tried this with other language pairs. My aim was purely to use OpenNMT in a workflow where the correctness of the translations of technical terms was paramount.

raymondhs · February 12, 2018, 6:56am

Just wondering if the tutorial on this will come out soon?

jean.senellart · February 12, 2018, 8:29pm

Yes, sorry, it is still on my to-do list. I have a first draft, but I am a bit behind on everything with the workshop preparation.

oraveczcsaba · September 14, 2018, 8:05am

Hi,
Is there any news about this tutorial? I just came across this issue and checked, but I can’t seem to find it under the ‘Tutorials’ category.

tel34 · April 15, 2019, 10:36am

Just revisiting this topic. I now find that this approach of forcing translations with a custom dictionary applied to the input works ok with my TensorFlow transformer model. Perhaps this is because I am now using a combined vocabulary with SentencePiece. So, if I want “installatie” to be translated as “system” and not as “installation” I put installatie=system in the custom dictionary and the model has no problem with that.
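For illustration, entries written as `installatie=system` could be loaded along these lines. Only the `source=target` form comes from the post; the one-entry-per-line layout, the `#` comment lines, and the name `parse_custom_dictionary` are assumptions.

```python
def parse_custom_dictionary(lines):
    """Parse 'source=target' entries into a dict (hypothetical format)."""
    entries = {}
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and comment lines (assumed convention).
        if not line or line.startswith("#"):
            continue
        source, separator, target = line.partition("=")
        source, target = source.strip(), target.strip()
        if not separator or not source or not target:
            raise ValueError(f"malformed dictionary entry: {raw!r}")
        entries[source] = target
    return entries

sample = ["# forced translations", "installatie=system", ""]
print(parse_custom_dictionary(sample))
```

Rejecting malformed lines early makes a typo in the dictionary visible before it silently changes what is sent to the model.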

ajitesh3 · August 28, 2019, 7:09am

Hi @tel34 @guillaumekln @vince62s
Could you tell me how to implement this custom dictionary approach in OpenNMT-py, and why I should use a combined vocabulary with SentencePiece?

tel34 · August 29, 2019, 8:57am

Hi @ajitesh3. I’m afraid I went from the Lua-based OpenNMT to OpenNMT-tf and have not worked with OpenNMT-py. There are many reasons for combining vocabularies and you will need to do some research here. Apart from anything else, you get a smaller vocabulary footprint, as many of the subwords from source & target will correspond.

ajitesh3 · August 29, 2019, 9:48am

@tel34 Thanks for your response.
My language pair does not share a vocabulary, so in my case I have separate vocabularies.
Could you help me with handling named entities in English-to-Hindi translation?

ajitesh3 · August 29, 2019, 12:49pm

@jean.senellart still waiting for the tutorial for OpenNMT-py