Incorporating translation dictionary during decoding

(Raymond) #1

Hi all, is there any way in OpenNMT to incorporate a translation dictionary during decoding? This is useful, e.g., for handling entity names: if an input token is found in the dictionary, its corresponding translation will definitely be used. I’m aware of the “-phrase_table” option in OpenNMT, but I thought it applies only in the OOV case. In Moses, we can handle this by using XML markup, for reference: Just wondering whether OpenNMT supports similar features.


(jean.senellart) #2

Hi @raymondhs, getting the exact equivalent of Moses XML tags is hard for many reasons, but there are multiple ways to do something equivalent with NMT. I will post a tutorial on this topic on the forum this week. Stay tuned!

(Terence Lewis) #3

Hi Raymond, we apply a custom dictionary to the source text before it is piped into the OpenNMT server. An example would be when we want to force, say, the Dutch word “verbinding” to be translated as “compound” rather than “connection”. This seems to work even if the forced translation is OOV on both the source and target sides, and it even works if the custom dictionary inserts a made-up verb form where the English sentence needs to be reordered, e.g. “ik heb het systeem geboggld” -> “I have boggled the system”. This approach works before rather than during decoding, but it does allow us to “force” the translation of specialist terms which are out of domain with respect to our model.

(Raymond) #4

Thanks, look forward to the tutorial! :slight_smile:

(Raymond) #5

What exactly do you mean by “apply a custom dictionary to the source text”? Does it involve retraining the model?

(Terence Lewis) #6

No, it doesn’t. We created middleware to do various pre-processing and post-processing on NMT input and output. The process is:
client -> middleware (pre-processing input, incl. custom dictionary) -> ONMT server -> middleware (post-processing output) -> client.
The model is not retrained. I am quite surprised that these OOVs (relative to the model) are handled correctly (including being re-ordered), but the fact is they are.
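The pre-processing step in that pipeline can be sketched as a simple token substitution. This is a minimal illustration only; the function name and the sample dictionary are hypothetical and not part of OpenNMT or Terence’s actual middleware:

```python
def apply_custom_dictionary(source, dictionary):
    """Replace source-language tokens that appear in the custom dictionary
    with their forced target-language translations, leaving other tokens
    untouched, before the sentence is sent to the NMT server."""
    tokens = source.split()
    return " ".join(dictionary.get(tok, tok) for tok in tokens)

# Forcing "verbinding" -> "compound", as in the example above.
custom_dict = {"verbinding": "compound"}
print(apply_custom_dictionary("de verbinding is stabiel", custom_dict))
```

A real middleware would also need to handle multi-word terms, casing, and tokenization consistent with what the model was trained on, but the principle is the same: the forced target word reaches the decoder as part of the source sentence.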

(Raymond) #7

Hi Terence, thanks for the explanation. If I understand correctly, the input to the OpenNMT server may then contain tokens from the target-side language, because the custom dictionary was applied beforehand? It is interesting that the model still works well after this preprocessing, since it was not trained on input of this kind.

(Terence Lewis) #8

Yes, that is the case. The predictions mostly contain well-formed English sentences. I have not tried this with other language pairs. My aim was purely to use OpenNMT in a workflow where the correctness of the translations of technical terms was paramount.

(Raymond) #9

Just wondering if the tutorial on this will come out soon? :slight_smile:

(jean.senellart) #10

Yes… sorry, it’s still on my to-do list. I have a first draft, but I’m a bit behind on everything with the workshop preparation.