OpenNMT Forum

Word Suggestions and Lexical Constraints

Hello!

If I translate the following sentence in DeepL from French to English…

La crise liée à la COVID-19 a creusé les inégalités préexistantes.

… I get the following translation:

The VIDOC-19 crisis has deepened pre-existing inequalities.

1- If I click the proposed word “VIDOC-19”, I can get other suggestions like “COVID-19”.
2- If I change the word “pre-existing” to say “already”, it changes the next part of the translation accordingly.

I understand that “1” can be done by word alignment and “2” can be done by something like lexical constraints. My question: is it possible to apply 1 and 2 with OpenNMT (either py or tf) without changing the code?

Thanks and regards,
Yasmin


Hi,

If you can use CTranslate2 (i.e. you trained Transformer models), both features are already implemented:

  1. Alternatives at position
  2. Autocompletion
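For reference, a minimal sketch of how both features might be called through the CTranslate2 Python API. The model directory and tokens are placeholders (a real model would use its own subword vocabulary); `target_prefix`, `return_alternatives`, and `num_hypotheses` are the relevant `translate_batch` options:

```python
import ctranslate2

# Placeholder path to a converted Transformer model.
translator = ctranslate2.Translator("model_dir/")

# Pre-tokenized source sentence (illustrative tokens only).
source = [["▁La", "▁crise", "▁COVID-19"]]

# 2. Autocompletion: force a user-edited target prefix
#    and let the model continue the translation from there.
results = translator.translate_batch(
    source,
    target_prefix=[["▁The", "▁crisis"]],
)

# 1. Alternatives at position: expand the most likely tokens
#    right after the prefix and decode each one independently.
results = translator.translate_batch(
    source,
    target_prefix=[["▁The"]],
    return_alternatives=True,
    num_hypotheses=5,
)
for hyp in results[0].hypotheses:
    print(hyp)
```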

Many thanks, Guillaume! This is very interesting. Have you written papers about these features?

GPU Memory Usage

  • OpenNMT-py onmt_translate: 5115MiB
  • CTranslate2: 831MiB

Default CTranslate2 settings, except the beam size was set to match onmt_translate.

Impressive! Thanks!

Yasmin

No.

The approach is fairly basic though:

For autocompletion, the target prefix is fed into the decoder in teacher-forcing mode, and we simply continue decoding from there. It’s the same idea as constrained generation with GPT, for example.
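The control flow can be illustrated with a toy sketch. The "model" here is a made-up bigram table standing in for a real decoder; only the prefix-forcing logic mirrors the description above:

```python
# Toy sketch of prefix-constrained decoding (autocompletion).
# BIGRAMS is a stand-in for a trained decoder.
BIGRAMS = {
    "<s>": "the", "the": "crisis", "crisis": "deepened",
    "deepened": "inequalities", "inequalities": "</s>",
}

def next_token(tokens):
    # Stand-in for the decoder's argmax over the vocabulary.
    return BIGRAMS.get(tokens[-1], "</s>")

def complete(prefix, max_len=10):
    # 1) Feed the user's prefix in teacher-forcing mode: the decoder
    #    consumes these tokens as-is instead of its own predictions.
    tokens = ["<s>"] + list(prefix)
    # 2) Continue normal greedy decoding from that forced state.
    while len(tokens) < max_len:
        tok = next_token(tokens)
        if tok == "</s>":
            break
        tokens.append(tok)
    return tokens[1:]

print(complete(["the", "crisis"]))
# → ['the', 'crisis', 'deepened', 'inequalities']
```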

For the alternatives, we also feed the prefix in teacher-forcing mode, expand the N most likely next tokens, and then continue decoding each of these N hypotheses independently. This approach could be improved to give more or better alternatives.
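A toy sketch of this expansion step, with a made-up scoring table in place of the real decoder:

```python
# Toy sketch of "alternatives at position": force the prefix, expand the
# N most likely next tokens, then finish each hypothesis independently.
NEXT = {  # context word -> candidate next tokens with toy scores
    "the": [("COVID-19", 0.6), ("VIDOC-19", 0.3), ("coronavirus", 0.1)],
    "COVID-19": [("crisis", 1.0)],
    "VIDOC-19": [("crisis", 1.0)],
    "coronavirus": [("crisis", 1.0)],
    "crisis": [("</s>", 1.0)],
}

def greedy_continue(tokens):
    # Continue decoding one hypothesis greedily until end-of-sentence.
    while True:
        tok = NEXT[tokens[-1]][0][0]  # best continuation
        if tok == "</s>":
            return tokens
        tokens.append(tok)

def alternatives(prefix, n=3):
    # Expand the top-n next tokens after the prefix,
    # then decode each expanded hypothesis independently.
    return [greedy_continue(prefix + [tok]) for tok, _ in NEXT[prefix[-1]][:n]]

for hyp in alternatives(["the"]):
    print(hyp)
```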


Thanks, Guillaume!

One option for improvement might be Diverse Beam Search.
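The core idea of Diverse Beam Search (Vijayakumar et al.) is to split the beams into groups and penalize each group for tokens already chosen by earlier groups at the same step. A toy illustration of that penalty, with made-up scores:

```python
from collections import Counter

# Toy token scores at one decoding step.
scores = {"COVID-19": 0.5, "VIDOC-19": 0.45, "coronavirus": 0.4}

def pick(scores, chosen, strength=0.2):
    # Penalize each token proportionally to how often
    # earlier groups already picked it at this step.
    counts = Counter(chosen)
    adjusted = {t: s - strength * counts[t] for t, s in scores.items()}
    return max(adjusted, key=adjusted.get)

chosen = []
for _ in range(3):  # three beam groups decoding the same step
    chosen.append(pick(scores, chosen))
print(chosen)
# → ['COVID-19', 'VIDOC-19', 'coronavirus']
```

Without the penalty, all three groups would pick "COVID-19"; with it, each group is pushed toward a different alternative.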

Kind regards,
Yasmin
