CTranslate2's target prefix

Hello,

My understanding is that this feature requires all the tokens from the beginning of the output up to the token we wish to change. What if we want to change a token at or near the end of the output? In that case we are effectively providing the correct output and just asking the engine to reproduce it. Is there another mechanism or strategy for updating one or more tokens at any position?

Thanks!

Hi,

What do you have in mind?

The target prefix is a simple and effective way to specify which tokens should be force-decoded and where unconstrained decoding should start.

Also, I don’t see how you can change multiple positions in one request, because the first unconstrained token may completely change the rest of the translation.
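
To make the point above concrete, here is a toy sketch in plain Python. It is not CTranslate2 code: the NEXT_TOKEN table and the decode function are made up for illustration, and the "model" only looks at the previous token, whereas a real NMT model conditions on the source and the full history. It only shows the mechanics: the prefix is copied as-is, greedy decoding continues from its last token, and a different prefix can lead to a completely different continuation.

```python
from typing import Dict, List

# Hypothetical toy "model": the next token depends only on the previous one.
NEXT_TOKEN: Dict[str, str] = {
    "<s>": "the",
    "the": "cat",
    "cat": "sleeps",
    "sleeps": "</s>",
    "a": "dog",
    "dog": "barks",
    "barks": "</s>",
}


def decode(prefix: List[str], max_len: int = 10) -> List[str]:
    """Force-decode `prefix`, then continue greedily without constraints."""
    output = list(prefix)  # forced tokens are copied unchanged
    prev = output[-1] if output else "<s>"
    while len(output) < max_len:
        token = NEXT_TOKEN.get(prev, "</s>")
        if token == "</s>":
            break
        output.append(token)
        prev = token
    return output


print(decode([]))              # unconstrained decoding
print(decode(["the", "cat"]))  # the prefix agrees with the unconstrained output
print(decode(["a"]))           # one forced token changes everything after it
```

With this toy table, forcing "a" as the first token replaces the whole continuation, which is why fixing several positions in a single pass is not well defined: the later positions may no longer exist once the first one changes.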

Hi

On-the-fly translation improvement using the same model :).

Yes, of course, this makes perfect sense. I’ve just started looking into this feature, so I’m still digesting the concept.

Basically, the idea is to implement some translation memory features to continuously improve the output.

I’m pretty sure DeepL is using the exact same approach to provide their autocompletion and alternative words features in the translation box.
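
A minimal sketch of how such a correction loop could look, assuming the user edits one token of the current hypothesis. The helper names prefix_from_edit and retranslate are hypothetical, and StubTranslator is only a stand-in so the example runs without a model. With a real ctranslate2.Translator, the call would be translate_batch with the target_prefix argument, and as far as I know the returned hypothesis includes the prefix tokens.

```python
from types import SimpleNamespace
from typing import List, Sequence


def prefix_from_edit(hypothesis: Sequence[str], position: int, replacement: str) -> List[str]:
    """Keep the tokens before `position` and append the user's replacement token."""
    if not 0 <= position < len(hypothesis):
        raise IndexError("position is outside the hypothesis")
    return list(hypothesis[:position]) + [replacement]


def retranslate(translator, source_tokens: Sequence[str], prefix: Sequence[str]) -> List[str]:
    """Ask the translator to force-decode `prefix` and complete the rest."""
    results = translator.translate_batch(
        [list(source_tokens)],
        target_prefix=[list(prefix)],
    )
    return results[0].hypotheses[0]


class StubTranslator:
    """Stand-in for a real translator: echoes the prefix plus a placeholder token."""

    def translate_batch(self, source, target_prefix=None):
        return [SimpleNamespace(hypotheses=[list(p) + ["<rest>"]]) for p in target_prefix]


source = ["Le", "chat", "dormait", "sur", "le", "tapis"]
hypothesis = ["The", "cat", "sat", "on", "the", "mat"]

# The user replaces the third token; everything after it is decoded again.
prefix = prefix_from_edit(hypothesis, 2, "slept")
result = retranslate(StubTranslator(), source, prefix)
print(prefix)
print(result)
```

Note that editing the last token produces a prefix as long as the whole hypothesis, which is the case raised in the first post: the engine then has almost nothing left to decode.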

Hi @guillaumekln
I would like to see how the target_prefix autocompletion is implemented in the latest version of OpenNMT. Can you point me to it?

You can search for “prefix” in the decoding code:

@guillaumekln Isn’t this feature available in OpenNMT-py itself?

For OpenNMT-py see this PR:

ok
thanks @guillaumekln

Here is an implementation from MS India. I believe it targets OpenNMT-py 1.x, but I hope it gives you an idea.

Kind regards,
Yasmin

thanks @ymoslem
Looking into this