I have a use case where I would need a “return_attention” option on the “score_batch” function. I’m pretty sure that in the backend “score_batch” calls “batch_translation”? If so, how hard would it be to add an option to return the attentions?
My use case is this:
I will train a model on just my data.
Translate my text with a good translator such as DeepL (which doesn’t offer custom training).
Use score_batch to check whether there are any tokens that my model is confident (based on the prediction score) are “wrong”.
If the attention were available, I could grab the corresponding source tokens, translate them with my model, and replace the token/word in the original DeepL translation.
DeepL is just an example, but I guess you get the concept.
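To make step 3 concrete, here is a minimal sketch of the filtering I have in mind. It assumes that score_batch reports a log-probability per target token (as CTranslate2’s ScoringResult does via its log_probs and tokens attributes); the helper name find_suspect_tokens and the threshold value are hypothetical and just for illustration:

```python
import math

def find_suspect_tokens(tokens, log_probs, threshold=-2.0):
    """Return (index, token, probability) triples whose log-probability
    under the custom model falls below `threshold`, i.e. tokens the model
    considers unlikely in the external (e.g. DeepL) translation."""
    suspects = []
    for i, (tok, lp) in enumerate(zip(tokens, log_probs)):
        if lp < threshold:
            suspects.append((i, tok, math.exp(lp)))
    return suspects

# Toy per-token log-probabilities, as a scorer would report them.
tokens = ["Das", "Haus", "ist", "blau"]
log_probs = [-0.1, -0.3, -3.5, -0.2]
print(find_suspect_tokens(tokens, log_probs))  # flags "ist" only
```

The attention row for each flagged token is what I would then use to find the source words to re-translate.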
Thank you Guillaume, I’m aware that attentions are not intended for alignment and can be completely off. I’m actually experimenting with them to find the most reliable way to turn attentions into alignments. I believe taking the highest value alone is not enough, so I’m trying to combine the horizontal and vertical views at the same time.
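One way to combine both directions, in the spirit of what I described (not necessarily my final method), is the intersection heuristic from statistical alignment: keep a link only if it wins its row (target→source) and its column (source→target) at the same time. A small numpy sketch, assuming an attention matrix indexed as attn[target, source]:

```python
import numpy as np

def intersect_alignment(attn):
    """Keep only links that are the argmax both horizontally (per target
    row) and vertically (per source column) of the attention matrix."""
    t2s = attn.argmax(axis=1)  # best source index for each target token
    s2t = attn.argmax(axis=0)  # best target index for each source token
    return [(t, int(s)) for t, s in enumerate(t2s) if s2t[s] == t]

attn = np.array([
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],   # row argmax is source 0, but column 0 prefers target 0
    [0.1, 0.2, 0.7],
])
print(intersect_alignment(attn))  # [(0, 0), (2, 2)]
```

Taking only the row-wise maximum would also link target token 1 to source 0; requiring agreement in both directions drops that unreliable link, at the cost of leaving some tokens unaligned.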