I have a use case where I would need a “return_attention” option on the “score_batch” function. I’m pretty sure that in the backend “score_batch” calls “batch_translation”? If so, how hard would it be to add an option to return the attentions?
My use case is this:
I will train a model on just my data.
Translate my text with a good translator such as DeepL (which doesn’t offer custom training).
Use score_batch to check whether there are any tokens that my model is confident (based on the prediction score) are “wrong”.
If the attention were available, I could grab the corresponding source tokens, translate them with my model, and replace the token/word in the original DeepL translation.
DeepL is just an example, but I guess you get the concept.
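To make step 3 concrete, here is a minimal sketch of the filtering I have in mind. It assumes that score_batch reports a log-probability per target token (as CTranslate2’s ScoringResult does via its log_probs and tokens attributes); the helper name find_suspect_tokens and the threshold value are hypothetical and just for illustration:

```python
import math

def find_suspect_tokens(tokens, log_probs, threshold=-2.0):
    """Return (index, token, probability) triples whose log-probability
    under the custom model falls below `threshold`, i.e. tokens the model
    considers unlikely in the external (e.g. DeepL) translation."""
    suspects = []
    for i, (tok, lp) in enumerate(zip(tokens, log_probs)):
        if lp < threshold:
            suspects.append((i, tok, math.exp(lp)))
    return suspects

# Toy per-token log-probabilities, as a scorer would report them.
tokens = ["Das", "Haus", "ist", "blau"]
log_probs = [-0.1, -0.3, -3.5, -0.2]
print(find_suspect_tokens(tokens, log_probs))  # flags "ist" only
```

The attention row for each flagged token is what I would then use to find the source words to re-translate.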
Thank you Guillaume, I’m aware that attentions are not intended for alignment and can be completely off. I’m actually experimenting with them to find the most reliable way to turn attentions into alignments. I believe taking the highest value alone is not enough, so I’m trying to combine the horizontal and vertical views at the same time.
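One way to combine both directions, in the spirit of what I described (not necessarily my final method), is the intersection heuristic from statistical alignment: keep a link only if it wins its row (target→source) and its column (source→target) at the same time. A small numpy sketch, assuming an attention matrix indexed as attn[target, source]:

```python
import numpy as np

def intersect_alignment(attn):
    """Keep only links that are the argmax both horizontally (per target
    row) and vertically (per source column) of the attention matrix."""
    t2s = attn.argmax(axis=1)  # best source index for each target token
    s2t = attn.argmax(axis=0)  # best target index for each source token
    return [(t, int(s)) for t, s in enumerate(t2s) if s2t[s] == t]

attn = np.array([
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],   # row argmax is source 0, but column 0 prefers target 0
    [0.1, 0.2, 0.7],
])
print(intersect_alignment(attn))  # [(0, 0), (2, 2)]
```

Taking only the row-wise maximum would also link target token 1 to source 0; requiring agreement in both directions drops that unreliable link, at the cost of leaving some tokens unaligned.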