Extracting and visualizing the decoder attention weights

zoulikha · July 14, 2017, 10:42am

Is there any way to extract and visualize the attention weights for a given parallel sentence pair in the seq2seq learning framework? As shown in the figure below, each pixel (a gray value between 0 and 1) represents the weight (i, j) between the source token at index i and the target token at index j.

Figure from Bahdanau et al. (2015), “Neural machine translation by jointly learning to align and translate”

jean.senellart · July 14, 2017, 3:02pm

If you are using translate.lua, the attention for each sentence is in the variable results[b].preds[n].attention (b is the index in the batch and n is the index in the n-best list). It is a T x S tensor, where S is the source sentence length and T is the length of this specific translation.
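Once the T x S weights are exported from Lua, a rough way to inspect them is a text heat map, with one row per target token and one column per source token. The Python sketch below is only an illustration: the helper names (shade, render_attention, hard_alignment), the tokens, and the weights are all made up, and it assumes the matrix has already been converted to a plain list of rows.

```python
# Hypothetical helpers: render a T x S attention matrix as a text heat map.
# Rows are target tokens, columns are source tokens (the T x S layout above).
SHADES = " .:-=+*#%@"  # from weight 0.0 (blank) to weight 1.0 (darkest)


def shade(weight):
    """Map a weight in [0, 1] to one shading character."""
    clipped = min(max(weight, 0.0), 1.0)
    return SHADES[min(int(clipped * len(SHADES)), len(SHADES) - 1)]


def render_attention(attn, source_tokens, target_tokens):
    """Return one text line per target token, one character per source token."""
    if len(attn) != len(target_tokens):
        raise ValueError("expected one row per target token")
    width = max(len(tok) for tok in target_tokens)
    lines = []
    for tok, row in zip(target_tokens, attn):
        if len(row) != len(source_tokens):
            raise ValueError("expected one column per source token")
        cells = "".join(shade(w) for w in row)
        lines.append(tok.rjust(width) + " |" + cells + "|")
    return "\n".join(lines)


def hard_alignment(attn):
    """For each target position, return the source index with the largest weight."""
    return [max(range(len(row)), key=row.__getitem__) for row in attn]


# Made-up weights for illustration only (T = 3 target tokens, S = 3 source tokens).
source = ["le", "chat", "dort"]
target = ["the", "cat", "sleeps"]
attn = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.05, 0.15, 0.80],
]
print(render_attention(attn, source, target))
print(hard_alignment(attn))
```

The same matrix could be passed to a plotting library to get a grayscale image like the one in the paper; the text version just avoids any dependency.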

lucien0410 · January 30, 2018, 9:40pm

I think this is what you’re looking for:
NMT Attention Alignment Visualizations

You can run translate.lua with the -save_attention option to save the attention weights to a file.

The visualization program above takes the attention file as input and prints a visualization of the word alignments with their weights.

anderleich · October 3, 2018, 7:46am

Hi,
I was wondering what the values obtained by saving the attention mean. I’ve used the -save_attention parameter to save the values to a file. What I obtained is something like this:

1 ||| source sentence ||| score ||| target sentence tokenized ||| number number
Matrix like numbers

What does the score mean?
What do the last two numbers mean?
…

Thank you in advance

guillaumekln · October 3, 2018, 5:06pm

Hi,

score: the cumulative log-likelihood of the sentence.
two numbers: the source length and the target length

github.com/OpenNMT/OpenNMT/blob/master/translate.lua#L283-L285

attFile:write(string.format('%d ||| %s ||| %f ||| %s ||| %d %d\n',
                            sentId, sentence, score, source,
                            sourceLength, targetLength))
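
For anyone post-processing the saved file, here is a rough Python sketch that splits each header line into the fields written by the string.format call above. The dictionary keys simply mirror the Lua argument names. The layout of the matrix rows is an assumption on my side (whitespace-separated floats following each header, blank lines ignored), and the function names and the sample data are made up for illustration.

```python
def parse_header(line):
    """Split one header line into the five fields separated by '|||'."""
    fields = [f.strip() for f in line.rstrip("\n").split("|||")]
    if len(fields) != 5:
        raise ValueError("expected 5 fields separated by '|||'")
    sent_id, sentence, score, source, lengths = fields
    source_length, target_length = (int(x) for x in lengths.split())
    return {
        "sent_id": int(sent_id),
        "sentence": sentence,
        "score": float(score),  # cumulative log-likelihood
        "source": source,
        "source_length": source_length,
        "target_length": target_length,
    }


def parse_attention_blocks(lines):
    """Yield (header, matrix) pairs from the lines of an attention file.

    Assumption: every header is followed by rows of whitespace-separated
    floats; blank lines are skipped and the next header starts a new block.
    """
    header, matrix = None, []
    for line in lines:
        if "|||" in line:
            if header is not None:
                yield header, matrix
            header, matrix = parse_header(line), []
        elif line.strip() and header is not None:
            matrix.append([float(x) for x in line.split()])
    if header is not None:
        yield header, matrix


# Made-up sample in the header format shown above.
sample = """1 ||| the cat ||| -0.532100 ||| le chat ||| 2 2
0.9 0.1
0.2 0.8
"""
for header, matrix in parse_attention_blocks(sample.splitlines()):
    print(header["sent_id"], header["score"],
          header["source_length"], header["target_length"])
    print(matrix)
```

If your file uses a different row layout, only parse_attention_blocks needs to change; the header parsing follows the format string directly.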