Extracting and visualizing the decoder attention weights


(zoulikha) #1

Is there any way to extract and visualize the attention weights for a given parallel sentence pair in the seq2seq learning framework? As shown in the figure below, each pixel (a gray density value between 0 and 1) represents the weight (i, j) between the source token at index i and the target token at index j.

Bahdanau et al. 2015 “Neural machine translation by jointly learning to align and translate”


(jean.senellart) #2

If you are using translate.lua, the attention vector for each sentence is in the variable results[b].preds[n].attention (b being the index in the batch, and n the index in the n-best list). It is a T x S tensor, where S is the source sentence length and T is the length of this specific translation.
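To make the T x S layout concrete, here is a minimal sketch in Python rather than Lua. It assumes the tensor has already been exported as a plain list of rows (one row per target token, one column per source token); the sentences and weights are made up for illustration, and render, shade and hard_alignment are hypothetical helper names, not part of OpenNMT.

```python
# Toy T x S attention matrix: one row per target token, one column per
# source token. The values are invented, not real translate.lua output.
source = ["the", "cat", "sleeps"]
target = ["le", "chat", "dort"]
attention = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.05, 0.15, 0.80],
]

SHADES = " .:-=+*#%@"  # from light (low weight) to dark (high weight)


def shade(weight):
    """Map a weight in [0, 1] to one character of SHADES."""
    index = min(int(weight * len(SHADES)), len(SHADES) - 1)
    return SHADES[index]


def render(attention, target):
    """Return a text heat map with one line per target token."""
    width = max(len(tok) for tok in target)
    lines = []
    for tok, row in zip(target, attention):
        cells = "".join(shade(w) for w in row)
        lines.append(tok.rjust(width) + " |" + cells + "|")
    return "\n".join(lines)


def hard_alignment(attention):
    """For each target position, pick the source position with the largest weight."""
    return [max(range(len(row)), key=row.__getitem__) for row in attention]


print(render(attention, target))
for t, s in enumerate(hard_alignment(attention)):
    print(target[t], "->", source[s])
```

Each row is a distribution over source positions, so taking the largest entry per row gives a rough word alignment. A real plot would normally use a plotting library; the character shading here only keeps the sketch dependency-free.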


(Yuan-Lu Chen) #3

I think this is what you’re looking for:
NMT Attention Alignment Visualizations

You can run translate.lua with the -save_attention parameter to save the attention weights to a file.

The visualization program above takes that attention file as input and prints the word alignments together with their weights.


(Anderleich) #4

Hi,
I was wondering what the values obtained by saving the attention mean. I’ve used the -save_attention parameter to save the values to a file. What I obtained is something like this:

1 ||| source sentence ||| score ||| target sentence tokenized ||| number number
Matrix like numbers

What does the score mean?
What do the last two numbers mean?

Thank you in advance


(Guillaume Klein) #5

Hi,

  • score: the cumulative log-likelihood of the sentence.
  • two numbers: the source length and the target length.
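
For anyone post-processing the saved file, the header described in this thread can be split on "|||". The following is a rough Python sketch, not an official parser: it assumes the five fields appear in the order quoted in post #4, that the matrix rows are whitespace-separated numbers with one row per line, and the sample block is invented. parse_header and parse_matrix are hypothetical names.

```python
def parse_header(line):
    """Split a header of the form:
    id ||| source ||| score ||| target ||| src_len tgt_len
    """
    fields = [f.strip() for f in line.split("|||")]
    if len(fields) != 5:
        raise ValueError("expected 5 '|||'-separated fields, got %d" % len(fields))
    sent_id, source, score, target, lengths = fields
    src_len, tgt_len = (int(x) for x in lengths.split())
    return {
        "id": int(sent_id),
        "source": source.split(),   # assumes tokens are space-separated
        "score": float(score),      # cumulative log-likelihood
        "target": target.split(),
        "src_len": src_len,
        "tgt_len": tgt_len,
    }


def parse_matrix(lines):
    """Parse rows of whitespace-separated floats (row layout is an assumption)."""
    return [[float(x) for x in line.split()] for line in lines if line.strip()]


# Made-up example block, not real translate.lua output.
block = """1 ||| the cat sleeps ||| -1.25 ||| le chat dort ||| 3 3
0.90 0.05 0.05
0.10 0.80 0.10
0.05 0.15 0.80"""

header_line, *matrix_lines = block.splitlines()
header = parse_header(header_line)
matrix = parse_matrix(matrix_lines)
print(header["score"], header["src_len"], header["tgt_len"], len(matrix))
```

The sketch does not check the two lengths against the token counts or the matrix shape, since the saved lengths may count extra symbols such as an end-of-sentence token.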