Is there any way to extract and visualize the attention weights for a given parallel sentence in the seq2seq learning framework? As shown in the figure below, each pixel (a gray value between 0 and 1) represents the attention weight (i, j) between the source token with index i and the target token with index j.
Bahdanau et al. 2015 “Neural machine translation by jointly learning to align and translate”
If you are using translate.lua, the attention vector for each sentence is in the variable results[b].preds[n].attention (b being the index in the batch, and n the index in the n-best list). It is a T x S tensor, where S is the source sentence length and T is the length of this specific translation.
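In other words, the per-sentence attention is a T x S matrix whose row t holds the (softmax-normalized) weights over source positions for target step t. A minimal sketch of how such a matrix can be interpreted, e.g. to extract a hard word alignment; the matrix here is hard-coded, standing in for what results[b].preds[n].attention would hold:

```python
# Sketch: interpreting a T x S attention matrix (rows = target steps,
# columns = source positions). In translate.lua this tensor would come
# from results[b].preds[n].attention; here it is hard-coded for illustration.

def hard_alignment(attention):
    """For each target step, return the source index with the largest weight."""
    return [max(range(len(row)), key=lambda j: row[j]) for row in attention]

# 3 target tokens attending over 4 source tokens; each row sums to 1.
attention = [
    [0.70, 0.10, 0.10, 0.10],
    [0.05, 0.80, 0.10, 0.05],
    [0.10, 0.10, 0.20, 0.60],
]

print(hard_alignment(attention))  # → [0, 1, 3]
```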
You can also run translate.lua with the -save_attention parameter to save the attention weights to a file.
The visualization program above takes the attention file as input and prints a visualization of the word alignments with their weights.
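For a quick look without any plotting library, an attention matrix can also be rendered as a text "heatmap", one row per target token and one character column per source position, with darker characters for weights near 1. This is just a stand-in sketch, not the code of the visualization program mentioned above:

```python
# Sketch: ASCII "heatmap" of a T x S attention matrix.
# Ten shade characters map weight bands [0.0, 0.1), [0.1, 0.2), ... to
# increasingly dense glyphs.

SHADES = " .:-=+*#%@"  # light (near 0.0) to dark (near 1.0)

def render(attention, tgt_tokens):
    """One line per target token; each column is one source position."""
    return "\n".join(
        f"{tgt:>12} |{''.join(SHADES[min(int(w * 10), 9)] for w in row)}|"
        for tgt, row in zip(tgt_tokens, attention)
    )

# Toy 3 x 3 example with a mostly diagonal alignment.
attention = [
    [0.9, 0.05, 0.05],
    [0.1, 0.80, 0.10],
    [0.0, 0.10, 0.90],
]
print(render(attention, ["le", "chat", "dort"]))
```

With matplotlib available, the same matrix can be shown as a proper grayscale image via plt.imshow(attention, cmap="gray_r"), which reproduces the pixel plot from the figure in the original question.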
Hi,
I was wondering what the values obtained by saving the attention mean. I've used the -save_attention parameter to save the values to a file, and what I obtained looks like this:
1 ||| source sentence ||| score ||| target sentence tokenized ||| number number
A matrix of numbers
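Assuming the record really has the layout quoted above (a `|||`-separated header line followed by one row of space-separated weights per target token), a minimal parser sketch looks like this. The field names are guesses based on the header, and the two trailing numbers, whose meaning is the question here, are kept unnamed:

```python
def parse_attention_record(text):
    """Parse one record of a -save_attention dump, assuming the layout:
    'id ||| source ||| score ||| target ||| number number' on the first
    line, then the attention matrix, one row per target token.
    Field names are guesses; the trailing numbers are left unexplained."""
    lines = text.strip().splitlines()
    fields = [f.strip() for f in lines[0].split("|||")]
    sent_id, source, score, target, trailing = fields
    matrix = [[float(w) for w in row.split()] for row in lines[1:]]
    return {
        "id": int(sent_id),
        "source": source.split(),
        "score": float(score),
        "target": target.split(),
        "trailing": trailing.split(),  # the two numbers asked about below
        "attention": matrix,
    }

record = parse_attention_record(
    "1 ||| a b c ||| -2.5 ||| x y ||| 3 2\n"
    "0.6 0.3 0.1\n"
    "0.1 0.2 0.7\n"
)
print(record["score"], len(record["attention"]))  # → -2.5 2
```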
What does the score mean?
What do the last two numbers mean?
…