Extracting and visualizing the decoder attention weights

(zoulikha) #1

Is there any way to extract and visualize the attention weights for a given parallel sentence in the seq2seq learning framework? As shown in the figure below, each pixel (a gray density value between 0 and 1) represents the weight (i, j) between the source token at index i and the target token at index j.

Bahdanau et al. 2015 “Neural machine translation by jointly learning to align and translate”

(jean.senellart) #2

If you are using translate.lua, the attention weights for each sentence are in the variable results[b].preds[n].attention (b being the index in the batch, and n the index in the n-best list). It is a T x S tensor, where S is the source sentence length and T is the length of this specific translation.
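To make the T x S layout concrete, here is a minimal sketch in plain Python (not Lua/Torch) of how such a matrix could be rendered and read. The tokens and weights are made up for illustration, nested lists stand in for the tensor, and the helper names shade, render and hard_alignment are hypothetical, not part of OpenNMT.

```python
# Illustrative only: a tiny T x S attention matrix (rows = target tokens,
# columns = source tokens). The tokens and weights are made up.
source = ["the", "cat", "sleeps"]
target = ["le", "chat", "dort"]
attention = [
    [0.80, 0.15, 0.05],
    [0.10, 0.85, 0.05],
    [0.05, 0.10, 0.85],
]

SHADES = " .:-=+*#%@"  # light to dark, 10 levels


def shade(weight):
    """Map a weight in [0, 1] to one character of SHADES."""
    index = min(int(weight * len(SHADES)), len(SHADES) - 1)
    return SHADES[index]


def render(attention, target):
    """Return one text line per target token, one shaded cell per source token."""
    lines = []
    for token, row in zip(target, attention):
        cells = "".join(shade(w) for w in row)
        lines.append(f"{token:>8} |{cells}|")
    return lines


def hard_alignment(attention):
    """For each target position, return the source index with the largest weight."""
    return [max(range(len(row)), key=row.__getitem__) for row in attention]


for line in render(attention, target):
    print(line)
for j, i in enumerate(hard_alignment(attention)):
    print(f"{target[j]} -> {source[i]}")
```

Each row corresponds to one target token and each column to one source token, so a darker cell means that target token attended more strongly to that source token. Taking the largest weight per row gives a rough word alignment.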

(Yuan-Lu Chen) #3

I think this is what you’re looking for:
NMT Attention Alignment Visualizations

You can run translate.lua with the -save_attention parameter to save the attention weights to a file.

The visualization program above takes the attention file as input and prints out a visualization of the word alignments with their weights.