Attention and alignment problems

a.cerda · July 22, 2020, 11:53am

Hello,
We are having problems with the attention matrix and the alignment matrix.

When batch translating with translator.translate_batch(), the second dimension of the returned matrix corresponds to the shortest source in the batch. For example, if we translate “Hello” and “I like red cars” in one batch and get “Hola” and “Me gustan los coches rojos”, we get a 2x1 matrix for the first sentence and a 6x1 matrix for the second, instead of 6x4.
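To make the expected shapes concrete, here is a minimal sketch of a sanity check that could be run on each returned matrix. It is not part of OpenNMT: the helper names are hypothetical, whitespace tokenization is assumed, and the one extra row for the end-of-sentence step is inferred from the 2x1 and 6x4 figures above.

```python
def expected_attention_shape(source, target, eos_steps=1):
    """Return (rows, cols) expected for one sentence's attention matrix.

    Assumptions (not taken from OpenNMT itself): tokens are split on
    whitespace, rows are target steps plus `eos_steps` end-of-sentence
    steps, and columns are source tokens.
    """
    src_len = len(source.split())
    tgt_len = len(target.split()) + eos_steps
    return tgt_len, src_len


def has_expected_shape(matrix, source, target):
    """Check a matrix given as a list of rows against the expected shape."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    return (rows, cols) == expected_attention_shape(source, target)


if __name__ == "__main__":
    # A 6x1 matrix, like the one we get back for the longer sentence.
    truncated = [[1.0] for _ in range(6)]
    print(expected_attention_shape("I like red cars", "Me gustan los coches rojos"))
    print(has_expected_shape(truncated, "I like red cars", "Me gustan los coches rojos"))
```

Running such a check over every sentence in a batch would show which matrices have been cut down to the shortest source length.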

On the other hand, when we activate the alignment debug option we get the following error:

onmt/translate/translator.py", line 408, in translate
align = trans.word_aligns[0].tolist()
TypeError: 'NoneType' object is not subscriptable

Or, with report_align enabled:

onmt/decoders/transformer.py", line 101, in forward
if self.alignment_heads > 0:
TypeError: '>' not supported between instances of 'NoneType' and 'int'

Your help is greatly appreciated, thanks.

francoishernandez · July 22, 2020, 2:07pm

@Zenglinxiao might have some inputs here.

Zenglinxiao · August 2, 2020, 7:36pm

Hello,
Sorry for the delay. To get the alignments shown by -align_debug, you should use both -align_debug and -report_align with a guided-alignment Transformer; a normal Transformer should also work, though.
The error related to -align_debug suggests that you didn’t enable -report_align.
The error related to -report_align looks odd to me, as alignment_heads should always be an int (it defaults to 0 in the options). Could you please check your loaded model’s options? It seems you are using a model trained with a very old version of OpenNMT that predates the guided alignment feature.
For your question, you should probably use -attn_debug rather than -align_debug, as the attention matrix is what is used for generating the sentence.
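
As a rough sketch of the model-options check mentioned above: the helpers below inspect an options object for alignment_heads and fill in the default of 0 when it is missing. The function names are hypothetical, and it is an assumption here that the checkpoint keeps its training options as an argparse.Namespace (for example under an "opt" key of the loaded checkpoint dict).

```python
from argparse import Namespace


def describe_alignment_support(model_opt):
    """Classify a model's options by its alignment_heads value."""
    heads = getattr(model_opt, "alignment_heads", None)
    if heads is None:
        # Attribute absent or None: likely saved before the option existed.
        return "missing"
    return "guided" if heads > 0 else "standard"


def with_default_alignment_heads(model_opt, default=0):
    """Return a copy of the options with alignment_heads set if missing."""
    patched = Namespace(**vars(model_opt))
    if getattr(patched, "alignment_heads", None) is None:
        patched.alignment_heads = default
    return patched


if __name__ == "__main__":
    # Stand-in for the options of an old checkpoint (assumed layout);
    # in practice they would be read from the loaded checkpoint instead.
    old_opt = Namespace(layers=6, heads=8)
    print(describe_alignment_support(old_opt))
    print(describe_alignment_support(with_default_alignment_heads(old_opt)))
```

Filling in the default would only avoid the comparison between None and int; it does not add a guided alignment head to a model that was trained without one.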