Is it possible to get attention output of other n_best hypotheses?

alpoktem · March 27, 2019, 5:18pm

Hi,

I have a script that processes the attention output after translation. Using the -attn_debug flag during translation, I am able to get the attention weight matrix of the best hypothesis.

I was wondering if I could somehow get the attention outputs for another hypothesis that the decoder outputs.

For example: I give as input

[‘▁so’, ‘▁look’, ‘▁around’, ‘.’]

Using the --n_best 3 and --attn_debug flags, I can see the 3 best hypotheses and the attention output of the best hypothesis. The output looks like this:

BEST HYP:
[-1.9307] [‘▁mira’, ‘.’]
[-3.7469] [‘▁mira’, ‘▁alrededor’, ‘.’]
[-4.1960] [‘▁así’, ‘▁que’, ‘▁mira’, ‘.’]
▁so ▁look ▁around .
▁mira *0.4914136 0.2727459 0.1846985 0.0511420
. 0.0559777 0.0276392 *0.5226318 0.3937513
/s 0.0784556 0.0684023 0.1030885 *0.7500536

However, I am interested in getting the attention output of the second hypothesis. Would this be possible somehow?

guillaumekln · March 28, 2019, 4:29pm

Hi,

The translation API should return it, so I suggest looking at the code below and adapting it to output the other attention vectors:

https://github.com/OpenNMT/OpenNMT-py/blob/0.8.2/onmt/translate/translator.py#L341


self.out_file.flush()


if self.verbose:
    sent_number = next(counter)
    output = trans.log(sent_number)
    if self.logger:
        self.logger.info(output)
    else:
        os.write(1, output.encode('utf-8'))


if attn_debug:
    preds = trans.pred_sents[0]
    preds.append('</s>')
    attns = trans.attns[0].tolist()
    if self.data_type == 'text':
        srcs = trans.src_raw
    else:
        srcs = [str(item) for item in range(len(attns[0]))]
    header_format = "{:>10.10} " + "{:>10.7} " * len(srcs)
    row_format = "{:>10.10} " + "{:>10.7f} " * len(srcs)
    output = header_format.format("", *srcs) + '\n'
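The excerpt above only reads index 0 of trans.pred_sents and trans.attns. As a rough sketch of how the same table could be printed for every hypothesis, the function below loops over all of them. It is not part of OpenNMT-py: format_attention_tables is a hypothetical helper, and it assumes that pred_sents and attns are parallel lists with one entry per n_best hypothesis (as the [0] indexing suggests) and that each attention matrix has already been converted to nested lists with .tolist().

```python
def format_attention_tables(src_tokens, pred_sents, attns):
    """Build one attention table (a string) per hypothesis.

    Hypothetical helper, not an OpenNMT-py function.
    src_tokens: list of source tokens.
    pred_sents: list of hypotheses, each a list of target tokens.
    attns: list of matrices as nested lists; attns[k][i][j] is the
           weight that target step i of hypothesis k puts on source
           token j (assumed layout).
    """
    header_format = "{:>10.10} " + "{:>10.7} " * len(src_tokens)
    tables = []
    for rank, (pred, attn) in enumerate(zip(pred_sents, attns), start=1):
        # Copy before appending so the caller's list is not modified.
        words = list(pred) + ['</s>']
        lines = ["HYP {}:".format(rank),
                 header_format.format("", *src_tokens)]
        for word, row in zip(words, attn):
            best = row.index(max(row))
            cells = []
            for j, weight in enumerate(row):
                cell = "{:.7f}".format(weight)
                if j == best:
                    # Mark the most attended source token, as attn_debug does.
                    cell = "*" + cell
                cells.append("{:>10}".format(cell))
            lines.append("{:>10.10} ".format(word) + " ".join(cells))
        tables.append("\n".join(lines))
    return tables


if __name__ == "__main__":
    # Toy values for illustration only, not real model output.
    src = ["so", "look", "around", "."]
    hyps = [["mira", "."], ["mira", "alrededor", "."]]
    weights = [
        [[0.4, 0.3, 0.2, 0.1], [0.1, 0.1, 0.5, 0.3], [0.1, 0.1, 0.1, 0.7]],
        [[0.5, 0.3, 0.1, 0.1], [0.1, 0.2, 0.6, 0.1],
         [0.1, 0.1, 0.4, 0.4], [0.1, 0.1, 0.1, 0.7]],
    ]
    for table in format_attention_tables(src, hyps, weights):
        print(table + "\n")
```

Inside the translator, the call would presumably look like format_attention_tables(trans.src_raw, trans.pred_sents, [a.tolist() for a in trans.attns]), but check the Translation object in your version first, since that structure is assumed here rather than verified.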