Attention mechanism without translation

kargintima · February 24, 2020, 9:16pm

Can I use OpenNMT to get attention information within a single sentence, without any translation?
I just want to know how strongly each word relates to every other word in the sentence.

francoishernandez · February 25, 2020, 11:34am

Check this PR and the paper in question: https://github.com/OpenNMT/OpenNMT-py/pull/1615
If you just want to generate alignments you can have a look at fast-align or giza++.

kargintima · February 25, 2020, 9:59pm

Wait, sorry. This is not what I meant.
I want links between words within one sentence, not bilingual pairs.
Something like this:
THE DOG didn’t cross the road because IT was tired
The dog didn’t cross THE ROAD because IT was wide

I want matrices showing how each word is linked to every other word.
I know this is possible with the attention mechanism that is the basis of Transformer networks.

As far as I can see, fast-align and GIZA++ work with bilingual pairs.

francoishernandez · February 26, 2020, 8:39am

Oh I see.
This could probably be achieved with self-attention from some Transformer LM or GPT-2 indeed (http://jalammar.github.io/illustrated-gpt2/). But none of this is currently implemented in OpenNMT-py. Feel free to PR if you implement some of it.
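
For illustration only, here is a minimal sketch of what such a self-attention matrix looks like, in plain Python with no OpenNMT-py involved. It computes scaled dot-product attention, softmax(QK^T / sqrt(d)), for one sentence. The query and key vectors are random placeholders (an assumption made so the snippet runs on its own), so the numbers carry no linguistic meaning. In a real Transformer LM they would come from the model's learned projections. The function names are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_matrix(queries, keys):
    """Return an n x n matrix; row i holds the weights token i puts on each token."""
    dim = len(keys[0])
    matrix = []
    for q in queries:
        # Scaled dot product between this query and every key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        matrix.append(softmax(scores))
    return matrix


tokens = "The dog didn't cross the road because it was tired".split()

# Assumption: random vectors stand in for learned query/key projections.
rng = random.Random(0)
dim = 8
queries = [[rng.gauss(0, 1) for _ in range(dim)] for _ in tokens]
keys = [[rng.gauss(0, 1) for _ in range(dim)] for _ in tokens]

attention = self_attention_matrix(queries, keys)

# One row per word, one column per word it attends to.
for token, row in zip(tokens, attention):
    print(f"{token:>8} " + " ".join(f"{w:.2f}" for w in row))
```

Each row sums to 1, so row i can be read as how token i distributes its attention over the sentence. A real model has one such matrix per layer and per head, so you would still have to decide which ones to inspect or how to average them.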

Bachstelze · February 26, 2020, 5:12pm

You can get the attention matrix from the translation class. However, OpenNMT is a seq2seq framework, so it needs a target sentence. There are many pretrained Transformers you could use to encode the sentence, and you could then use the word vectors to construct a tree:

https://nlp.stanford.edu//~johnhew//structural-probe.html?utm_source=quora&utm_medium=referral#the-structural-probe
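
As a rough sketch of the "word vectors to tree" step: once you have pairwise distances between word vectors, a minimum spanning tree over them gives an undirected tree linking the words. The code below uses Prim's algorithm in plain Python. The 2-D vectors are invented for the example (an assumption, not model output), and the sketch skips the learned linear map that the structural probe applies before measuring distances. The function names are hypothetical.

```python
import math


def pairwise_distances(vectors):
    """Euclidean distance between every pair of vectors."""
    n = len(vectors)
    return [[math.dist(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]


def minimum_spanning_tree(dist):
    """Prim's algorithm; returns edges (i, j) where i was already in the tree."""
    n = len(dist)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge that connects the tree to a node outside it.
        i, j = min(
            ((a, b) for a in in_tree for b in range(n) if b not in in_tree),
            key=lambda e: dist[e[0]][e[1]],
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges


tokens = ["the", "dog", "crossed", "the", "road"]

# Assumption: hand-made toy vectors; real ones would come from a pretrained encoder.
vectors = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0), (4.0, 0.0), (3.0, 0.0)]

edges = minimum_spanning_tree(pairwise_distances(vectors))

for i, j in edges:
    print(f"{tokens[i]}({i}) -- {tokens[j]}({j})")
```

The edges are undirected, so this gives the shape of a tree but not head/dependent directions or relation labels.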

Links between words are usually represented with dependency trees. If you don’t want to experiment, have a look at spaCy.