Using a pretrained sentence embedding model as part of a bigger seq2seq architecture

Hi everybody!


TL;DR: I would like to append an extra feature to each word in the source document: the embedding of the entire document, produced by a specific pre-trained model. Any advice on where to make code changes?


I was trying to replicate Harrison et al. (2018), https://arxiv.org/abs/1809.02637, a question generation paper (input: a Wikipedia passage from SQuAD; output: a question about that passage). In the paper, they train a dedicated model for embedding the passage, which is supposed to give a question-focused sentence embedding. Each word token fed to the full encoder-decoder network is the concatenation of its word embedding, extra features (NER, case, …), and the full passage embedding output by the aforementioned model.

In the paper they mention that they implemented this architecture with OpenNMT-py and PyTorch. I see how to concatenate categorical features to words (http://opennmt.net/OpenNMT/data/word_features/), but what I need here is to run each source sentence through a pre-trained model and concatenate the output to each of its words.
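
Conceptually, the tensor operation I am after is something like this (a rough PyTorch sketch with made-up shapes; `passage_emb` would come from the pre-trained model):

```python
import torch

# Made-up shapes: OpenNMT-py uses (seq_len, batch, dim) for embeddings.
word_embs = torch.randn(20, 4, 300)   # word embeddings for one batch
passage_emb = torch.randn(4, 128)     # one vector per passage in the batch

# Broadcast the passage vector over every time step, then concatenate it
# to each word embedding along the feature dimension.
expanded = passage_emb.unsqueeze(0).expand(word_embs.size(0), -1, -1)
combined = torch.cat([word_embs, expanded], dim=-1)  # (20, 4, 428)
```

The question is where to hook this into OpenNMT-py.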

I have worked with PyTorch before, but never with OpenNMT; I was starting to dive into its code. Do you have any advice on how to approach the problem?

Thank you very much!
Dario


Hi,

Did you try to contact the authors of the paper? Maybe they can share details of their implementation (or even open source it).

First pointer: the model inputs are built in the file:

Hi Guillaume, I already asked the authors, but unfortunately they don’t plan to release their implementation.

Thank you for the pointer though; it helped me navigate the code. I think I will extend the Embeddings class to add the extra processing step and concatenate the features, roughly along the lines of the sketch below.
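
For anyone who lands here later, this is the kind of wrapper I have in mind (untested; `PassageAwareEmbeddings` and `passage_encoder` are my own placeholder names, not part of OpenNMT-py):

```python
import torch
import torch.nn as nn

class PassageAwareEmbeddings(nn.Module):
    """Wraps a built onmt.modules.Embeddings instance and appends a
    fixed passage embedding to every word position."""

    def __init__(self, base_embeddings, passage_encoder):
        super().__init__()
        self.base = base_embeddings             # standard OpenNMT-py embeddings
        self.passage_encoder = passage_encoder  # pre-trained model, kept frozen

    def forward(self, source, step=None):
        # OpenNMT-py passes source as (seq_len, batch, n_feats) and the
        # base module returns (seq_len, batch, emb_dim).
        word_embs = self.base(source, step=step)

        # Run the pre-trained model without tracking gradients; its
        # input/output interface here is an assumption on my side.
        with torch.no_grad():
            passage_emb = self.passage_encoder(source)  # (batch, passage_dim)

        # Broadcast the passage vector over the sequence and concatenate.
        expanded = passage_emb.unsqueeze(0).expand(word_embs.size(0), -1, -1)
        return torch.cat([word_embs, expanded], dim=-1)
```

The remaining plumbing would be making sure the encoder that consumes these embeddings is built with the enlarged input size (and, if other parts of the code read an embedding-size attribute from the embeddings module, exposing the new width there as well).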


Hi Dario,

Were you able to solve your problem?
I have a similar one and would appreciate any advice.
Thanks