Use pretrained sentence embedding model as a part of a bigger seq2seq architecture

dario · February 28, 2019, 11:21am

Hi everybody!

TL;DR: I would like to append an extra feature to each word in the source document: the embedding of the entire document, as produced by a specific pre-trained model. Any advice on where to make the code changes?

I am trying to replicate Harrison et al. (2018), https://arxiv.org/abs/1809.02637, which is a question generation paper (input: a Wikipedia passage from SQuAD; output: a question about that passage). In the paper, they train a separate model for embedding the passage, which is supposed to give a question-focused sentence embedding. The representation of each word token in the full encoder-decoder network is the concatenation of its word embedding, extra features (NER, case, …) and the full passage embedding that is the output of the aforementioned model.

In the paper they mention they implemented this architecture with OpenNMT-py and PyTorch. I see how to concatenate categorical features to words (http://opennmt.net/OpenNMT/data/word_features/) but what I need here is to run each source sentence through a pre-trained model, and concatenate the output to each of its words.

I have worked with PyTorch before, but never with OpenNMT. I have started to dive into its code; do you have any advice on how to approach the problem?

Thank you very much!
Dario

guillaumekln · March 2, 2019, 9:06am

Hi,

Did you try to contact the authors of the paper? Maybe they can share details of their implementation (or even open source it).

First pointer: the model inputs are built in the file:

https://github.com/OpenNMT/OpenNMT-py/blob/master/onmt/modules/embeddings.py

""" Embeddings module """
import math
import warnings

import torch
import torch.nn as nn

from onmt.modules.util_class import Elementwise


class PositionalEncoding(nn.Module):
    """Sinusoidal positional encoding for non-recurrent neural networks.

    Implementation based on "Attention Is All You Need"
    :cite:`DBLP:journals/corr/VaswaniSPUJGKP17`

    Args:
       dropout (float): dropout parameter
       dim (int): embedding size
    """


dario · March 4, 2019, 9:02am

Hi Guillaume, I already asked the authors but unfortunately they don’t plan to release their implementation.

Thank you for the pointer though, it helped me navigate the code: I think I will extend the Embeddings class to add the extra processing step and concatenate the features.
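
A minimal sketch of that plan in plain PyTorch, for anyone following along. It is not the paper's implementation and does not use OpenNMT-py's actual Embeddings class. The names PassageAugmentedEmbeddings and MeanPoolEncoder are hypothetical, the mean-pooling encoder is only a stand-in for the real pre-trained passage model, and the (seq_len, batch) input layout is an assumption. The idea is to embed the words as usual, run the whole source through the frozen passage encoder once, and repeat the resulting vector at every time step before concatenating.

```python
import torch
import torch.nn as nn


class MeanPoolEncoder(nn.Module):
    """Stand-in for the pre-trained passage model (hypothetical)."""

    def __init__(self, vocab_size, dim):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)

    def forward(self, src):
        # src: (seq_len, batch) -> one vector per passage: (batch, dim)
        return self.emb(src).mean(dim=0)


class PassageAugmentedEmbeddings(nn.Module):
    """Word embeddings concatenated with a passage-level embedding (hypothetical)."""

    def __init__(self, vocab_size, word_dim, passage_encoder, passage_dim,
                 padding_idx=0):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim,
                                     padding_idx=padding_idx)
        self.passage_encoder = passage_encoder
        # Keep the pre-trained encoder fixed while the seq2seq model trains.
        for param in self.passage_encoder.parameters():
            param.requires_grad = False
        # Size the downstream encoder should expect for each token.
        self.embedding_size = word_dim + passage_dim

    def forward(self, src):
        # src: (seq_len, batch) token ids (assumed layout)
        words = self.word_emb(src)                    # (seq_len, batch, word_dim)
        with torch.no_grad():
            passage = self.passage_encoder(src)       # (batch, passage_dim)
        # Repeat the passage vector at every time step.
        passage = passage.unsqueeze(0).expand(words.size(0), -1, -1)
        return torch.cat([words, passage], dim=-1)    # (seq_len, batch, total)


torch.manual_seed(0)
encoder = MeanPoolEncoder(vocab_size=100, dim=8)
embeddings = PassageAugmentedEmbeddings(100, 16, encoder, 8)
src = torch.randint(1, 100, (5, 3))   # 5 tokens, batch of 3 passages
out = embeddings(src)
print(out.shape)
```

The downstream encoder then has to be built with an input size of embedding_size (word_dim + passage_dim) rather than word_dim. If the real passage model needs raw text or a different vocabulary, the passage vectors would have to be computed outside this module and passed in alongside the token ids.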

geckuba · November 16, 2020, 8:34pm

Hi Dario,

Were you able to solve your problem?
I have a similar one and would appreciate any advice.
Thanks