OpenNMT Forum

Using sentence embedding for downstream tasks


(piyush) #1


I want to use a trained NMT model to extract embeddings for sentences and use them in downstream tasks, to validate the accuracy/effectiveness of the embeddings.

A similar request was raised for the Lua version of OpenNMT. I tried to do the same thing for OpenNMT-py, but the visualizations of the embeddings I am getting don’t look quite right.

Changes made:
Filename: onmt/
Function: _translate_batch function
Line: line 673

    # (1) Run the encoder on the src.
    src, enc_states, memory_bank, src_lengths = self._run_encoder(batch)
    # for an LSTM encoder, enc_states is the (h_n, c_n) tuple;
    # keep only the final hidden state h_n for the batch
    return enc_states[0]

Filename: onmt/
Function: translate
Line: line 227

# reorder the data to (batch, num_layers * num_directions, hidden_size)
enc_hidden_state = batch_data.permute(1, 0, 2)

# flatten each sentence to a single vector:
# (batch, num_layers * num_directions * hidden_size)
enc_flat_hid_state = enc_hidden_state.contiguous().view(enc_hidden_state.size(0), -1)

And enc_flat_hid_state would represent the final embedding for the sentence.
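The permute/flatten steps above can be sketched on a dummy tensor (the sizes here are hypothetical, just to make the shapes concrete):

```python
import torch

# Stand-in for enc_states[0], which for an RNN encoder has shape
# (num_layers * num_directions, batch, hidden_size); e.g. a 2-layer
# unidirectional encoder, batch of 3 sentences, hidden size 4.
batch_data = torch.randn(2, 3, 4)

# (1) move the batch dimension first:
# (batch, num_layers * num_directions, hidden_size)
enc_hidden_state = batch_data.permute(1, 0, 2)

# (2) flatten the remaining dimensions into one vector per sentence
enc_flat_hid_state = enc_hidden_state.contiguous().view(enc_hidden_state.size(0), -1)

print(enc_flat_hid_state.shape)  # torch.Size([3, 8])
```

The `.contiguous()` call is needed because `permute` returns a non-contiguous view, which `view` cannot reshape directly.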

Is this correct? Or did I miss something?

(Guillaume Klein) #2


For an LSTM encoder, look at the PyTorch documentation for the LSTM layer, specifically the output description:
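The key point from that documentation is the shape of the returned states. A toy LSTM (all sizes here are arbitrary, for illustration only) makes it explicit:

```python
import torch
import torch.nn as nn

# Toy 2-layer bidirectional LSTM to illustrate PyTorch's output shapes.
lstm = nn.LSTM(input_size=8, hidden_size=16, num_layers=2, bidirectional=True)

src = torch.randn(5, 3, 8)  # (seq_len, batch, input_size)
output, (h_n, c_n) = lstm(src)

# output holds the top layer's hidden state at every time step:
print(output.shape)  # torch.Size([5, 3, 32])  -> (seq_len, batch, num_directions * hidden_size)

# h_n holds the final hidden state of every layer and direction:
print(h_n.shape)     # torch.Size([4, 3, 16])  -> (num_layers * num_directions, batch, hidden_size)
```

So `enc_states[0]` contains the final states of *all* layers and directions, not just the top layer; whether to flatten all of them into the sentence embedding, or keep only the last layer's state, is a choice worth validating on the downstream task.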