Input vector as input and branch encoder

ZexCeedd · April 10, 2017, 3:49am

Hi, can I double check:

1. If I wish to feed in a vector directly instead of an embedding produced by the embedding layer, can I achieve this with the “loading pretrained embeddings” method, loading my own vectors instead?
2. I’m interested in branching, e.g. 2 encoders to 1 decoder, meaning that the encoded hidden state passed to the decoder can be the concatenation or sum of the states of the multiple encoders. Also, each encoder could have its own set of inputs and load its own pretrained embeddings, or plain input vectors if point (1) is correct. Just wondering whether any work has been done on this.

Thanks!

guillaumekln · April 10, 2017, 7:35am

Hi,

1. See here: http://opennmt.net/OpenNMT/training/embeddings/
2. No work has been done to implement this, but we could add it in the future. For example, the -brnn option is actually a combination of two independent encoders processing the input in different directions.
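
As a rough illustration of the "concat or sum" idea from the question, here is a minimal sketch in plain Python. It is not OpenNMT code: `merge_states` and the example vectors are hypothetical names invented for this example, and real encoder states would be tensors rather than lists.

```python
from typing import List

Vector = List[float]


def merge_states(states: List[Vector], mode: str = "concat") -> Vector:
    """Merge the final hidden states of several encoders into one vector.

    Hypothetical helper for illustration only; not part of OpenNMT.
    """
    if not states:
        raise ValueError("need at least one encoder state")
    if mode == "concat":
        # Concatenation: the decoder sees a vector whose size is the
        # sum of the individual encoder state sizes.
        merged: Vector = []
        for state in states:
            merged.extend(state)
        return merged
    if mode == "sum":
        # Element-wise sum: all encoders must produce the same size.
        size = len(states[0])
        if any(len(state) != size for state in states):
            raise ValueError("sum requires states of equal size")
        return [sum(values) for values in zip(*states)]
    raise ValueError(f"unknown merge mode: {mode}")


# Two made-up final states, e.g. from two independent encoders.
state_a = [0.1, 0.2, 0.3]
state_b = [1.0, 2.0, 3.0]

concat_state = merge_states([state_a, state_b], "concat")  # size 6
sum_state = merge_states([state_a, state_b], "sum")        # size 3
```

With concatenation the decoder input size grows with the number of encoders, while summing keeps it fixed but requires every encoder to use the same hidden size.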

jean.senellart · April 10, 2017, 8:48am

Hi, what is the use case you have in mind? We have been discussing internally for a while the possibility of plugging in multiple encoders, and I am interested to know the different use cases before we move ahead with the implementation. As Guillaume mentions, the structure is already available for such an extension, but we would like the design to be as flexible as possible.

ZexCeedd · April 10, 2017, 11:06am

Thanks for the fast replies!

I’m trying to extend seq2seq to applications other than NMT. IMO, the main benefit of multiple encoders is handling different forms of sources. For example, in an IoT case, one encoder could encode temperature information, another could encode sound waves, and so on. It happens that I have a sequential output of severity codes. Tossing two forms of data from different sources and with different value scales into a single encoder does not seem ideal compared to concatenating the encoded hidden states. Unfortunately, I am not good enough at Lua to contribute technically.

By the way, I’m very interested in the word features capability (http://opennmt.net/OpenNMT/data/word_features/).
I’d appreciate it if you could help me confirm my understanding:

1. The word feature embeddings are optimized the same way as a normal word embedding in NMT, through gradient updates. If the word embedding is 100-dim and the feature embedding is 50-dim, the input is updated as if it were a 150-dim “word embedding”, except that the update for the 50 feature dims goes to the feature embedding.
2. If I turn fixed embeddings on (e.g. fix_word_vecs_enc), are both the word and feature embeddings still updated?

Thanks!

guillaumekln · April 10, 2017, 11:27am

1. Yes, word feature embeddings are optimized the same way the word embeddings are. All embeddings are then concatenated (by default) and fed to the RNN.
2. Currently, fixing embeddings only works for word embeddings. Word feature embeddings are still optimized when this flag is enabled.
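
To make the two points above concrete, here is a small sketch in plain Python, using the 100-dim and 50-dim sizes from the question. It is not OpenNMT code: the tables, `embed`, `sgd_step`, and the `fix_word_vecs` argument are hypothetical names that only mimic the behaviour described, and the word-then-feature ordering of the concatenation is an assumption of this sketch.

```python
import random
from typing import List

random.seed(0)

WORD_DIM, FEAT_DIM = 100, 50  # sizes taken from the question above


def make_table(vocab_size: int, dim: int) -> List[List[float]]:
    """Create a randomly initialised embedding table (illustration only)."""
    return [[random.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(vocab_size)]


word_table = make_table(10, WORD_DIM)  # hypothetical 10-word vocabulary
feat_table = make_table(4, FEAT_DIM)   # hypothetical 4-value feature


def embed(word_id: int, feat_id: int) -> List[float]:
    """Concatenate word and feature embeddings into one 150-dim input."""
    return word_table[word_id] + feat_table[feat_id]


def sgd_step(word_id: int, feat_id: int, grad: List[float],
             lr: float = 0.1, fix_word_vecs: bool = False) -> None:
    """Split the gradient of the concatenated input and update each table.

    With fix_word_vecs=True only the feature embedding is updated,
    mirroring the behaviour described in the reply above.
    """
    word_grad, feat_grad = grad[:WORD_DIM], grad[WORD_DIM:]
    if not fix_word_vecs:
        word_table[word_id] = [w - lr * g
                               for w, g in zip(word_table[word_id], word_grad)]
    feat_table[feat_id] = [f - lr * g
                           for f, g in zip(feat_table[feat_id], feat_grad)]


x = embed(3, 1)  # the vector that would be fed to the RNN
word_before = list(word_table[3])
feat_before = list(feat_table[1])

# A made-up gradient with respect to the 150-dim input.
sgd_step(3, 1, [1.0] * (WORD_DIM + FEAT_DIM), fix_word_vecs=True)

word_unchanged = word_table[3] == word_before
feat_changed = feat_table[1] != feat_before
```

Here `word_unchanged` and `feat_changed` both end up true: the word vector stays frozen while the feature vector still moves.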

Etienne38 · April 10, 2017, 1:03pm

Also: target features are time-shifted. To avoid this, you need this mod:

jean.senellart · April 10, 2017, 1:14pm

Yes, it is also the type of use case we are thinking about. We are currently working on a few other features, but we will keep this request in mind for the near future.

ZexCeedd · April 12, 2017, 11:49am

Is this an issue only if I include feature embeddings for the target side?
Currently, I’m interested in only the encoder side.

Does this mean it is fine?

Etienne38 · April 13, 2017, 3:20pm

As far as I know, time shifting is only on the decoder side.

rahular · August 31, 2017, 11:22am

Also, if this branching is implemented, the ability to create architectures like these (https://arxiv.org/pdf/1504.07225.pdf) will make OpenNMT extremely powerful.

jean.senellart · August 31, 2017, 5:46pm

Hi @rahular - thanks for sharing this paper - I am adding it to my list…

Shruti · September 28, 2017, 7:41pm

Hi jean,

Has this feature been pursued yet? I am also interested in using these types of models for multimodal speech recognition. Specifically, I would like to combine the pyramidal encoder for speech with the CNN encoder for vision, etc.

guillaumekln · October 2, 2017, 7:25am

Hi Shruti,

Currently, we have no plans to add this feature, at least in OpenNMT Lua. However, we are working on a new project that will support these types of models. Stay tuned!

guillaumekln · November 3, 2017, 1:59pm

This is supported in OpenNMT-tf.

See for example the multi-source NMT model. One of the input files can be changed to serialized real vectors.

Shruti · November 3, 2017, 4:39pm

Thanks Guillaume! OpenNMT-tf is exciting!