Bi-Directional Bottom Encoder

jchavezberkeley · May 4, 2021, 3:56pm

In Google’s Machine Translation Model, there are 8 layers of LSTMs, but only the bottom one is bi-directional. Is there any way to specify that in my YAML file? I see from this post, Different RNN architecture for each layer of Encoder, that Google’s model has been added. So can I just set my encoder type to gnmt and still specify 8 layers in total? Thanks

I know there is the bi-dir flag but if I set that to true, will that make all 8 LSTM stacks bidirectional? Thanks

francoishernandez · May 5, 2021, 7:56am

The post you’re referring to is about OpenNMT lua, not OpenNMT-py.
There is the encoder_type: brnn flag for OpenNMT-py, but it won’t allow much customization. It would probably not be too difficult to adapt though.
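
For illustration, here is a minimal PyTorch sketch of what such an adaptation might look like: one bidirectional LSTM at the bottom, with unidirectional LSTM layers stacked on top and residual connections starting from the third layer, as described for GNMT. This is not OpenNMT-py code. The class name BottomBiLSTMEncoder and all sizes are hypothetical, and padding/packed sequences are ignored to keep it short.

```python
import torch
import torch.nn as nn


class BottomBiLSTMEncoder(nn.Module):
    """Hypothetical GNMT-style encoder: only the first layer is bidirectional."""

    def __init__(self, vocab_size, emb_size=512, hidden_size=512,
                 num_layers=8, dropout=0.2):
        super().__init__()
        if num_layers < 2:
            raise ValueError("need the bottom layer plus at least one upper layer")
        self.embedding = nn.Embedding(vocab_size, emb_size)
        # Layer 1: bidirectional, so its output has 2 * hidden_size features.
        self.bottom = nn.LSTM(emb_size, hidden_size,
                              batch_first=True, bidirectional=True)
        # Layers 2..num_layers: unidirectional. Layer 2 consumes the
        # concatenated forward/backward states of the bottom layer.
        in_sizes = [2 * hidden_size] + [hidden_size] * (num_layers - 2)
        self.upper = nn.ModuleList(
            nn.LSTM(size, hidden_size, batch_first=True) for size in in_sizes
        )
        self.dropout = nn.Dropout(dropout)

    def forward(self, src):
        # src: (batch, seq_len) tensor of token ids
        x = self.embedding(src)
        x, _ = self.bottom(x)
        for i, lstm in enumerate(self.upper):
            out, _ = lstm(self.dropout(x))
            # Residual connections start at the third layer overall (i == 1),
            # where input and output widths match.
            x = out + x if i > 0 else out
        return x  # (batch, seq_len, hidden_size)
```

Calling it on a batch of token ids of shape (batch, seq_len) returns one hidden_size-dimensional state per source position. Plugging something like this into OpenNMT-py would still require wrapping it in that library's encoder interface, which is not shown here.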

jchavezberkeley · May 5, 2021, 5:56pm

What about setting bi_dir to true? Would it make all layers bidirectional?

francoishernandez · May 6, 2021, 7:49am

I don’t know of any bi_dir flag. Please be more specific.

jchavezberkeley · May 6, 2021, 2:44pm

Sorry, I meant the --bidir_edges, -bidir_edges flag in the Train — OpenNMT-py documentation

But overall, you recommend using the brnn encoder type even though I would only want one bottom bidirectional LSTM layer?

francoishernandez · May 7, 2021, 3:02pm

> Sorry, I meant the --bidir_edges, -bidir_edges

This flag is for the GGNN implementation.

> But overall, you recommend using the brnn encoder type even though I would only want one bottom bidirectional LSTM layer?

It would probably depend on your task and what you’re trying to achieve.
Also, why not use a Transformer?

jchavezberkeley · May 7, 2021, 3:26pm

So I’m actually trying to compare a transformer model, like the one in “Attention Is All You Need”, to a Google NMT LSTM model on some Spanish-to-English data. I’ve already built and trained a transformer, and now I need something that can roughly mimic the Google NMT architecture.