OpenNMT Forum

Hybrid model training

Hi,

I am trying to understand how to create and train a hybrid model (a Transformer encoder with a single-layer RNN decoder) using OpenNMT-tf.
Specifically, what should I do if I want to implement my own decoder?
Is there any material or documentation I can refer to for customizing models?

Hi,

You should provide a custom model definition, following the template described in the documentation: https://opennmt.net/OpenNMT-tf/model.html#custom-models

Here is a possible model definition based on your short description:

import opennmt

# Hybrid model: Transformer (self-attention) encoder with a single-layer attentional RNN decoder.
class MyHybridModel(opennmt.models.SequenceToSequence):
    def __init__(self):
        super().__init__(
            source_inputter=opennmt.inputters.WordEmbedder(embedding_size=512),
            target_inputter=opennmt.inputters.WordEmbedder(embedding_size=512),
            encoder=opennmt.encoders.SelfAttentionEncoder(
                num_layers=6,
                num_units=512,
                num_heads=8,
                ffn_inner_dim=2048),
            decoder=opennmt.decoders.AttentionalRNNDecoder(
                num_layers=1,
                num_units=512)
        )

# Model entrypoint: a callable that returns the model instance.
model = MyHybridModel

The classes used in this code snippet are documented here:

https://opennmt.net/OpenNMT-tf/package/opennmt.html
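
Since the question also asks about training, here is a minimal sketch of how the model above might be trained from Python with opennmt.Runner. The file names, the model directory, and the optimizer settings are placeholder assumptions, not values from this thread; replace them with your own. The optimizer and learning rate are set explicitly on the assumption that a generic SequenceToSequence model does not provide defaults for them through auto_config.

```python
def build_config(model_dir="run/hybrid"):
    """Return a minimal training configuration.

    Every path and hyperparameter below is a placeholder assumption.
    """
    return {
        "model_dir": model_dir,
        "data": {
            "source_vocabulary": "src-vocab.txt",
            "target_vocabulary": "tgt-vocab.txt",
            "train_features_file": "src-train.txt",
            "train_labels_file": "tgt-train.txt",
        },
        # Assumed values: tune these for your own data.
        "params": {
            "optimizer": "Adam",
            "learning_rate": 0.001,
        },
    }


def train(model):
    """Train a model instance, e.g. train(MyHybridModel())."""
    # Imported here so the configuration helper can be used without
    # OpenNMT-tf (and TensorFlow) being installed.
    import opennmt

    runner = opennmt.Runner(model, build_config(), auto_config=True)
    runner.train()
```

With the model definition above in scope, calling train(MyHybridModel()) should start training. Alternatively, the model definition file can be passed to the onmt-main command line with the --model option.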