I’m working with OpenNMT-py on a multi-modality-to-sequence task: essentially summarization, but from text as well as numerical attributes. I want to use a mean encoder over the attribute section of the source and a bidirectional RNN over the text segment of the source.
Is anyone familiar with using more than one encoder in a model? Are there any resources I can look at?
OpenNMT-py does not support multiple input modalities so it might be tricky to pass your inputs to the model. Are you considering implementing your own training script and reusing some OpenNMT-py building blocks (e.g. the encoders)?
However, if you want to rely as much as possible on existing code and don’t mind using another framework, OpenNMT-tf has built-in support for that.
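If you do go the custom-script route, here is a rough sketch of what a two-encoder setup could look like in plain PyTorch. This is not OpenNMT-py code: the class names (`MeanEncoder`, `TextEncoder`, `DualEncoder`), the tensor shapes, and the choice to concatenate the two memory banks and merge the summaries through a linear "bridge" are all assumptions made for illustration. It also assumes each attribute arrives as a fixed-size numeric vector.

```python
import torch
import torch.nn as nn


class MeanEncoder(nn.Module):
    """Projects each attribute vector and averages over the real (unmasked) attributes."""

    def __init__(self, attr_dim, hidden_dim):
        super().__init__()
        self.proj = nn.Linear(attr_dim, hidden_dim)

    def forward(self, attrs, mask):
        # attrs: (batch, n_attrs, attr_dim); mask: (batch, n_attrs), 1.0 for real entries
        memory = self.proj(attrs)                    # (batch, n_attrs, hidden_dim)
        mask = mask.unsqueeze(-1)                    # (batch, n_attrs, 1)
        count = mask.sum(dim=1).clamp(min=1.0)       # avoid division by zero
        mean = (memory * mask).sum(dim=1) / count    # (batch, hidden_dim)
        return memory, mean


class TextEncoder(nn.Module):
    """Bidirectional GRU over token embeddings (hidden_dim must be even)."""

    def __init__(self, vocab_size, emb_dim, hidden_dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.rnn = nn.GRU(emb_dim, hidden_dim // 2,
                          batch_first=True, bidirectional=True)

    def forward(self, tokens):
        # For brevity the sequence is not packed; with padded batches you would
        # likely want torch.nn.utils.rnn.pack_padded_sequence here.
        memory, final = self.rnn(self.embed(tokens))     # memory: (batch, src_len, hidden_dim)
        final = torch.cat([final[0], final[1]], dim=-1)  # (batch, hidden_dim)
        return memory, final


class DualEncoder(nn.Module):
    """Runs both encoders and merges their outputs for a single decoder."""

    def __init__(self, vocab_size, emb_dim, attr_dim, hidden_dim):
        super().__init__()
        self.text_encoder = TextEncoder(vocab_size, emb_dim, hidden_dim)
        self.attr_encoder = MeanEncoder(attr_dim, hidden_dim)
        self.bridge = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, tokens, attrs, attr_mask):
        text_memory, text_final = self.text_encoder(tokens)
        attr_memory, attr_mean = self.attr_encoder(attrs, attr_mask)
        # One memory bank for attention: text positions followed by attributes.
        memory = torch.cat([text_memory, attr_memory], dim=1)
        # One vector to initialise the decoder state.
        summary = torch.cat([text_final, attr_mean], dim=-1)
        init_state = torch.tanh(self.bridge(summary))
        return memory, init_state


# Toy batch: 2 examples, 7 tokens each, 5 attributes of size 4.
encoder = DualEncoder(vocab_size=100, emb_dim=16, attr_dim=4, hidden_dim=32)
tokens = torch.randint(1, 100, (2, 7))
attrs = torch.randn(2, 5, 4)
attr_mask = torch.ones(2, 5)
memory, init_state = encoder(tokens, attrs, attr_mask)
```

The decoder can then attend over `memory` and start from `init_state`. Concatenating along the time axis is only one option; you could also keep the two memory banks separate and attend over each, at the cost of a more involved decoder.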