Greetings, Fellow Researchers!
I have come across many use cases where people used a multi-source transformer model instead of a multi-feature transformer model.
I would like some help understanding when to use multi-feature and when to use multi-source.
Thanks & Regards
It’s important to know the architectural differences:
- multi-feature: additional embeddings (e.g. from POS, lemma, case, etc.) are concatenated to the main word embeddings before encoding.
- multi-source: multiple input sequences are encoded independently, and the decoder combines the attention representations of the sources (e.g. the actual source and a pretranslation).
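
To make the difference concrete, here is a rough pure-Python sketch of the two combination points. Everything in it is illustrative: the function names are made up, the "encoder outputs" are just lists of vectors, and combining the per-source contexts by concatenation is an assumption (toolkits may sum, gate, or stack the attentions instead).

```python
import math

def concat_features(word_emb, feat_embs):
    """Multi-feature: per token, concatenate the word embedding with each feature embedding."""
    combined = []
    for i, word_vec in enumerate(word_emb):
        vec = list(word_vec)
        for feat in feat_embs:
            vec.extend(feat[i])  # feature i belongs to token i
        combined.append(vec)
    return combined

def attend(query, memory):
    """Dot-product attention of one decoder query over one encoded source."""
    scores = [sum(q * m for q, m in zip(query, mem)) for mem in memory]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(memory[0])
    return [sum(w * mem[d] for w, mem in zip(weights, memory)) for d in range(dim)]

def multi_source_context(query, memories):
    """Multi-source: attend to each source separately, then combine (assumed: concatenation)."""
    context = []
    for memory in memories:
        context.extend(attend(query, memory))
    return context

# Multi-feature: 3 tokens, word dim 4 + POS dim 2 -> one sequence of 3 vectors of dim 6
words = [[0.1] * 4, [0.2] * 4, [0.3] * 4]
pos = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
encoder_input = concat_features(words, [pos])

# Multi-source: two encoded sequences of different lengths, one context vector per source
source_enc = [[0.1] * 4, [0.2] * 4, [0.3] * 4]
pretranslation_enc = [[0.5] * 4 for _ in range(5)]
context = multi_source_context([1.0, 0.0, 0.0, 0.0], [source_enc, pretranslation_enc])
```

Note where the merge happens: multi-feature merges before the single encoder, multi-source merges on the decoder side after each source has gone through its own encoding.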
So you should use:
- multi-feature: to augment each source word with additional token-level information
- multi-source: to decode from multiple independent source sequences
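
One practical way to tell the two cases apart, sketched with a hypothetical helper (not part of any toolkit): since features are concatenated to the word embeddings, they need exactly one value per source token, while a second source can have any length. Treat this as a sanity check only; matching lengths alone do not make a sequence a feature.

```python
def check_feature_alignment(tokens, features):
    """Return True if every feature sequence has exactly one value per token."""
    return all(len(feat) == len(tokens) for feat in features)

source = "the cats are sleeping".split()
pos_tags = ["DET", "NOUN", "AUX", "VERB"]      # one tag per word: fits multi-feature
pretranslation = "les chats dorment".split()   # independent sequence: fits multi-source

check_feature_alignment(source, [pos_tags])        # True
check_feature_alignment(source, [pretranslation])  # False
```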