What is the difference between multi-feature and multi-source?

Greetings, Fellow Researchers!

I have come across many use cases where people have used a multi-source transformer model instead of a multi-feature transformer model.

I would like some help understanding when we should use multi-feature and when we should use multi-source.

Thanks & Regards


It’s important to know the architectural differences:

  • multi-feature: additional embeddings (e.g. from POS, lemma, case, etc.) are concatenated to the main word embeddings before encoding.
  • multi-source: multiple input sequences are encoded independently, and the decoder combines the attention representations of the sources (e.g. the actual source and a pretranslation).
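
To make the two layouts concrete, here is a minimal pure-Python sketch of the data flow. It is not the actual implementation of any toolkit. The helper names (make_embedding, encode, attend), the embedding sizes, and the example tokens are all hypothetical. Combining the two attention contexts by concatenation is also an assumption; the combination strategy is not specified above.

```python
import math
import random

random.seed(0)

def make_embedding(vocab, dim):
    """Hypothetical lookup table: one random vector of size `dim` per symbol."""
    return {sym: [random.uniform(-1, 1) for _ in range(dim)] for sym in sorted(vocab)}

# --- Multi-feature: ONE sequence, each token carries an extra feature (a POS tag).
words = ["the", "cat", "sleeps"]
pos_tags = ["DET", "NOUN", "VERB"]
word_emb = make_embedding(set(words), dim=4)
pos_emb = make_embedding(set(pos_tags), dim=2)

# Feature embeddings are concatenated to the word embeddings position by position,
# so the encoder sees one sequence of 4 + 2 = 6 dimensional vectors.
multi_feature_input = [word_emb[w] + pos_emb[t] for w, t in zip(words, pos_tags)]

# --- Multi-source: TWO independent sequences whose lengths may differ.
source = ["the", "cat", "sleeps"]
pretranslation = ["le", "chat", "dort", "."]
src_emb = make_embedding(set(source), dim=4)
pre_emb = make_embedding(set(pretranslation), dim=4)

def encode(seq, table):
    """Stand-in for a real encoder: here it is only the embedding lookup."""
    return [table[s] for s in seq]

def attend(query, memory):
    """Dot-product attention of one decoder query over one encoded sequence."""
    scores = [sum(q * m for q, m in zip(query, vec)) for vec in memory]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    weights = [e / sum(exps) for e in exps]
    dim = len(memory[0])
    return [sum(w * vec[i] for w, vec in zip(weights, memory)) for i in range(dim)]

# Each source is encoded on its own; the decoder attends to each one separately.
enc_src = encode(source, src_emb)
enc_pre = encode(pretranslation, pre_emb)
query = [0.5, -0.5, 0.5, -0.5]  # hypothetical decoder state

# Assumption: the two attention contexts are combined by concatenation.
combined_context = attend(query, enc_src) + attend(query, enc_pre)

print(len(multi_feature_input), len(multi_feature_input[0]))
print(len(enc_src), len(enc_pre), len(combined_context))
```

Note that in the multi-feature case the word and tag lists must be aligned token by token, while in the multi-source case the two sequences are unrelated in length and are only brought together on the decoder side.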

So you should use:

  • multi-feature: to augment each word with additional token-level information
  • multi-source: to decode from multiple and independent source sequences

Thank you @guillaumekln