Does the network get confused if my dataset has sometimes multiple translations for the same source sentence especially with short sentences?
By confused I mean, how will it decide which is the best translation for the same sentence?
Will choose it randomly or will the matrix decide which is best?
It’s hard to tell what will be the exact outcome as it depends on many things. The expected translation should be the most frequent one but that will not necessarily be the case.
However, the model will never choose randomly but always according to its internal representation.