What’s the motivation behind not allowing metadata for the destination language, as we already do for the source language? Having metadata for every destination word might help improve accuracy.
Could you describe your expectation for this feature?
Should it only be an input feature, or should it also be an output, trained with MSE (or another loss) and fed back as input to the next time step?
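
To make the two options concrete, here is a rough sketch in plain NumPy. It is not the project's code: the sizes, the weights, and the `decoder_step` and `mse` helpers are all made up for illustration, and a real decoder would be trained and would also predict the next word.

```python
import numpy as np

rng = np.random.default_rng(0)
EMB, FEAT, HID = 4, 2, 3  # hypothetical embedding, feature and hidden sizes

# Toy, untrained weights. The decoder input is [word embedding ; feature vector].
W_in = rng.normal(size=(EMB + FEAT, HID))
W_feat = rng.normal(size=(HID, FEAT))  # feature head, only needed for option B

def decoder_step(prev_word_emb, prev_feat):
    """One toy decoder step: returns the hidden state and a predicted feature vector."""
    hidden = np.tanh(np.concatenate([prev_word_emb, prev_feat]) @ W_in)
    return hidden, hidden @ W_feat

def mse(pred, target):
    """Mean squared error between two vectors."""
    return float(np.mean((pred - target) ** 2))

prev_word_embs = rng.normal(size=(3, EMB))  # embeddings of the previous target words
gold_feats = rng.normal(size=(3, FEAT))     # metadata of the word predicted at each step

# Option A: input feature only. The metadata of the previous target word has to
# be supplied from outside at every step (here: the gold features, shifted by one).
prev_gold_feats = np.vstack([np.zeros((1, FEAT)), gold_feats[:-1]])
hidden_a = [decoder_step(w, f)[0] for w, f in zip(prev_word_embs, prev_gold_feats)]

# Option B: the feature is also an output. It is scored against the gold
# metadata with MSE, and the prediction is fed back as the next step's input.
feat_in = np.zeros(FEAT)  # nothing has been predicted before the first step
hidden_b, feat_loss = [], 0.0
for w, gold in zip(prev_word_embs, gold_feats):
    h, feat_pred = decoder_step(w, feat_in)
    feat_loss += mse(feat_pred, gold)
    hidden_b.append(h)
    feat_in = feat_pred  # feed the prediction back
```

The practical difference is at inference time: option A needs destination-side metadata from somewhere, while option B produces it itself, at the cost of an extra loss term. This sketch feeds the predicted feature back even during training; feeding the gold feature instead (teacher forcing) would be another choice.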