I have noticed that there are two parameters controlling the length of the decoded sequence, and I am aware that maximum_labels_length controls the maximum decoder-side sequence length during training.
But I am confused about maximum_iterations:
- Does it matter when training a model?
- When running inference with a trained model (e.g. restored from a checkpoint), can I set it to a larger value so the model can translate longer sentences?
- Does it interfere with maximum_labels_length?
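For context, here is roughly how I understand the two parameters would sit in a YAML config. This is only a sketch of my assumption about their roles, and the section names and values are guesses on my part, not taken from the documentation:

```yaml
# Sketch of my assumption -- key placement and values are hypothetical.
data:
  # Training-time limit: target sequences longer than this
  # are filtered out (or truncated) when building batches.
  maximum_labels_length: 50

params:
  # Inference-time cap: the decoder stops after this many steps
  # if it has not yet emitted the end-of-sequence token.
  maximum_iterations: 250
```

If this picture is right, the two limits would apply at different stages and not directly interact, but I would like confirmation.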