In the PyTorch version of OpenNMT, the last target is excluded from the inputs to the decoder. What is the reason for doing so?
At every iteration i during decoding, let the decoder cell dec[i] receive an input inp[i] to produce an output out[i]. I earlier suspected that tgt[-1] is not needed because we’re feeding in inp[i] at every iteration, so that the first dec gets some input, the next dec receives some combination of the previous target and inp, and so on. Assuming the last iteration is t, this way the last decoder cell, dec[t], would receive tgt[t-1] as part of inp[t], so there’s no need of the last target tgt[t]. But it turns out I was wrong.
This code suggests that inp[i] is some combination of tgt[i] and not tgt[i-1]. If this is the case, why do we drop the last target?
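
To make the indexing concrete, here is a minimal sketch of the shift that is commonly used with teacher forcing. It is not taken from the OpenNMT code: the helper name shift_for_teacher_forcing and the <s> / </s> markers are hypothetical, and it assumes the target sequence is wrapped in a start and an end token. Under that assumption inp[i] is tgt[i] and the expected out[i] is tgt[i+1], so the last target is only ever an expected output and never an input.

```python
def shift_for_teacher_forcing(tgt):
    """Split a target sequence into decoder inputs and expected outputs.

    Hypothetical helper for illustration only. Assumes tgt starts with a
    start-of-sequence token and ends with an end-of-sequence token.
    """
    if len(tgt) < 2:
        raise ValueError("need at least a start token and an end token")
    dec_inputs = tgt[:-1]    # drop the last target: nothing is predicted after it
    dec_expected = tgt[1:]   # drop the first target: no step predicts it
    return dec_inputs, dec_expected


tgt = ["<s>", "the", "cat", "sat", "</s>"]
inp, expected = shift_for_teacher_forcing(tgt)

# Step i feeds tgt[i] and is trained to produce tgt[i + 1].
for i, (x, y) in enumerate(zip(inp, expected)):
    print(f"step {i}: inp={x!r} -> expected out={y!r}")
```

If the code in question follows this pattern, dropping the last target from the decoder inputs simply keeps the inputs and the expected outputs the same length.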