outputs.backward(gradOutput)

xjtu-zeng · May 15, 2017, 5:41pm

What does this gradOutput mean, and why is this step needed?

guillaumekln · May 16, 2017, 1:27pm

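gradOutput is self-explanatory: it is the gradient of the loss with respect to the model outputs. Calling outputs.backward(gradOutput) is the backpropagation step: it propagates that gradient back through the rest of the model.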
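To make this concrete, here is a minimal sketch (not the OpenNMT code) suggesting that passing the gradient of the loss with respect to the outputs into `outputs.backward(...)` should give the same parameter gradients as a single `loss.backward()`. The single linear layer, the squared-sum loss, and all variable names are assumptions chosen for illustration.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for a model: a single linear layer.
model = torch.nn.Linear(4, 3)
inputs = torch.randn(2, 4)

# Path A: the usual single call on the scalar loss.
outputs = model(inputs)
loss = (outputs ** 2).sum()
loss.backward()
grad_single = model.weight.grad.clone()

# Path B: compute d(loss)/d(outputs) by hand, then hand it to outputs.backward().
model.zero_grad()
outputs = model(inputs)
grad_output = 2 * outputs.detach()  # derivative of sum(outputs ** 2) w.r.t. outputs
outputs.backward(grad_output)
grad_split = model.weight.grad.clone()

# Both paths should yield the same weight gradient.
same = torch.allclose(grad_single, grad_split)
print("gradients match:", same)
```

In this sketch `grad_output` plays the role of gradOutput: autograd only needs the gradient arriving at `outputs` to continue the chain rule backward.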

xjtu-zeng · May 17, 2017, 6:50am

In memoryEfficientLoss there is also a loss_t.backward(). Can you explain these two backward() calls? I usually see just one backward call, on the loss.

guillaumekln · May 17, 2017, 7:31am

The forward and backward passes of the generator are done separately for memory reasons. So the workflow is as follows:

1. Forward the whole sequence through the encoder and decoder.
2. Forward and backward each generator timestep.
3. Backward the whole sequence through the decoder and encoder.
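The workflow above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual OpenNMT implementation: `body` is a hypothetical stand-in for the encoder and decoder, `generator` for the output layer, and the sizes are arbitrary. The idea is to detach the decoder outputs, let each timestep's `loss_t.backward()` accumulate a gradient on the detached copy, then feed that accumulated gradient into `outputs.backward(...)`.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-ins: `body` plays the encoder/decoder, `generator` the output layer.
body = torch.nn.Linear(8, 16)
generator = torch.nn.Linear(16, 50)
criterion = torch.nn.CrossEntropyLoss(reduction="sum")

src = torch.randn(5, 3, 8)          # (timesteps, batch, features)
tgt = torch.randint(0, 50, (5, 3))  # (timesteps, batch)

# 1. Forward the whole sequence through the encoder/decoder stand-in.
outputs = body(src)

# Cut the graph here: the generator only sees a detached copy of the outputs.
detached = outputs.detach().requires_grad_(True)

# 2. Forward and backward the generator one timestep at a time.
#    Each timestep's graph can be freed right after its backward call.
total_loss = 0.0
for t in range(detached.size(0)):
    loss_t = criterion(generator(detached[t]), tgt[t])
    loss_t.backward()
    total_loss += loss_t.item()

# The per-timestep backward calls accumulated d(loss)/d(outputs) here.
grad_output = detached.grad

# 3. Backward the whole sequence through the encoder/decoder stand-in.
outputs.backward(grad_output)
split_grad = body.weight.grad.clone()

# Reference: one forward pass and a single backward call on the full loss.
body.zero_grad()
generator.zero_grad()
logits = generator(body(src)).reshape(-1, 50)
criterion(logits, tgt.reshape(-1)).backward()
matches = torch.allclose(split_grad, body.weight.grad, rtol=1e-4, atol=1e-5)
print("split backward matches single backward:", matches)
```

The split is valid because of the chain rule: the total loss is a sum over timesteps, so its gradient with respect to the decoder outputs is the sum of the per-timestep gradients, which is what accumulates in `detached.grad`. The memory saving in this sketch comes from never holding the generator activations for all timesteps at once.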

xjtu-zeng · May 29, 2017, 9:10am

Thanks, your explanation is great. It’s helpful.

xjtu-zeng · June 22, 2017, 3:28am

Hi, I combined the two backward() calls into one and, as you said, I got a memory error. Intuitively this approach costs more memory, but can you give me a quantitative example or calculation? I don’t know how to tell whether my model and parameters are too big. That would be much appreciated.

xjtu-zeng · June 22, 2017, 4:35am

Hey, I would really appreciate it if you could explain how the code guarantees that the backward pass can be split into two parts, or even more. Thank you very much.