What's the difference between model.zero_grad() and optim.zero_grad()

pytorch

(Zeng) #1

What’s the difference between model.zero_grad() and optim.zero_grad()?
It seems that we often use the second one.


(Guillaume Klein) #2

That’s not specific to OpenNMT. See:

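tl;dr: they are the same, as long as the optimizer was built from all of the model's parameters. model.zero_grad() clears the gradients of every parameter in the model, while optim.zero_grad() clears the gradients of only the parameters that were passed to the optimizer.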
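
To make this concrete, here is a minimal sketch rather than anything taken from OpenNMT. The tiny nn.Linear model, the SGD optimizers, and the helper names grads_cleared and populate_grads are all assumptions made up for illustration. It checks that both calls clear the gradients when the optimizer holds every parameter, and that they differ when the optimizer holds only a subset.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical toy setup: one linear layer, optimizer built from ALL parameters.
model = nn.Linear(4, 2)
optim = torch.optim.SGD(model.parameters(), lr=0.1)


def grads_cleared(module):
    # Depending on the PyTorch version, zero_grad() either sets .grad to None
    # or fills it with zeros, so accept both as "cleared".
    return all(p.grad is None or not p.grad.any() for p in module.parameters())


def populate_grads():
    # Run a dummy forward/backward pass so every parameter gets a gradient.
    loss = model(torch.randn(3, 4)).sum()
    loss.backward()


# Case 1: model.zero_grad() clears every parameter of the model.
populate_grads()
has_grads_before = not grads_cleared(model)
model.zero_grad()
cleared_by_model = grads_cleared(model)

# Case 2: optim.zero_grad() has the same effect here, because the optimizer
# was given model.parameters().
populate_grads()
optim.zero_grad()
cleared_by_optim = grads_cleared(model)

# Case 3: an optimizer that only holds the weight leaves the bias gradient alone.
partial_optim = torch.optim.SGD([model.weight], lr=0.1)
populate_grads()
partial_optim.zero_grad()
bias_still_has_grad = model.bias.grad is not None and bool(model.bias.grad.any())

print(has_grads_before, cleared_by_model, cleared_by_optim, bias_still_has_grad)
```

In practice the difference only matters when parts of the model are frozen or are trained by separate optimizers. In that situation model.zero_grad() is the call that resets everything at once.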