I am struggling to understand what accum_steps is.
From my understanding, if you have 1 GPU, the optimizer steps once every (batch_size * accum_count) samples. In other words, you train as if each batch were made up of accum_count mini-batches.
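To make that understanding concrete, here is a toy sketch in plain Python. It is not the library's code: the one-parameter model and the names grad_sum and train are made up for illustration, and it assumes the accumulated gradient is averaged over the samples seen.

```python
# Toy 1-parameter model y = w * x with squared-error loss (illustration only).
def grad_sum(w, batch):
    """Sum of d/dw (w*x - y)^2 over the samples in one mini-batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch)


def train(samples, batch_size, accum_count, lr=0.01, w=0.0):
    """Update w once every `accum_count` mini-batches of `batch_size` samples."""
    accumulated, seen, updates = 0.0, 0, 0
    for start in range(0, len(samples), batch_size):
        batch = samples[start:start + batch_size]
        accumulated += grad_sum(w, batch)  # accumulate, no optimizer step yet
        seen += len(batch)
        if (start // batch_size + 1) % accum_count == 0:
            w -= lr * accumulated / seen   # one step for the whole group
            accumulated, seen, updates = 0.0, 0, updates + 1
    # Assumption: a trailing incomplete group is simply dropped in this toy.
    return w, updates


samples = [(float(x), 3.0 * x) for x in range(1, 9)]  # 8 samples, true w = 3

# 4 mini-batches of 2 samples, accumulated into a single update ...
w_accum, n_accum = train(samples, batch_size=2, accum_count=4)
# ... versus one real batch of 8 samples.
w_big, n_big = train(samples, batch_size=8, accum_count=1)
```

In this toy, both runs make exactly one update and end at the same w, which is what I mean by "working as if" the batch had batch_size * accum_count samples.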
In the documentation it says that accum_steps are:
"Steps at which accum_count values change"
Why would we change the accum_count, and at what kind of step?
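If I read that description literally, accum_count and accum_steps would be two paired lists, where the i-th accum_count value takes effect at the i-th step. Below is a sketch of that reading only. The helper accum_count_at, the example values, and the assumption that "step" means a training (optimizer) step are all mine and not taken from the documentation, and the exact behaviour at a boundary step may differ.

```python
def accum_count_at(step, accum_count, accum_steps):
    """Return the accum_count in force at training step `step`.

    Assumption: accum_count[i] applies from accum_steps[i] onward,
    and accum_steps is sorted in ascending order.
    """
    current = accum_count[0]
    for value, start in zip(accum_count, accum_steps):
        if step >= start:
            current = value
    return current


# Hypothetical schedule: accumulate 2 mini-batches at first,
# 4 from step 1000 onward, and 8 from step 5000 onward.
accum_count = [2, 4, 8]
accum_steps = [0, 1000, 5000]

for step in (1, 999, 1500, 6000):
    print(step, accum_count_at(step, accum_count, accum_steps))
```

Under this reading the effective batch size would grow during training, but I do not know whether that is the intended use.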