If we train for N epochs of n minibatches of sentences, i.e. N x n minibatches in total,
I would like to be able to start training with a dropout schedule as follows:
(these are examples; we may generalize with a variable value for the percentages)
between 0% and 25% of the minibatches, dropout changes linearly from a value a(0%) to a(25%)
between 25% and 50% of the minibatches, dropout changes linearly from a(25%) to a(50%)
and likewise between 50% and 75%, then between 75% and 100%.
-dropout-schedule could be some kind of string describing such a schedule.
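To make this concrete, here is a minimal sketch (in Python, purely illustrative; the function name dropout_at and the breakpoint values are placeholders, not an existing option) of how such a piecewise-linear schedule could be evaluated from the fraction of minibatches seen so far:

```python
def dropout_at(progress, breakpoints):
    """Evaluate a piecewise-linear dropout schedule.

    progress    -- fraction of training completed, in [0, 1]
                   (current minibatch index / (N * n))
    breakpoints -- sorted list of (progress, dropout) pairs, e.g.
                   [(0.00, a0), (0.25, a25), (0.50, a50), (0.75, a75), (1.00, a100)]
    """
    # Clamp to the first/last value outside the defined range.
    if progress <= breakpoints[0][0]:
        return breakpoints[0][1]
    if progress >= breakpoints[-1][0]:
        return breakpoints[-1][1]
    # Find the segment containing `progress` and interpolate linearly within it.
    for (x0, y0), (x1, y1) in zip(breakpoints, breakpoints[1:]):
        if x0 <= progress <= x1:
            t = (progress - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
```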
Thanks Vincent for the detailed request. We want to implement a more general notion of a “training schedule” (I like the name) that we can use to drive all the parameters during training - typically the optim method and learning_rate, but also guided alignment, the number of parallel threads, boosting parameters, dropout, and any other parameter that can change during training. Also, we are considering dropping the notion of “epoch” and moving to “steps” as other systems do. I will put some specs together and come back to you.
nope, it did not imply anything; just linear interpolation between the two values for each segment.
here is my belief:
If we start with a high dropout, it’s too slow to converge.
But we need a higher dropout at some point, and then I think decreasing it again helps to get better performance.
So, for example, I would like to try this:
we start at 0, increase to 0.4, stay at 0.4 for some time, then decrease again down to X (maybe 0).
it seems to give some interesting results in another context.
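For instance, that warm-up / hold / decay schedule could be written as breakpoints and evaluated with plain linear interpolation (numpy.interp here; the 30% / 70% turning points are just an illustration, not values from the request):

```python
import numpy as np

# Hypothetical breakpoints: warm up from 0 to 0.4 over the first 30% of
# training, hold at 0.4 until 70%, then decay back toward 0 (X = 0) by the end.
xp = [0.0, 0.3, 0.7, 1.0]   # fraction of N * n minibatches seen
fp = [0.0, 0.4, 0.4, 0.0]   # dropout value at each breakpoint

progress = 0.85             # e.g. 85% of training done
print(np.interp(progress, xp, fp))  # halfway through the decay, ~0.2
```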