Reduction in size of checkpoints: `--train_from` arg

A5U7 · August 10, 2020, 6:17am

I have been noticing a reduction in size of new checkpoints when retraining on a new dataset from a saved (custom) pretrained version.

Pretrained weight:

Here the size drops from 108M to just 36M.
Below is the train.py run with the --train_from arg set to the 108M file.

Also, while it’s beside the point, there’s no change in the dictionary, as I am using a char-level model.

guillaumekln · August 12, 2020, 6:45am

Did you change the optimization method?

A5U7 · August 12, 2020, 10:07am

I have switched from Adam to SGD. Could that be the reason? The new checkpoint is about one third the size of the previous one.

guillaumekln · August 12, 2020, 10:18am

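Yes, that’s the reason. Adam keeps additional parameters for each model weight (running estimates of the gradient’s first and second moments), and these are saved in the checkpoint along with the model.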
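
For anyone who wants to check the arithmetic, here is a rough sketch of why the ratio comes out to about 3x. It assumes plain SGD without momentum (no per-weight state) and standard Adam (two extra values per weight), and it ignores the vocabulary and other metadata stored in the checkpoint. The function name and the slot table are made up for illustration, not part of train.py.

```python
# Rough checkpoint-size estimate: model weights plus per-weight optimizer state.
# The slot counts below are assumptions for the plain variant of each optimizer.
STATE_SLOTS_PER_WEIGHT = {
    "adam": 2,  # running first- and second-moment estimates
    "sgd": 0,   # plain SGD without momentum keeps no per-weight state
}


def estimated_checkpoint_mb(weights_mb, optimizer):
    """Return the approximate checkpoint size in MB for a given optimizer."""
    return weights_mb * (1 + STATE_SLOTS_PER_WEIGHT[optimizer])


weights_mb = 36  # size of the SGD checkpoint reported in this thread
adam_mb = estimated_checkpoint_mb(weights_mb, "adam")
sgd_mb = estimated_checkpoint_mb(weights_mb, "sgd")
print(adam_mb, sgd_mb)  # 108 36
```

This matches the 108M and 36M files above. SGD with momentum would keep one extra buffer per weight, so a checkpoint saved that way would be expected to land somewhere in between.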