It’s not difficult to expand a model from, e.g., layers = 2, rnn_size = 512 to layers = 4, rnn_size = 1024.
- print all the parameters of each encoder/decoder under both configurations, so you can see how the shapes differ.
- load the previous model
- create a new encoder/decoder with the larger configuration
- for each matrix, copy the corresponding parameters or expand them to the new size.
- train with the newly generated encoder/decoder
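The copy/expand step could look roughly like the sketch below. It is plain Python on nested lists, not tied to any framework, and `expand_matrix`, `expand_model`, the parameter names and the init scale are all made up for illustration. It also assumes the old weights simply go in the top-left block of the new matrix; real LSTM weights usually pack several gates into one matrix, so there each gate block would probably need to be placed separately.

```python
import random


def expand_matrix(old, new_rows, new_cols, rng, init_scale=0.01):
    """Return a new_rows x new_cols matrix whose top-left block is `old`.

    Entries outside the old block get small random values (an assumption,
    any reasonable initialisation could be used instead).
    """
    old_rows = len(old)
    old_cols = len(old[0]) if old else 0
    if new_rows < old_rows or new_cols < old_cols:
        raise ValueError("target shape must be at least as large as the source")
    new = [[rng.uniform(-init_scale, init_scale) for _ in range(new_cols)]
           for _ in range(new_rows)]
    for i in range(old_rows):
        new[i][:old_cols] = old[i]  # copy the trained row into the new one
    return new


def expand_model(old_params, new_shapes, seed=0):
    """Build parameters for the bigger model.

    old_params: dict name -> matrix (list of rows) from the trained model.
    new_shapes: dict name -> (rows, cols) for the new encoder/decoder.
    Matrices that do not exist in the old model (e.g. extra layers)
    are initialised from scratch.
    """
    rng = random.Random(seed)
    return {name: expand_matrix(old_params.get(name, []), rows, cols, rng)
            for name, (rows, cols) in new_shapes.items()}


# Hypothetical toy example: one 2x2 matrix grown to 4x4, plus a brand-new layer.
old = {"layer1.W": [[1.0, 2.0], [3.0, 4.0]]}
new = expand_model(old, {"layer1.W": (4, 4), "layer2.W": (4, 4)})
```

After this, `new["layer1.W"]` keeps the trained values in its top-left 2x2 corner, and `new["layer2.W"]` is entirely fresh.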
BTW, personally, I don’t think it is a good idea to jump directly from 2x512 to 4x1024.
A 4x1024 network is really big. Maybe you can go from 2x512 to 4x512 first, for instance.
That will be much easier.
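To get a rough feel for the size difference, here is a back-of-the-envelope count of the recurrent weights only. It assumes standard LSTM layers with four gates and one bias vector per gate, and an input size equal to rnn_size for every layer; embeddings, attention and the output layer are ignored, so the real totals will be larger. `lstm_stack_params` is just an illustrative helper.

```python
def lstm_stack_params(layers, rnn_size):
    """Approximate weight count of a stack of LSTM layers.

    Each layer has 4 gates, each with an (input + hidden) x hidden weight
    matrix and a bias of size hidden. Input size is assumed to equal
    rnn_size for every layer.
    """
    per_layer = 4 * (rnn_size * (rnn_size + rnn_size) + rnn_size)
    return layers * per_layer


for layers, rnn_size in [(2, 512), (4, 512), (4, 1024)]:
    print(layers, rnn_size, lstm_stack_params(layers, rnn_size))
```

Under these assumptions, doubling the layers doubles the count, while doubling rnn_size roughly quadruples it, so 4x1024 ends up about eight times the size of 2x512.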