Is it normal for the preprocessed data file (~54 GB in my case) to be loaded into CPU memory? During training, my RAM is 100% full and the swap eventually fills up too, slowing down the system. I am assuming that the data.t7 file is loaded into RAM. Should that happen?
Yes. Without the dynamic dataset, the whole .t7 file is loaded into CPU memory, so it will be an issue if you don't have enough RAM. A good workaround is the dynamic dataset.
I have encountered the same issue for speech recognition. I am thinking of another workaround: split the preprocessed data into several parts and train on one part at a time, stopping and resuming training between parts in each epoch. Loading each part separately will cost additional time. Also, we will need to plug in the full vocabulary, because preprocessing each part on its own would otherwise produce different vocabulary sizes. I am working on this workaround, but it looks like it will take some time. I will let you know if I get solid results.
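For anyone who wants to try the same idea, here is a rough sketch of the split-data workaround in plain Python. It is not OpenNMT code and does not read .t7 files; the names `build_full_vocab`, `encode`, and `iter_epoch`, and the idea of passing each part as a loader function, are all assumptions made for illustration. It only shows the two points above: the vocabulary is built once over all parts, and only one part is held in memory at a time during an epoch.

```python
from collections import Counter
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

Sentence = List[str]
# Hypothetical: each loader reads one part of the data from disk when called.
ShardLoader = Callable[[], Sequence[Sentence]]


def build_full_vocab(shard_loaders: Sequence[ShardLoader],
                     specials: Sequence[str] = ("<unk>",)) -> Dict[str, int]:
    """Count tokens over every part, loading one part at a time."""
    counts: Counter = Counter()
    for load in shard_loaders:
        shard = load()
        for sent in shard:
            counts.update(sent)
        del shard  # drop the reference before loading the next part
    vocab = {tok: i for i, tok in enumerate(specials)}
    # Most frequent first; ties broken alphabetically so ids are deterministic.
    for tok, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def encode(sent: Sentence, vocab: Dict[str, int]) -> List[int]:
    """Map tokens to ids with the shared vocabulary."""
    unk = vocab["<unk>"]
    return [vocab.get(tok, unk) for tok in sent]


def iter_epoch(shard_loaders: Sequence[ShardLoader],
               vocab: Dict[str, int]) -> Iterator[Tuple[int, List[int]]]:
    """Yield (part index, encoded sentence) for one pass over all parts."""
    for shard_id, load in enumerate(shard_loaders):
        shard = load()  # this is where the extra loading time is paid
        for sent in shard:
            yield shard_id, encode(sent, vocab)
        del shard
```

In a real run each loader would read one preprocessed part from disk, and the training loop would save a checkpoint when the part index changes, which roughly corresponds to the stop/resume step. Peak memory is then bounded by the largest part plus the vocabulary rather than by the whole data set.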