I’m training a model on 4M sentences. The GPU memory usage, as reported by nvidia-smi, is just a little above 2,000 MB. A few weeks ago, I trained a model on 2M sentences and the memory usage was the same. Is this normal? I’m using a GTX 1080 with 8 GB of memory.
While in theory you can train on any machine, in practice, for all but trivially small data sets, you will need a GPU that supports CUDA if you want training to finish in a reasonable amount of time. For medium-size models you will need at least 4 GB of GPU memory; for full-size state-of-the-art models, 8-12 GB is recommended.
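As for the identical readings: trainers typically stream the corpus in fixed-size batches, so GPU memory usage is driven by the model size and batch size rather than by the number of sentences, and seeing the same ~2 GB for 2M and 4M sentences is expected. If your trainer is PyTorch-based (an assumption here), a minimal sketch for logging memory from inside the training loop:

```python
import torch

def log_gpu_memory(tag: str = "") -> None:
    """Print current GPU memory statistics for the default CUDA device."""
    # memory_allocated(): bytes actually held by live tensors.
    # memory_reserved():  bytes held by PyTorch's caching allocator;
    # nvidia-smi reports roughly this plus the CUDA context overhead,
    # so it usually reads higher than the allocated figure.
    mb = 1024 ** 2
    print(f"{tag} allocated={torch.cuda.memory_allocated() / mb:.0f} MB "
          f"reserved={torch.cuda.memory_reserved() / mb:.0f} MB")

# Example: call every N steps inside the training loop, e.g.
# log_gpu_memory(tag="step 1000")
```

Calling this before and after a forward/backward pass shows that the footprint is dominated by parameters, activations, and optimizer state, none of which grow with corpus size.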
If it does, then the documentation should state that explicitly, because as written it only seems to refer to GPU memory.