I’m training a model with 4M sentences. The GPU memory usage, as reported by nvidia-smi, is just a little above 2,000 MB. A few weeks ago, I trained a model with 2M sentences and the memory usage was the same. Is this normal? I’m using a GTX 1080 with 8 GB of memory.
The GPU memory usage does not depend on the dataset size. However, the memory usage in the main RAM does depend on the number of sentences.
Thanks for your reply @guillaumekln. So, does the quote below from the opennmt.net FAQ refer to system RAM?
> While in theory you can train on any machine, in practice for all but trivially small data sets you will need a GPU that supports CUDA if you want training to finish in a reasonable amount of time. For medium-size models you will need at least 4GB; for full-size state-of-the-art models 8-12GB is recommended.
If it does, the FAQ should be a bit clearer, because as written it seems to refer to GPU memory.
This statement is correct as it mentions “model size” and not “data size”. The model size is affected by many parameters, including:
- the type of model (e.g. with a bidirectional encoder the model is bigger)
- the size of the vocabulary
- the number of layers
- the size of the hidden dimension
Actual GPU memory usage is also affected by the batch size and the maximum sequence length.
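To see why the model size, and not the dataset size, drives GPU memory, here is a rough back-of-the-envelope parameter count for an LSTM encoder-decoder. This is an illustrative sketch only: the formula ignores biases, attention, and other components of a real OpenNMT model, and the example vocabulary/hidden/layer values are assumptions, not defaults from the thread. Note that the number of training sentences never appears in the calculation.

```python
def approx_model_params(vocab_size, hidden_size, num_layers,
                        bidirectional_encoder=True):
    # Rough LSTM encoder-decoder parameter count (illustrative only;
    # omits biases, attention, and other layers a real model has).
    embedding = 2 * vocab_size * hidden_size          # source + target embeddings
    lstm_layer = 4 * (2 * hidden_size * hidden_size)  # 4 gates, input + recurrent weights
    encoder = num_layers * lstm_layer * (2 if bidirectional_encoder else 1)
    decoder = num_layers * lstm_layer
    generator = vocab_size * hidden_size              # output projection to the vocabulary
    return embedding + encoder + decoder + generator

# Assumed example values; with float32 weights, bytes ~= params * 4.
# Training needs several times more for gradients and optimizer state.
params = approx_model_params(vocab_size=50_000, hidden_size=500, num_layers=2)
print(f"~{params / 1e6:.1f}M parameters, ~{params * 4 / 1e6:.0f} MB of weights (fp32)")
```

Batch size and sequence length then add activation memory on top of the weights, which is why two runs with the same model but different batch settings can report different usage in nvidia-smi.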
Thank you for the clarification.