Hi, I'm trying to improve my project by training on a larger vocabulary (100k) with the TransformerBigRelative model, but I get an OOM error even on a 16 GB P100.
The error message suggests adding TF_GPU_ALLOCATOR=cuda_malloc_async to the environment variables, so I tried that in Colab as:
os.environ['TF_GPU_ALLOCATOR'] = 'cuda_malloc_async'
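One thing worth checking: TensorFlow only reads TF_GPU_ALLOCATOR when it initializes the GPU, so setting the variable after TensorFlow has already been imported in the notebook has no effect. A minimal sketch of the ordering that should work (in Colab you may also need to restart the runtime first so TensorFlow isn't already loaded):

```python
import os

# Must be set BEFORE TensorFlow is imported for the first time,
# because the allocator is chosen at GPU initialization.
os.environ["TF_GPU_ALLOCATOR"] = "cuda_malloc_async"

import tensorflow as tf  # import only after the variable is set
print(tf.__version__)
```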
I also tried:
import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
But it still throws the same error message.
I see the batch size is configured as 3072 tokens, which I guess assumes multi-GPU? I will try reducing it, but maybe it is better to batch by examples instead, say 32 samples, so it doesn't cut off in the middle of a sentence?
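For what it's worth, with token-based batching whole sentences are still kept intact; the token count is just a budget for how many complete examples go into each batch, so nothing is cut mid-sentence. If this is an OpenNMT-tf run, switching to example-based batching would look roughly like this in the training YAML (a sketch, assuming the standard OpenNMT-tf config layout):

```yaml
train:
  # assumption: OpenNMT-tf YAML schema; switches from token-budget
  # batching to a fixed number of examples per batch
  batch_type: examples
  batch_size: 32
```

Reducing batch_size while keeping batch_type: tokens should also relieve the OOM, since it directly caps how much activation memory each step uses.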