CUDA Out of memory with sentencepiece

Hi all,

I am running into a problem using SentencePiece with the following configuration:

# General opts
data:
    corpus_0:
        path_src: data/train.src
        path_tgt: data/train.trg
        transforms: [sentencepiece]
    valid:
        path_src: data/val.src
        path_tgt: data/val.trg
        transforms: [sentencepiece]

### Transform related opts:
#### Subword
share_vocab: true
src_vocab: data/spm_model.bpe.vocab

src_subword_model: data/spm_model.bpe.model
src_subword_nbest: 1
src_subword_alpha: 0.0
#### Filter
world_size: 1
gpu_ranks: [0]

The SentencePiece model and vocab are built with: spm.SentencePieceTrainer.train(input=f'{dataDir}/train.src-trg', model_prefix=f'{dataDir}/spm_model.bpe', vocab_size=50000, model_type='bpe')
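For reference, a self-contained version of that training call could look roughly like the following sketch (data_dir is assumed to be data/, matching the paths in the config above):

import sentencepiece as spm

data_dir = 'data'  # assumed location, matching the paths in the config above

# Train a 50k-piece BPE model on the concatenated source/target training data.
# This writes data/spm_model.bpe.model and data/spm_model.bpe.vocab, which the
# config references via src_subword_model and src_vocab.
spm.SentencePieceTrainer.train(
    input=f'{data_dir}/train.src-trg',
    model_prefix=f'{data_dir}/spm_model.bpe',
    vocab_size=50000,
    model_type='bpe',
)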

If the global attention is not set to mlp, it runs for a bit and then fails.

Could it have something to do with loading both the SentencePiece model and the LSTM on the GPU?

Any suggestions would be much appreciated.

Thanks in advance,
Dimitar

The issue is probably not with SentencePiece itself.
You may just have sequences that are too long to fit in memory. You could try the following:

  • use batch_type tokens
  • add the filtertoolong transform with the src_seq_length / tgt_seq_length args (sketched below).
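
For instance, a rough sketch of how those options could look in your config (the 4096-token batch size is just an example value to adjust to your GPU memory; the 150-token length caps are the values you mention trying):

data:
    corpus_0:
        path_src: data/train.src
        path_tgt: data/train.trg
        # add filtertoolong to the training transform pipeline
        transforms: [sentencepiece, filtertoolong]

#### Filter
src_seq_length: 150
tgt_seq_length: 150

# count batch_size in tokens rather than sentences
batch_type: tokens
batch_size: 4096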

I tried different combinations: src/trg sequence length set to 150 as well as to 50, batch size reduced to 64 (batch_type sents), and an LSTM size of 256.
The result was always the same: CUDA out of memory.

If I do not specify transforms: [sentencepiece], that is, with transforms: [], then it works fine.

I will try setting batch_type to tokens.

Thanks for the quick reply,
Cheers,
Dimitar