Error in embedding sequence: SequenceTooLongError

JoyeBright · June 3, 2021, 10:14pm

Hi everyone,

I was training a model with around 31 million sentences employing 3 GPU devices. After step 4200, it got stuck and threw the following error:

onmt.modules.embeddings.SequenceTooLongError: Sequence is 11892 but PositionalEncoding is limited to 5000. See max_len argument.

There is a strange point as I have successfully trained this model with the same configuration, and I only changed batch size from 2048 to 512. Below you can find the train configuration. I was wondering if anyone could pinpoint the problem and a bit explain why this might happen, please?

share_vocab: false
src_vocab: ""
tgt_vocab: ""
save_model: ""
save_checkpoint_steps: 1000
keep_checkpoint: 200
seed: 3435
train_steps: 200000
valid_steps: 1000
warmup_steps: 8000
report_every: 100
decoder_type: transformer
encoder_type: transformer
word_vec_size: 512
rnn_size: 512
layers: 6
transformer_ff: 2048
heads: 8
accum_count: 8
optim: adam
adam_beta1: 0.9
adam_beta2: 0.998
decay_method: noam
learning_rate: 2.0
max_grad_norm: 0.0
batch_size: 512
valid_batch_size: 512 
batch_type: tokens
normalization: tokens
dropout: 0.1
label_smoothing: 0.1
early_stopping: 10
max_generator_batches: 2
param_init: 0.0
param_init_glorot: "true"
position_encoding: "true"
tensorboard: True
tensorboard_log_dir: "/"
log_file: ""
log_file_level: "INFO"
skip_empty_level: "silent"

francoishernandez · June 4, 2021, 8:44am

This is a known issue. Positional encoding of Transformer is limited to 5000 tokens by default.
The error means that you have a sequence longer than 5000 tokens.
You might want to use the filtertoolong transform to set a maximum sequence length.

Also, you could use max_relative_positions instead of positional encoding.

JoyeBright · June 5, 2021, 6:47pm

Thanks for explaining the issue. Any idea why it was trained previously without problem?