I have been training a Transformer model on OpenNMT-tf, but the loss does not seem to decrease even after almost 316,000 steps, and the BLEU score is not improving either. I am using a dataset containing almost 3.5M training pairs.
Can you post the full training log?
I am using Google Colab and I don't have any log files.
Can you provide more information then? The command line you used, the training and model configurations, the current loss and BLEU values, etc.
- Command line: `!onmt-main --model_type Transformer --config data.yaml --auto_config train --with_eval`
- Training config:

```yaml
train:
  save_checkpoints_steps: 1000
  maximum_features_length: 50
  maximum_labels_length: 50
  batch_size: 4096

eval:
  external_evaluators: BLEU

params:
  dropout: 0.3
  average_loss_in_time: true

infer:
  batch_size: 32
```
“”" - Model:I am using the transformer model from opennmt documentation
- Current loss: 2.26, BLEU score: 10.4
I am running validation every 5000 steps, but the loss is not converging.
I suggest removing the dropout parameter from your configuration. The default dropout values should work well.
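For reference, the `params` block would then look like this. This is a sketch assuming you keep `--auto_config`, which should then supply the model's default dropout (worth verifying against your OpenNMT-tf version):

```yaml
# params block with the explicit dropout removed;
# with --auto_config, the Transformer's default dropout is used instead.
params:
  average_loss_in_time: true
```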
Ok, I will try doing that.
One question though: is that the reason why the loss is not converging?
It’s one possible reason.