Increasing evaluation loss but BLEU score also increasing

Hi, I am seeing very strange behavior with a bidirectional RNN network. I trained my translation model on 100k sentence pairs, and after some steps the evaluation loss starts increasing.

However, when I test the model on my test dataset with the checkpoint from step 40k, I get a BLEU score of 16.54. Doing the same with the checkpoint from step 100k, I get a BLEU of 18.25.

To my knowledge, an increasing evaluation loss means the model is overfitting and should perform worse when it meets new sentences. Yet in this experiment the model performs even better although the evaluation loss keeps increasing.

Is this behavior normal?

Here is the config file for my experiment:

model_dir: model

data:
  eval_features_file: cv.dev.cn
  eval_labels_file: cv.dev.vn
  source_words_vocabulary: src-vocab.txt
  target_words_vocabulary: tgt-vocab.txt
  train_features_file: cv.train.cn
  train_labels_file: cv.train.vn
eval:
  batch_size: 32
  eval_delay: 0
  exporters: last
infer:
  batch_size: 32
  bucket_width: 0
params:
  average_loss_in_time: true
  beam_width: 5
  learning_rate: 0.001
  clip_gradients: 5
  label_smoothing: 0.1
  length_penalty: 0
  gradients_accum: 1
  optimizer: AdamOptimizer
  optimizer_params:
    beta1: 0.9
    beta2: 0.999
  decay_type: exponential_decay
  decay_params:
    decay_rate: 0.5
    decay_steps: 10000
    staircase: true
  decay_step_duration: 1
  start_decay_steps: 50000
score:
  batch_size: 64
train:
  average_last_checkpoints: 8
  batch_size: 4096
  batch_type: tokens
  keep_checkpoint_max: 8
  maximum_features_length: 50
  maximum_labels_length: 50
  sample_buffer_size: -1
  save_checkpoints_steps: 5000
  save_summary_steps: 100
  train_steps: 200000
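
For reference, my understanding of the decay params is that the learning rate stays at 0.001 until step 50000 and is then halved every 10000 steps (staircase). A quick plain-Python sketch of that schedule (this is my reading of the params, not OpenNMT-tf's actual implementation):

def learning_rate(step, initial_lr=0.001, decay_rate=0.5,
                  decay_steps=10000, start_decay_steps=50000):
    # Constant learning rate until the decay starts.
    if step < start_decay_steps:
        return initial_lr
    # staircase: true -> decay in discrete jumps, once per completed period.
    num_decays = (step - start_decay_steps) // decay_steps
    return initial_lr * decay_rate ** num_decays

print(learning_rate(40000))   # 0.001 (checkpoint with BLEU 16.54)
print(learning_rate(100000))  # 0.001 * 0.5**5 = 3.125e-05 (BLEU 18.25)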

And here is my model definition:

import tensorflow as tf
import opennmt as onmt

def model():
  return onmt.models.SequenceToSequence(
      source_inputter=onmt.inputters.WordEmbedder(
          vocabulary_file_key="source_words_vocabulary",
          embedding_size=512),
      target_inputter=onmt.inputters.WordEmbedder(
          vocabulary_file_key="target_words_vocabulary",
          embedding_size=512),
      # 2-layer bidirectional LSTM encoder; forward and backward
      # states are concatenated by the reducer.
      encoder=onmt.encoders.BidirectionalRNNEncoder(
          num_layers=2,
          num_units=512,
          reducer=onmt.layers.ConcatReducer(),
          cell_class=tf.nn.rnn_cell.LSTMCell,
          dropout=0.2,
          residual_connections=False),
      # 2-layer LSTM decoder with Luong attention; the CopyBridge
      # passes the encoder state to the decoder unchanged.
      decoder=onmt.decoders.AttentionalRNNDecoder(
          num_layers=2,
          num_units=512,
          bridge=onmt.layers.CopyBridge(),
          attention_mechanism_class=tf.contrib.seq2seq.LuongAttention,
          cell_class=tf.nn.rnn_cell.LSTMCell,  # same cell class as the encoder
          dropout=0.2,
          residual_connections=False))
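
(For completeness: a custom model file like this is passed to the trainer with the standard OpenNMT-tf 1.x command; config.yml and model.py below are placeholders for my actual file names.)

onmt-main train_and_eval --model model.py --config config.yml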

Hi,

There is no direct relation between the loss and BLEU, so there is nothing abnormal in this observation. The evaluation loss is a token-level cross entropy computed with teacher forcing against the reference, while BLEU is computed on the complete beam search outputs, so the two metrics can move in opposite directions.
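
Here is a toy illustration with made-up numbers (plain Python, using unigram match as a crude stand-in for BLEU): a checkpoint that becomes confident but wrong on a few reference tokens can have a higher cross entropy even though its decoded outputs overlap more with the references.

import math

# Per-token probabilities the model assigns to the *reference* tokens
# (what the eval loss sees, via teacher forcing) versus the tokens the
# model actually decodes (what BLEU sees). All numbers are made up.

ref = ["the", "cat", "sat"]

# Earlier checkpoint: moderately unsure everywhere.
p_ref_early = [0.4, 0.4, 0.4]
out_early = ["a", "dog", "sat"]       # hypothetical decoded output

# Later checkpoint: confident and right on two tokens, confident and
# wrong on one. The single bad token dominates the cross entropy.
p_ref_late = [0.9, 0.9, 0.01]
out_late = ["the", "cat", "slept"]    # hypothetical decoded output

def mean_nll(probs):
    # Average negative log-likelihood of the reference tokens.
    return sum(-math.log(p) for p in probs) / len(probs)

def unigram_match(out):
    # Fraction of positions where the output token equals the reference.
    return sum(o == r for o, r in zip(out, ref)) / len(ref)

print(mean_nll(p_ref_early), unigram_match(out_early))  # ~0.92 loss, 0.33 match
print(mean_nll(p_ref_late), unigram_match(out_late))    # ~1.61 loss, 0.67 match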