I am using the following relevant settings:
Now, according to the documentation, the learning rate will be decayed if (i) perplexity does not decrease on the validation set or (ii) steps have gone past start_decay_steps. Option (ii) does seem to work. Option (i) does not: I print the validation perplexity every 1000 steps and notice a (sharp) increase, but the learning rate does not decrease. How do I fix this?
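
To make the behavior I expect concrete, here is a minimal sketch of the decay rule as I read it from the documentation. It is not the library's actual implementation: the function name, the `decay_factor` parameter, and the example values are hypothetical and only illustrate the two conditions.

```python
def expected_learning_rate(lr, step, val_ppl, prev_val_ppl,
                           start_decay_steps, decay_factor=0.5):
    """Return the learning rate I would expect after one validation check.

    Hypothetical reading of the documented rule: decay when
    (i) validation perplexity did not decrease, or
    (ii) training has gone past start_decay_steps.
    """
    # Condition (i): perplexity stayed the same or got worse.
    ppl_not_improved = prev_val_ppl is not None and val_ppl >= prev_val_ppl
    # Condition (ii): the step counter is past the configured threshold.
    past_start = step > start_decay_steps

    if ppl_not_improved or past_start:
        return lr * decay_factor
    return lr


# Perplexity rose from 40 to 50 well before start_decay_steps:
# under my reading, condition (i) alone should trigger a decay.
new_lr = expected_learning_rate(lr=1.0, step=1000, val_ppl=50.0,
                                prev_val_ppl=40.0, start_decay_steps=50000)
print(new_lr)
```

In my runs, the case in the example (perplexity going up before start_decay_steps) is exactly where the learning rate stays unchanged.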