Minimum Risk Training for NMT

Hi all,

First of all, thanks for your work on this project. I would like to ask whether you have any plans to implement minimum risk training for neural machine translation. I believe it would be very helpful toward the end of training, when perplexity and BLEU no longer correlate.

Honestly, I don’t fully understand the backward pass for this algorithm, so implementing it in PyTorch with autograd might be easier than in Torch.
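
To make the backward pass concrete, here is a minimal sketch in plain Python, assuming the usual formulation of minimum risk training: the expected cost over a small set of sampled candidate translations, weighted by a renormalized and sharpened model distribution. The function name `mrt_risk_and_grad`, the `alpha` sharpness parameter and its default, and the use of a per-candidate cost such as 1 minus sentence-level BLEU are all assumptions made for illustration, not part of this project. The gradient is written out by hand here; with autograd only the forward computation would be needed.

```python
import math


def mrt_risk_and_grad(log_probs, costs, alpha=1.0):
    """Expected risk over a candidate set and its gradient (hypothetical helper).

    log_probs: model log-probabilities log P(y_i | x), one per sampled candidate.
    costs:     cost of each candidate against the reference, e.g. 1 - sentence BLEU.
    alpha:     sharpness of the renormalized distribution (assumed hyperparameter).
    """
    # Q_i is proportional to P(y_i | x) ** alpha, i.e. a softmax over alpha * log_probs.
    scaled = [alpha * lp for lp in log_probs]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    q = [e / total for e in exps]

    # Forward pass: the risk is the expected cost under Q.
    risk = sum(qi * ci for qi, ci in zip(q, costs))

    # Backward pass: d(risk) / d(log_probs[i]) = alpha * Q_i * (cost_i - risk).
    grad = [alpha * qi * (ci - risk) for qi, ci in zip(q, costs)]
    return risk, grad


# Example call with three sampled candidates.
risk, grad = mrt_risk_and_grad([-2.0, -3.5, -1.2], [0.1, 0.7, 0.4], alpha=0.5)
```

Under these assumptions, candidates with a cost below the current expected risk get a negative gradient, so minimizing the risk raises their log-probability, and candidates with a higher cost are pushed down. From each candidate's log-probability, the gradient flows back through the network exactly as in ordinary likelihood training.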