Is coverage loss supported?

I see that one of the training options is a coverage loss (lambda_coverage), based on the See et al. (2017) paper. However, I can't find where it is implemented, and setting the option makes no difference to my training. I am using the transformer model.
Is it currently supported?
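
For reference, this is what I understand the coverage loss in See et al. (2017) to compute: at each decoder step, the sum over source positions of min(attention, coverage), where coverage is the running sum of attention from earlier steps, with the total scaled by lambda_coverage. Below is a minimal pure-Python sketch of that idea. The function name and the list-based inputs are hypothetical, chosen for illustration only; this is not the toolkit's code.

```python
def coverage_loss(attention_steps, lambda_coverage=1.0):
    """Coverage loss as I read it from See et al. (2017).

    attention_steps: one attention distribution per decoder step,
    each a list of weights over the source positions (hypothetical input format).
    """
    if not attention_steps:
        return 0.0

    src_len = len(attention_steps[0])
    coverage = [0.0] * src_len  # attention accumulated over previous steps
    total = 0.0

    for attn in attention_steps:
        # Penalize attending again to positions that are already covered.
        total += sum(min(a, c) for a, c in zip(attn, coverage))
        # Update coverage only after computing this step's penalty.
        coverage = [c + a for a, c in zip(attn, coverage)]

    return lambda_coverage * total


if __name__ == "__main__":
    steps = [[0.5, 0.5], [0.9, 0.1]]
    print(coverage_loss(steps, lambda_coverage=1.0))
```

If the option were active, I would expect a term like this to be added to the training loss, so a nonzero lambda_coverage should change the reported loss.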

@pltrdy Could you comment on that? Thanks.