Improve BLEU by Coverage and Context Gate

I found that using a coverage mechanism significantly improves a standard attention-based NMT system by +1.8 BLEU, and that incorporating a context gate gives a further improvement of +1.6 BLEU (i.e., +3.4 BLEU in total).
Is there any plan to implement these functions?

Hello, which coverage mechanism are you referring to? For reference, I did implement different coverage mechanisms, available here: - and found significant gains on small datasets (a few million sentences), but the effect wears off on larger datasets (10M+ sentences).
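For readers unfamiliar with the idea: a coverage mechanism keeps a running total of the attention mass each source position has received, and feeds that total back into the attention scoring so the model is discouraged from attending to the same words repeatedly. Here is a minimal NumPy sketch of that accumulation loop; all weight names, shapes, and the fixed decoder state are illustrative, not the actual implementation discussed in this thread.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
src_len, hid = 5, 4
H = rng.normal(size=(src_len, hid))   # encoder states, one row per source word
W_h = rng.normal(size=(hid, hid))     # illustrative projection weights
W_s = rng.normal(size=(hid, hid))
w_c = rng.normal(size=(hid,))         # projection of the scalar coverage value
v = rng.normal(size=(hid,))

coverage = np.zeros(src_len)          # accumulated attention per source position
s = rng.normal(size=hid)              # decoder state (kept fixed here for brevity)

for t in range(3):
    # attention energy per source position, with coverage fed back in
    e = np.array([v @ np.tanh(W_h.T @ H[i] + W_s.T @ s + w_c * coverage[i])
                  for i in range(src_len)])
    a = softmax(e)
    coverage += a                     # accumulate attention mass
    context = a @ H                   # attended source context for this step
```

Because each step's attention weights sum to 1, the coverage vector sums to the number of decoding steps taken, which is exactly the "how much of the source has been translated" signal the mechanism exploits.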

For context gate, we are looking at it…


Jean, will you merge this PR or not?

FYI, context gate is implemented in OpenNMT-py:


Thanks! I will integrate the context gates too, since they are pretty straightforward, and I will also look at the differences between the coverage implementations before merging the PR.
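To illustrate why the context gate is straightforward to add: it is a sigmoid gate computed from the previous decoder state, the previous target embedding, and the attention context, which then interpolates per-dimension between source-side and target-side information. A minimal NumPy sketch follows, presumably matching the gate described in the paper quoted above; the weight names and the "gate both sides" variant shown here are assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
hid = 4
s_prev = rng.normal(size=hid)        # previous decoder hidden state
y_prev = rng.normal(size=hid)        # embedding of the previous target word
c_t = rng.normal(size=hid)           # source context from attention

W_s = rng.normal(size=(hid, hid))    # illustrative gate weights
W_y = rng.normal(size=(hid, hid))
W_c = rng.normal(size=(hid, hid))

# gate decides, per dimension, how much source vs. target information flows in
z = sigmoid(W_s @ s_prev + W_y @ y_prev + W_c @ c_t)

# "gate both" variant: scale the source context by z, the target side by (1 - z)
gated_source = z * c_t
gated_target = (1.0 - z) * s_prev
```

In a real decoder the two gated vectors would replace the ungated inputs to the next hidden-state computation; the only new parameters are the three gate matrices, which is why the change is cheap to bolt onto an existing attention decoder.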


I just tested the PR on a small system (2x500) with a small 500k corpus (fr-en).
I see almost zero improvement with nn10.
What was the task where you saw a significant improvement?

Chinese to English, on a 2M-sentence data set.