Improve BLEU with Coverage and Context Gates


(DucAnhLe) #1

I found that using a coverage mechanism significantly improves over a standard attention-based NMT system by +1.8 BLEU, and that incorporating a context gate yields a further improvement of +1.6 BLEU (i.e., +3.4 BLEU in total).
Is there any plan to implement these functions?


(jean.senellart) #2

Hello, which coverage mechanism are you referring to? For reference, I did implement different coverage mechanisms, available here: https://github.com/OpenNMT/OpenNMT/pull/174 - and found significant gains on small datasets (a few million sentences), but the effect wears off for larger datasets (10M+ sentences).
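For readers unfamiliar with the idea, here is a minimal PyTorch sketch of the simplest coverage variant (accumulating past attention weights and feeding them back into the attention score so the model can avoid over- or under-translating source words). The class and layer names are illustrative and do not correspond to the actual OpenNMT modules or the code in the PR.

```python
import torch
import torch.nn as nn

class CoverageAttention(nn.Module):
    """Additive attention with a simple coverage term (illustrative sketch).

    Coverage here is the running sum of past attention weights; feeding it
    into the score lets the model penalize already-attended source words.
    """
    def __init__(self, dim):
        super().__init__()
        self.w_query = nn.Linear(dim, dim, bias=False)
        self.w_key = nn.Linear(dim, dim, bias=False)
        self.w_cov = nn.Linear(1, dim, bias=False)  # project coverage into score space
        self.v = nn.Linear(dim, 1, bias=False)

    def forward(self, query, keys, coverage):
        # query: (batch, dim); keys: (batch, src_len, dim); coverage: (batch, src_len)
        score = self.v(torch.tanh(
            self.w_query(query).unsqueeze(1)
            + self.w_key(keys)
            + self.w_cov(coverage.unsqueeze(-1))
        )).squeeze(-1)                                  # (batch, src_len)
        align = torch.softmax(score, dim=-1)            # attention weights
        context = torch.bmm(align.unsqueeze(1), keys).squeeze(1)
        return context, align, coverage + align         # updated coverage

# One decoding step on random tensors, coverage starting at zero:
batch, src_len, dim = 2, 7, 16
attn = CoverageAttention(dim)
cov = torch.zeros(batch, src_len)
ctx, align, cov = attn(torch.randn(batch, dim), torch.randn(batch, src_len, dim), cov)
```

In a real decoder the returned coverage would be threaded through the loop over target time steps; some variants also add a coverage penalty to the training loss.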

As for the context gate, we are looking into it…


(Vincent Nguyen) #3

Jean, will you merge this PR or not?


(Guillaume Klein) #4

FYI, the context gate is implemented in OpenNMT-py.
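The idea behind the context gate (Tu et al., 2017) is a learned sigmoid gate that trades off source context against target-side state when forming the decoder output. Below is a hedged sketch of the "both" variant; the class and parameter names are made up for illustration and do not match OpenNMT-py's actual module.

```python
import torch
import torch.nn as nn

class ContextGate(nn.Module):
    """Illustrative context gate: z weights the source context, (1 - z)
    weights the target-side inputs (previous embedding + decoder state)."""
    def __init__(self, emb_dim, dec_dim, ctx_dim, out_dim):
        super().__init__()
        self.gate = nn.Linear(emb_dim + dec_dim + ctx_dim, out_dim)
        self.proj_source = nn.Linear(ctx_dim, out_dim)
        self.proj_target = nn.Linear(emb_dim + dec_dim, out_dim)

    def forward(self, prev_emb, dec_state, context):
        # z: (batch, out_dim), computed from all three inputs
        z = torch.sigmoid(self.gate(torch.cat([prev_emb, dec_state, context], dim=-1)))
        source = self.proj_source(context)
        target = self.proj_target(torch.cat([prev_emb, dec_state], dim=-1))
        # "both" variant: gate both sides in complementary fashion
        return torch.tanh(z * source + (1.0 - z) * target)

gate = ContextGate(emb_dim=8, dec_dim=16, ctx_dim=16, out_dim=16)
out = gate(torch.randn(2, 8), torch.randn(2, 16), torch.randn(2, 16))
```

The "source" and "target" variants apply the gate to only one side instead of splitting it complementarily.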


(jean.senellart) #5

Thanks! I will integrate the context gates too; it is pretty straightforward. I will also look at the differences between the coverage implementations before merging the PR.


(Vincent Nguyen) #6

Jean,
I just tested the PR on a small system (2x500) and a small corpus of 500k sentences (fr-en).
I see almost zero improvement with nn10.
What was the task where you saw a significant improvement?
Thanks.
Vincent


(jean.senellart) #7

Chinese to English, on a 2M-sentence dataset.