Wait-k decoding?

Hi all.
I’m wondering whether wait-k decoding is implemented in OpenNMT-py. It is widely used in simultaneous MT research because it is the simplest way to set an explicit trade-off between quality and latency.

If not, I would like to implement it, and I would like to know whether you are interested in merging it into the main branch.

I am also tagging this post with CTranslate2 because, if it works in OpenNMT-py, it would be great to have it run even faster.


To be clear, I would like to replicate the work in https://aclanthology.org/2021.emnlp-main.473.pdf, which uses wait-k decoding over full sentences to generate pseudo-labels, not just wait-k decoding of prefixes when the full sentence is not yet available.
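
To make the request concrete, here is a minimal sketch of the behavior I have in mind, assuming the usual wait-k schedule in which the t-th target token (1-indexed) is generated from the first min(k + t - 1, |x|) source tokens. All names here (`visible_source_len`, `wait_k_mask`, `wait_k_greedy`, `next_token`) are hypothetical and are not OpenNMT-py or CTranslate2 APIs; `next_token` stands in for whatever model call would produce the next token.

```python
from typing import Callable, List, Sequence


def visible_source_len(k: int, t: int, src_len: int) -> int:
    """Source tokens readable when emitting target token t (1-indexed)."""
    return min(k + t - 1, src_len)


def wait_k_mask(k: int, src_len: int, tgt_len: int) -> List[List[bool]]:
    """mask[t][j] is True if target step t + 1 may attend to source position j.

    With the full sentence available, such a mask could restrict
    cross-attention so that decoding mimics the streaming setting.
    """
    return [
        [j < visible_source_len(k, t + 1, src_len) for j in range(src_len)]
        for t in range(tgt_len)
    ]


def wait_k_greedy(
    src: Sequence[str],
    k: int,
    next_token: Callable[[Sequence[str], Sequence[str]], str],  # hypothetical model call
    eos: str = "</s>",
    max_len: int = 50,
) -> List[str]:
    """Greedy wait-k decoding over a full source sentence.

    At each step the model only sees the source prefix allowed by the
    schedule, even though the whole sentence is already known.
    """
    out: List[str] = []
    for t in range(1, max_len + 1):
        prefix = src[: visible_source_len(k, t, len(src))]
        tok = next_token(prefix, out)
        if tok == eos:
            break
        out.append(tok)
    return out
```

For example, with `k=2` and a 3-token source, the first target token would see 2 source tokens and every later one would see all 3. A real implementation would probably apply the mask inside the decoder's cross-attention rather than re-slicing the source at each step, but that depends on how the encoder is handled (unidirectional vs. re-encoding each prefix), which I have left out here.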