Implement Copy Mechanism

Hi All,

I am considering adding an extension to the current OpenNMT for my research on generation/summarization, such as the copy mechanism (https://arxiv.org/pdf/1603.06393.pdf). Is it easy to add on top of the current OpenNMT system (and if so, how?), or should I try another framework (Theano or TensorFlow)?

Thanks,

BTW, the current system works pretty well on my task compared with the TensorFlow seq2seq (without attention mechanism and beam search decoding). I wonder whether that's because of the already fine-tuned hyper-parameters and architecture of the OpenNMT system.

We personally like it a lot better than other frameworks :slight_smile: The TensorFlow seq2seq system is quite simple and is missing a lot of OpenNMT's features.

We don't have a copy mechanism implemented yet, but we would love to add one. There has been informal discussion about this paper and other similar copy papers.

Please feel free to join our chat channel (https://gitter.im/OpenNMT/openmt). Why don’t you read over the attention code here (http://opennmt.net/OpenNMT/code/modules/onmt+modules+GlobalAttention/) and propose a plan for how you would add a general purpose copy mechanism?

Maybe I could help with it after the ACL submission :stuck_out_tongue:


This would be great! Let us know if you need any help.

I think this has been implemented already.
See https://github.com/OpenNMT/OpenNMT-py/issues/245
Note that it's based on a different paper from the one linked in the original post:
https://arxiv.org/abs/1704.04368
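
For anyone landing here later, this is a rough sketch of the mixing step described in that paper: the final word distribution is `p_gen * P_vocab(w)` plus `(1 - p_gen)` times the attention mass on every source position holding `w`. It is plain Python for illustration only and is not the OpenNMT-py code; the function name, argument names, and the "extended vocabulary" id scheme for out-of-vocabulary source words are my own assumptions.

```python
def pointer_generator_distribution(p_vocab, attention, source_ids, p_gen, extended_size):
    """Mix a generation distribution with a copy distribution.

    Hypothetical helper, not part of any library.
    p_vocab:       probabilities over the fixed vocabulary (sums to 1)
    attention:     attention weights over source positions (sums to 1)
    source_ids:    id of each source token in the extended vocabulary;
                   source OOV words get temporary ids >= len(p_vocab)
    p_gen:         probability of generating rather than copying, in [0, 1]
    extended_size: fixed vocabulary size plus number of distinct source OOVs
    """
    if len(attention) != len(source_ids):
        raise ValueError("attention and source_ids must have the same length")
    if extended_size < len(p_vocab):
        raise ValueError("extended_size must cover the fixed vocabulary")

    # Generation part: scale the vocabulary distribution by p_gen.
    final = [0.0] * extended_size
    for word_id, prob in enumerate(p_vocab):
        final[word_id] = p_gen * prob

    # Copy part: add the attention mass of each source position to the
    # entry of the word found there, scaled by (1 - p_gen).
    for weight, word_id in zip(attention, source_ids):
        final[word_id] += (1.0 - p_gen) * weight

    return final


# Toy example: vocabulary of 3 words, source sentence of 2 tokens.
# The second source token is out of vocabulary and gets temporary id 3.
p_vocab = [0.5, 0.3, 0.2]
attention = [0.6, 0.4]
source_ids = [1, 3]
dist = pointer_generator_distribution(p_vocab, attention, source_ids, 0.8, 4)
print(dist)  # approximately [0.4, 0.36, 0.16, 0.08]
```

The OOV word ends up with non-zero probability (about 0.08) even though the fixed vocabulary cannot generate it, which is the point of copying. In a real model `p_gen` and `attention` would come from the decoder at each step.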