Sequence tagging

(jean.senellart) #1

Some OpenNMT users have mentioned using seq2seq models for tagging purposes, which is a simpler problem requiring just an encoder and a generator.

A first version is available here:

@Wabbit - do you want to try?

(Wabbit) #2

@jean.senellart sure, I’ll try it and provide feedback. However, why do you think attention might not be required for tagging? The last hidden layer of the encoder might not carry enough information, and additional access to the source-side hidden states might be required. See a recent NAACL paper, Neural Architectures for Named Entity Recognition, which feeds in the source hidden state from timestep t when decoding step t.

(jean.senellart) #3

Hi @Wabbit - I just saw your question. Indeed, we could also think about adding an attention module to a sequence tagger, or using the full seq2seq approach for tagging - but the sequence tagger proposed here is a simpler (and more conventional) approach…

(DucAnhLe) #4

Hi @jean.senellart,

Do you have any papers or documents related to the sequence tagging you implemented in ONMT?
I want to know more about this feature.

Thanks a lot.

(Pankaj Kumar Sharma) #5

Hello @tintinkool
Have you found a way to implement sequence tagging using OpenNMT? I am looking for any examples or documentation on this.

(jean.senellart) #6

@pankajkumar - sequence tagging is part of the toolkit.

  • Your source corpus is your tokenized text; your target corpus is your tag sequence.
  • Apply the generic preprocessing to it (use the -check_plength option to make sure source and target have the same lengths).
  • Train a model using -model_type seqtagger.
  • Infer the POS tags with the trained model using tag.lua.
  • Most of the regular options work (except that you don’t have a decoder).


Let me know if you have any problems running this.
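For example, a minimal run could look like the sketch below (file names are placeholders, and I’m assuming tag.lua takes the same basic -model/-src/-output options as translate.lua - check each script’s -h for the exact options):

    # Parallel corpora: one tokenized sentence per line on the source side,
    # and the corresponding tag sequence (same number of tokens) on the target side, e.g.
    #   src: the cat sat
    #   tgt: DET NOUN VERB

    # 1. Preprocess (-check_plength enforces equal source/target lengths)
    th preprocess.lua -train_src train-src.txt -train_tgt train-tags.txt \
                      -valid_src valid-src.txt -valid_tgt valid-tags.txt \
                      -check_plength -save_data data/tagger

    # 2. Train the sequence tagger (encoder + generator, no decoder)
    th train.lua -data data/tagger-train.t7 -save_model tagger-model -model_type seqtagger

    # 3. Tag new text with the trained model (pick the checkpoint you want)
    th tag.lua -model tagger-model_epochN.t7 -src test-src.txt -output test-tags.txt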

(Etienne Monneret) #7

Can the seqtagger model:

  • be run in server mode?
  • be converted to run in CPU mode?

(jean.senellart) #8

Not today - but adding a wrapper like the translation server would be trivial.

Yes - the same way as seq2seq models.
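If it helps, I believe the model can be released to CPU with the same tools/release_model.lua script used for seq2seq models (treat the exact option names as an assumption and check tools/release_model.lua -h):

    # Assumed usage: convert the GPU-trained tagger into a CPU-only model,
    # the same way as for regular seq2seq models
    th tools/release_model.lua -model tagger-model_epochN.t7 \
                               -output_model tagger-model_cpu.t7 -gpuid 1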

(Etienne Monneret) #9

That would be great. Otherwise, tagging sentences on demand, one by one, would force reloading the model for each sentence, wouldn’t it? A bit heavy…