Sequence tagging

Some of OpenNMT users have mentioned using seq2seq models for tagging purpose, which is a simpler problem requiring just encoder and generator.

A first version is available here: https://github.com/OpenNMT/OpenNMT/pull/155

@Wabbit - do you want to try?

@jean.senellart sure I’ll try and provide feedback. However, why do you think attention might not be required for tagging? The last hidden layer of the encoder might not carry enough information and additional access to the source side hidden states might be required. See a recent NAACL paper Neural Architectures for Named Entity Recognition which feeds in the hidden state from source at timestep t when decoding for step ‘t’

Hi @Wabbit - just saw your question. Indeed we could also think about having an attention module in a sequence tagger or use the full seq2seq approach for tagging - but the sequence tagger proposed here is a more simple (and conventional) approach…

hi @jean.senellart ,

Do you have any papers or document related to Sequence tagging that you implemented on ONMT?
I want to know more about this feature.

Thank a lot.

Hello @tintinkool
Have you found a way to implement sequence tagging using OpenNMT? I am looking for any examples/documentation for the same.

@pankajkumar - sequence tagging is part of the toolkit.

  • Your source corpus is your tokenized text, your target corpus is your tag sequence.
  • Apply generic preprocess on it (use -check_plength option to make sure source and target have the same lengths)
  • Train a model using -model_type seqtagger.
  • infer POS with trained model using tag.lua
  • most of the regular options are working (except that you don’t have a decoder)

see: http://opennmt.net/OpenNMT/applications/#sequence-tagging

let me know if you have any problem to run this.

1 Like

Can the seqtagger model:

  • be run in server mode ?
  • be converted to run in CPU mode ?

not today - but adding a wrapping like the translation server is trivial

yes - the same way than seq2seq models

1 Like

It would be great. Otherwise, tagging sentences on demand, one by one, would force to load the model for each sentence, isn’t it ? A bit heavy…

Hi,

Do you know if this seqtagger feature is available for the python version?
I search the documentation of the python version, and didn’t find this feature.
Does the team plan to implement this feature for the python version?

Thanks!

1 Like

Hello,

It is not included in the PyTorch version but you can find it in the TensorFlow version if Python is your only requirement.