Training a language model

Shruti · September 19, 2017, 1:06am

Hi,

This page http://opennmt.net/OpenNMT/translation/beam_search/#decoding-with-auxiliary-language-model has a warning saying we need to have the same features & dictionary for the LM and Decoder.

I am building a speech recognition system, and my input to the decoder is Kaldi features, supplied with -idx_files.

Could you please explain what the warning on the above page means? Is it possible to use a character-based LM together with filter bank features for the decoder during translation?

Thank you,
Shruti

jean.senellart · September 19, 2017, 5:53am

Hi @Shruti, the warning refers to target features. So as long as your LM uses the same target features (case?) and vocabulary (characters) as your seq2seq system, it will be OK.
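
As a rough illustration of that condition (this is not an OpenNMT tool), a sketch like the following could compare the target dictionaries of the two models. It assumes each dictionary is plain text with one `token index` pair per line. The function names and sample entries are hypothetical. It also flags tokens whose indices differ, on the assumption that the two models must agree on indices as well; drop that check if it does not apply to your setup.

```python
def parse_dict(text):
    """Parse dictionary text with one 'token index' pair per line (assumed format)."""
    vocab = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split from the right so the last field is taken as the index.
        token, index = line.rsplit(None, 1)
        vocab[token] = int(index)
    return vocab


def compare_vocabs(lm_vocab, s2s_vocab):
    """Return (tokens only in the LM, tokens only in seq2seq, tokens with differing indices)."""
    shared = set(lm_vocab) & set(s2s_vocab)
    only_lm = sorted(set(lm_vocab) - set(s2s_vocab))
    only_s2s = sorted(set(s2s_vocab) - set(lm_vocab))
    mismatched = sorted(t for t in shared if lm_vocab[t] != s2s_vocab[t])
    return only_lm, only_s2s, mismatched


if __name__ == "__main__":
    # Hypothetical character-level target dictionaries for the LM and the seq2seq model.
    lm_text = "<blank> 1\n<unk> 2\na 3\nb 4\n"
    s2s_text = "<blank> 1\n<unk> 2\na 3\nc 4\n"
    only_lm, only_s2s, mismatched = compare_vocabs(parse_dict(lm_text), parse_dict(s2s_text))
    print("only in LM:", only_lm)
    print("only in seq2seq:", only_s2s)
    print("index mismatch:", mismatched)
```

If all three lists come back empty, the two target vocabularies match under these assumptions. The source-side input (here, the filter bank features) is not part of this comparison, since the warning concerns the target side only.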