Error out of range when input length is greater than output length [feattext mode]

tintinkool · July 28, 2017, 1:28am

Dear all,

I am trying to run in feattext mode. I run into a problem when the input length is greater than the output length: the code in ./onmt/SeqTagger.lua, line 145, gives an out-of-range error. I don't understand why the target output is fetched based on the length of the input. Can anyone help me?

Thanks a lot.

function SeqTagger:trainNetwork(batch)
  …
  -- For each word of the sentence, generate target.
  for t = 1, batch.sourceLength do
    local genOutputs = self.models.generator:forward(context:select(2, t))

    local output = batch:getTargetOutput(t)

Shruti · July 29, 2017, 1:59pm

For a sequence tagger, the source and target need to have the same length. Here is an OpenNMT sequence tagger tutorial: Monophone speech recognition with OpenNMT

tintinkool · July 31, 2017, 12:02am

Thanks for your answer. Does anyone know how to handle the case where the input length and output length are different?

Thanks a lot.

jean.senellart · July 31, 2017, 2:30am

Hi @tintinkool, if your source and target lengths are different, then it is not a sequence tagging problem, since by definition sequence tagging assigns one output to each source input.
However, if it is a data cleanup issue, you can use the option -check_plength [<boolean>] in preprocessing, which makes sure that the preprocessed data has identical source and target lengths.
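
To illustrate the kind of cleanup meant here, below is a minimal sketch in plain Lua that drops sentence pairs whose source and target token counts differ. It is not OpenNMT's own code: the helper names tokenize and filterEqualLength are made up for this example, the sample data is invented, and it assumes line-aligned, whitespace-tokenized input.

```lua
-- Hypothetical helper: split a line into whitespace-separated tokens.
local function tokenize(line)
  local tokens = {}
  for token in line:gmatch("%S+") do
    tokens[#tokens + 1] = token
  end
  return tokens
end

-- Hypothetical helper: keep only the pairs whose source and target have
-- the same number of tokens. Also returns the 1-based indices of the
-- pairs that were dropped, so they can be inspected by hand.
local function filterEqualLength(sourceLines, targetLines)
  assert(#sourceLines == #targetLines,
         "source and target must have the same number of lines")
  local keptSource, keptTarget, dropped = {}, {}, {}
  for i = 1, #sourceLines do
    if #tokenize(sourceLines[i]) == #tokenize(targetLines[i]) then
      keptSource[#keptSource + 1] = sourceLines[i]
      keptTarget[#keptTarget + 1] = targetLines[i]
    else
      dropped[#dropped + 1] = i
    end
  end
  return keptSource, keptTarget, dropped
end

-- Made-up example: the second pair has 4 source tokens but only 3 tags.
local src = { "the cat sat", "a dog barked loudly" }
local tgt = { "DET NOUN VERB", "DET NOUN VERB" }
local keptSrc, keptTgt, dropped = filterEqualLength(src, tgt)
print(#keptSrc, dropped[1])
```

Looking at the dropped indices is usually worth doing before training: a mismatch often points to a tokenization difference between the two files rather than to a sentence that should simply be thrown away.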