Understand output file of "Infer"

mayaKaplansky · February 1, 2021, 6:50pm

Hi,
I have an output file from “infer” and I want to make sure I understand it.
I set n_best=5, and I can see the output has 5 times as many rows as my input.
I assume each group of 5 rows holds the predictions for one input, ordered from best to worst?
Why do some rows have more than one prediction? And what does 11 <s> 11 mean?

-4.144495 ||| 10
-0.374117 ||| 69
-2.997007 ||| 82
-4.280146 ||| 61
-4.623057 ||| 95
-4.841951 ||| 91
-0.243760 ||| 11
-4.461693 ||| 11 11
-5.461799 ||| 11 98
-5.543235 ||| 11 91
-5.651481 ||| 11 <s> 11

Also, is the order of the predictions the same as in the source file, so that I can measure accuracy?
Thanks

guillaumekln · February 2, 2021, 8:22am

Yes, each group of 5 rows holds the predictions for one input, ordered from best to worst.

That’s what your model predicted. If it is not what you expected, it usually means the model was not trained long enough, or there was not enough training data.

Yes, the predictions are in the same order as the source file.
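
For measuring accuracy, a minimal sketch of how such a file could be read is below. It assumes every row has the form "score ||| prediction" as in the excerpt above, that each input produces exactly n_best consecutive rows, and that the references are one per line in source order. The function names read_nbest and top1_accuracy are made up for this illustration and are not part of any library.

```python
def read_nbest(lines, n_best):
    """Group 'score ||| prediction' rows into lists of n_best hypotheses.

    Assumption: each input yields exactly n_best consecutive rows.
    """
    hyps = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip fully blank lines
        score, _, text = line.partition("|||")
        hyps.append((float(score), text.strip()))
    if len(hyps) % n_best != 0:
        raise ValueError("row count is not a multiple of n_best")
    return [hyps[i:i + n_best] for i in range(0, len(hyps), n_best)]


def top1_accuracy(groups, references):
    """Fraction of inputs whose first (best) hypothesis equals the reference."""
    if len(groups) != len(references):
        raise ValueError("number of groups and references differ")
    correct = sum(g[0][1] == ref.strip() for g, ref in zip(groups, references))
    return correct / len(references)
```

With a real file, the rows can be passed as an open file handle, for example read_nbest(open(path), 5), where path is whatever output file was written by the inference run.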

mayaKaplansky · February 2, 2021, 8:35am

Thank you. Is it possible to use "sequence_controls" to specify that <s> and </s> are not part of the vocabulary? Or are they always valid predictions?

guillaumekln · February 2, 2021, 1:41pm

They are always possible predictions, but they should be highly unlikely if the model is correctly trained.
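
If such rows still show up in an n-best list, one possible workaround (not something suggested in this thread) is to drop them after inference. The sketch below assumes hypotheses are (score, text) pairs with whitespace-separated tokens, and that <s> and </s> are the only control tokens of interest. The name drop_control_token_hypotheses is hypothetical.

```python
# Assumption: these are the only control tokens that should never appear.
CONTROL_TOKENS = {"<s>", "</s>"}


def drop_control_token_hypotheses(group):
    """Keep only hypotheses whose tokens contain no control token.

    `group` is a list of (score, text) pairs; the original order is preserved.
    """
    return [
        (score, text)
        for score, text in group
        if not CONTROL_TOKENS.intersection(text.split())
    ]
```

This only filters the output. It does not change what the model can predict, so a better-trained model remains the real fix.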