Blank lines in prediction file

gauravsh0812 · May 10, 2021, 3:36pm

Hi there,

I have trained my model using harvardnlp/im2markup (the neural model for converting Image-to-Markup, by Yuntian Deng, github.com/da03) on the dataset provided in the documentation (100K). When I tested it, I got a “.tmp.pred.txt” file containing all the predicted equations along with many blank lines. As mentioned in the documentation itself (under the Evaluation section):
"Note that although the predictions file contains the gold labels, since some images (e.g., too large sizes) will be ignored during testing, to make the comparison fair, we need to use the test file again and treat those that does not appear in predictions file as blank predictions."

My question is this: out of 10355 test equations, the model produced predictions for only 2040, while the remaining ~8000 were replaced by blank lines. Is there a problem with my model (I am following the exact commands given in the documentation), or is this expected behavior?
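
For anyone who wants to check the same counts on their own output, here is a minimal sketch of how the blank lines could be tallied. It assumes the prediction file has one prediction per line and that a skipped image shows up as an empty or whitespace-only line; the function name `count_blank` is only illustrative and is not part of im2markup.

```python
def count_blank(lines):
    """Return (total, blank) for an iterable of prediction lines.

    Assumption: one prediction per line; a line that is empty or
    whitespace-only counts as a blank prediction.
    """
    total = 0
    blank = 0
    for line in lines:
        total += 1
        if not line.strip():
            blank += 1
    return total, blank


if __name__ == "__main__":
    import sys

    # Usage: python count_blank.py <prediction file>
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8") as f:
            total, blank = count_blank(f)
        share = blank / total if total else 0.0
        print(f"{blank} of {total} lines are blank ({share:.1%})")
```

With the numbers above, 10355 - 2040 = 8315 lines would be blank, which is about 80% of the test set.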

Any suggestions would be appreciated. Thank you in advance!

guillaumekln · May 10, 2021, 3:48pm

Hi,

I think it would be better to contact the author of this paper directly. You can find an email address on his GitHub profile: