I see that the OpenNMT v0.8 release added new validation metrics such as BLEU. I was wondering whether a ROUGE metric could be implemented too, for those of us interested in text summarization.
The issue with ROUGE is that it is not just one number.
If you look at the post "Text Summarization on Gigaword and ROUGE Scoring",
you can see the script @pltrdy used for scoring.
It could be a good starting point to see how it is done (in Python, I think).
But again, the community uses three scores, so it’s a little cumbersome.
If you can do this, it would be great.
That would be great! This PR gives you an idea of what is required to add a validation metric:
I’m not familiar with ROUGE. Can it be reduced to a single score?
@guillaumekln, the community uses R1, R2 and RL. They are of course related: R1 measures single-word (unigram) overlap, R2 measures bigram overlap, and RL is based on the longest common subsequence. In my experience they move together: if one increases, the others do too. Using the average of the three values could be a solution.
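To make the averaging idea concrete, here is a minimal Python sketch of how R1, R2 and RL could be collapsed into one number. It is an assumption-heavy simplification, not the official Perl ROUGE script: it works on pre-tokenized word lists, uses a single reference, reports the F1 variant of each score, and skips stemming and stopword handling. The function names (`rouge_n`, `rouge_l`, `rouge_avg`) are made up for this illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(matches, cand_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when nothing matches."""
    if matches == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = matches / cand_total
    recall = matches / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """Simplified ROUGE-N (F1) from clipped n-gram overlap."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    matches = sum((cand & ref).values())  # Counter '&' keeps the minimum counts
    return f1(matches, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """Simplified ROUGE-L (F1) from the longest common subsequence."""
    return f1(lcs_length(candidate, reference), len(candidate), len(reference))


def rouge_avg(candidate, reference):
    """Assumed single-score reduction: plain mean of R1, R2 and RL."""
    scores = (
        rouge_n(candidate, reference, 1),
        rouge_n(candidate, reference, 2),
        rouge_l(candidate, reference),
    )
    return sum(scores) / len(scores)


if __name__ == "__main__":
    hyp = "the cat sat on the mat".split()
    ref = "the cat is on the mat".split()
    print(rouge_n(hyp, ref, 1), rouge_n(hyp, ref, 2), rouge_l(hyp, ref))
    print(rouge_avg(hyp, ref))
```

A plain mean weights the three scores equally. Whether that is the right reduction for validation is an open choice, and the values will not match the Perl script exactly because of the simplifications above.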
There are some other problems: ROUGE is implemented only in Perl, and only some Python middleware exists. I wonder if it is possible to do the same in Lua.
For now I will use BLEU because it is the closest to ROUGE. In my free time I will try to integrate ROUGE and see how it works.