BLEU alternatives?

Bouncyknighter · February 19, 2023, 9:33pm

Hello,
I'm looking to validate the models I trained on my datasets. I have looked into BLEU, but it seems like I have to pay for it. Are there any better alternatives around?
Thanks

SamuelLacombe · February 20, 2023, 3:07am

Hello,

None is perfect, but here are a few of them:

WER
RED
METEOR

Personally, I like to look at the BLEU and WER scores.

WER stands for word error rate. Unlike the other metrics, it does not use n-grams; it is based on word-level edit distance, so it gives you a different kind of insight.
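For anyone curious how WER works, here is a minimal sketch in plain Python. It assumes whitespace tokenization and a single reference per sentence; the function name `word_error_rate` is just something made up for this example, not a library API. It counts the fewest word substitutions, insertions and deletions needed to turn the hypothesis into the reference, divided by the number of reference words.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: tokens are obtained with a plain whitespace split.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # matching i reference words against nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Lower is better, and the score can go above 1.0 when the hypothesis is much longer than the reference. Real tools usually normalize case and punctuation before scoring, which this sketch does not do.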

anderleich · February 20, 2023, 12:22pm

I would suggest having a look at this:

Results of WMT22 Metrics Shared Task: Stop Using BLEU – Neural Metrics Are Better and More Robust

I personally use chrF++ and COMET, plus BLEU for comparison with other methods.
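In case chrF is new to you: it scores the overlap of character n-grams between the hypothesis and the reference, and chrF++ also adds word n-grams. Below is a simplified sketch of the character part only, to show the idea. It is not the reference implementation and will not reproduce its exact numbers; the function names, the whitespace stripping, and the way precision and recall are averaged over n-gram orders are all assumptions made for this example.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams, ignoring whitespace (an assumption of this sketch)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def char_ngram_fscore(reference: str, hypothesis: str,
                      max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified character n-gram F-score in the spirit of chrF.

    Precision and recall are averaged over n-gram orders 1..max_n,
    then combined into an F-beta score (beta=2 weights recall higher).
    """
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        ref_counts = char_ngrams(reference, n)
        hyp_counts = char_ngrams(hypothesis, n)
        if not ref_counts or not hyp_counts:
            continue  # one side is too short to have n-grams of this order
        # Counter intersection keeps the minimum count per n-gram (clipped matches).
        overlap = sum((ref_counts & hyp_counts).values())
        precisions.append(overlap / sum(hyp_counts.values()))
        recalls.append(overlap / sum(ref_counts.values()))

    if not precisions:
        return 0.0
    precision = sum(precisions) / len(precisions)
    recall = sum(recalls) / len(recalls)
    if precision + recall == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)


if __name__ == "__main__":
    print(char_ngram_fscore("the cat sat on the mat", "the cat sat on a mat"))
```

Higher is better here, with 1.0 for an exact match. For actual evaluation I would use an established implementation rather than this sketch, so that your scores are comparable with other people's.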

SamuelLacombe · February 20, 2023, 7:54pm

If I can add some insight: depending on what you are trying to achieve, you might choose a different score.

If you're trying to see whether a translation matches the style of your current text, with less focus on the meaning, the BLEU score will be the better choice.

If you want to create a model that is generic and not tailored to a specific translator's style, then BLEU is for sure not the best way to go.