Is it possible to obtain confidence of translation per word?

kargintima · December 5, 2019, 2:11pm

Can I get a translation confidence metric for each translated word?
translate.py has a “-verbose” option that prints “pred_score”, but only for the whole sentence.
I want to be able to highlight words that have a low confidence score (or any other metric).
I know that during decoding the system computes a probability for every candidate token, so I think it should be possible.

guillaumekln · December 10, 2019, 5:11pm

The per-word scores are not saved. You would need to edit the decoding code to return and print them.
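
As a rough illustration of what such an edit would need to expose, here is a minimal sketch in plain Python. It does not use any OpenNMT code: the function names, the toy vocabulary, and the logits are all hypothetical. It assumes you can get the raw decoder scores (logits) at each step together with the id of the token that was chosen, and that the sentence-level score is the sum of the per-token log-probabilities.

```python
import math

def softmax_log_probs(logits):
    """Turn the raw scores of one decoding step into log-probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def per_token_scores(step_logits, chosen_ids):
    """Log-probability of the chosen token at each decoding step."""
    return [softmax_log_probs(logits)[tok]
            for logits, tok in zip(step_logits, chosen_ids)]

def low_confidence(tokens, scores, threshold=math.log(0.5)):
    """Tokens whose log-probability falls below the threshold."""
    return [t for t, s in zip(tokens, scores) if s < threshold]

# Hypothetical 3-step decode over a 3-word vocabulary (made-up numbers).
vocab = ["the", "cat", "dog"]
step_logits = [[4.0, 0.0, 0.0], [0.0, 1.0, 0.9], [3.0, 0.0, 0.0]]
chosen_ids = [0, 1, 0]

scores = per_token_scores(step_logits, chosen_ids)
tokens = [vocab[i] for i in chosen_ids]
sentence_score = sum(scores)  # assumed to play the role of a sentence-level score
flagged = low_confidence(tokens, scores)
print(list(zip(tokens, scores)), sentence_score, flagged)
```

In this toy case only the second token is flagged, because "cat" and "dog" get almost the same score at that step. With subword tokenization you would also have to decide how to merge token scores into word scores (for example, taking the minimum or the sum over the pieces of a word).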

Bachstelze · December 10, 2019, 10:19pm

You could score the words with an external language model, which is useful for noisy channel modeling, but that is not the score of your translation model.
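
To make the idea concrete, here is a toy sketch that scores each word of an output sentence with an add-one smoothed bigram model. It stands in for a real external language model and is only an assumption about how one might set this up: the corpus, the function names, and the smoothing choice are hypothetical, and out-of-vocabulary words are handled only crudely.

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count bigrams and their left contexts over whitespace-tokenized sentences."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sentence in corpus:
        words = ["<s>"] + sentence.split()
        vocab.update(words[1:])
        contexts.update(words[:-1])
        bigrams.update(zip(words, words[1:]))
    return contexts, bigrams, len(vocab)

def word_log_probs(sentence, contexts, bigrams, vocab_size):
    """Add-one smoothed log P(word | previous word) for each word."""
    words = ["<s>"] + sentence.split()
    return [(w, math.log((bigrams[(prev, w)] + 1) / (contexts[prev] + vocab_size)))
            for prev, w in zip(words, words[1:])]

# Hypothetical target-side training text for the language model.
corpus = ["the cat sleeps", "the dog sleeps", "the cat eats"]
contexts, bigrams, vocab_size = train_bigram(corpus)

scored = word_log_probs("the dog eats", contexts, bigrams, vocab_size)
least_likely = min(scored, key=lambda pair: pair[1])[0]
print(scored, least_likely)
```

These scores only reflect how fluent the output looks to the language model. They say nothing about whether a word is a faithful translation of the source.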