Hi, the TensorFlow server returns four predictions for each inference request, each with a score. What is the logic behind these scores? They do not seem to relate to the (human-assessed) quality of the predictions. Clearly, I need to feed only one of them, the best, into my client software.
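The predictions are ordered from best to worst, and the score is the average per-token log probability. Being log probabilities, the scores are zero or negative, and a value closer to zero means the model was more confident in that prediction, so the first prediction is the one to pass to your client. Note that the score reflects the model's own confidence, which does not necessarily match a human judgment of quality.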
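To make the scoring concrete, here is a minimal Python sketch of how such a score can be computed and how a client could keep only the best prediction. The function names (`average_log_prob`, `pick_best`), the `(text, score)` layout, and all the numbers are made up for illustration; they are not the server's actual response format.

```python
import math

def average_log_prob(token_log_probs):
    """Mean of the per-token log probabilities of one prediction."""
    return sum(token_log_probs) / len(token_log_probs)

def pick_best(predictions):
    """Return the (text, score) pair with the highest score.

    Scores are log probabilities, so the highest score is the one
    closest to zero.
    """
    return max(predictions, key=lambda p: p[1])

# Hypothetical per-token log probabilities for one three-token prediction.
token_log_probs = [-0.1, -0.3, -0.2]
score = average_log_prob(token_log_probs)

# Hypothetical (text, score) pairs, in the best-to-worst order described above.
predictions = [
    ("first prediction", -0.20),
    ("second prediction", -0.45),
    ("third prediction", -0.90),
    ("fourth prediction", -1.30),
]

best_text, best_score = pick_best(predictions)

# exp(average log probability) is the geometric mean of the per-token
# probabilities, which is easier to read than the raw score.
per_token_prob = math.exp(best_score)

print(best_text, best_score, round(per_token_prob, 3))
```

If the list really is sorted from best to worst, taking the first element gives the same result as `pick_best`; taking the maximum just avoids depending on the order.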