How to use n_best in python serving?

amit01 · December 21, 2021, 1:34pm

Hello,

First of all, thank you very much for the excellent work and for making it open source.
I am using the Python serving example to generate translations, as shown in:

github.com

OpenNMT/OpenNMT-tf/blob/master/examples/serving/python/ende_client.py

import argparse
import os

import pyonmttok
import tensorflow as tf


class EnDeTranslator(object):
    def __init__(self, export_dir):
        self._imported = tf.saved_model.load(export_dir)
        self._translate_fn = self._imported.signatures["serving_default"]
        sp_model_path = os.path.join(export_dir, "assets.extra", "wmtende.model")
        self._tokenizer = pyonmttok.Tokenizer("none", sp_model_path=sp_model_path)

    def translate(self, texts):
        """Translates a batch of texts."""
        inputs = self._preprocess(texts)
        outputs = self._translate_fn(**inputs)
        return self._postprocess(outputs)


I want to use n_best to generate more than one translation per input. I set the n_best and beam_width parameters in data.yml during training. Could you please tell me how to use n_best in the Python serving code above to get more than one hypothesis?

Best regards

guillaumekln · December 21, 2021, 1:56pm

Hi,

In the _postprocess method, the code reads index 0, which corresponds to the best hypothesis. You can access the other hypotheses by changing this index:

github.com

OpenNMT/OpenNMT-tf/blob/v2.24.0/examples/serving/python/ende_client.py#L44

        inputs = {
            "tokens": tf.constant(all_tokens, dtype=tf.string),
            "length": tf.constant(lengths, dtype=tf.int32),
        }
        return inputs

    def _postprocess(self, outputs):
        texts = []
        for tokens, length in zip(outputs["tokens"].numpy(), outputs["length"].numpy()):
            tokens = tokens[0][: length[0]].tolist()
            texts.append(self._tokenizer.detokenize(tokens))
        return texts


def main():
    parser = argparse.ArgumentParser(description="Translation client example")
    parser.add_argument("export_dir", help="Saved model directory")
    args = parser.parse_args()

    translator = EnDeTranslator(args.export_dir)
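
To return every hypothesis instead of only the one at index 0, the loop in _postprocess can iterate over the hypothesis dimension. The following is a minimal sketch, not part of the OpenNMT-tf example: the function name postprocess_n_best is hypothetical, it assumes the arrays are shaped [batch, n_best, max_length] for tokens and [batch, n_best] for lengths (which is what the indexing in _postprocess implies), and it uses plain NumPy arrays and " ".join as stand-ins for the model outputs and the tokenizer so that it runs without a model.

```python
import numpy as np


def postprocess_n_best(tokens, lengths, detokenize):
    """Return all hypotheses for each input, best first.

    tokens:     array assumed to be shaped [batch, n_best, max_length]
    lengths:    array assumed to be shaped [batch, n_best]
    detokenize: callable mapping a list of tokens to a string
                (in the client this would be self._tokenizer.detokenize)
    """
    results = []
    for hyp_tokens, hyp_lengths in zip(tokens, lengths):
        hypotheses = []
        # Iterate over every hypothesis instead of reading only index 0.
        for hyp, length in zip(hyp_tokens, hyp_lengths):
            # Drop the padding beyond the hypothesis length.
            hypotheses.append(detokenize(hyp[:length].tolist()))
        results.append(hypotheses)
    return results


# Stand-in data: one input sentence with two hypotheses, padded with "".
tokens = np.array([[["Hallo", "Welt", ""], ["Hallo", "", ""]]], dtype=object)
lengths = np.array([[2, 1]], dtype=np.int32)

print(postprocess_n_best(tokens, lengths, " ".join))
# [['Hallo Welt', 'Hallo']]
```

In the client, the same loop would be fed outputs["tokens"].numpy() and outputs["length"].numpy(), and translate would then return a list of hypotheses per input text rather than a single string.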

Note that for models exported to the SavedModel format, the n_best parameter can only be configured when exporting the model and can no longer be changed afterwards. The value is embedded in the exported graph. The model should be exported again to change the n_best value.

Since you added the ctranslate2 tag, you can also consider exporting your model to CTranslate2, which is faster and more flexible than TensorFlow-based serving.
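
As a rough illustration of that flexibility, CTranslate2 lets the number of hypotheses be chosen per call through the num_hypotheses argument of translate_batch. The sketch below is an assumption-laden example rather than an official recipe: translate_n_best is a hypothetical helper, the default beam size of 4 is an arbitrary choice, and the model directory in the commented usage is a placeholder.

```python
def translate_n_best(translator, tokenized_batch, n_best=3, beam_size=None):
    """Return up to n_best tokenized hypotheses for each input.

    translator:      an object exposing translate_batch, such as a
                     ctranslate2.Translator
    tokenized_batch: list of token lists, one per source sentence
    """
    if beam_size is None:
        # The beam must be at least as wide as the number of hypotheses.
        beam_size = max(n_best, 4)
    results = translator.translate_batch(
        tokenized_batch,
        beam_size=beam_size,
        num_hypotheses=n_best,
    )
    # Each result holds its hypotheses as lists of tokens, best first.
    return [result.hypotheses for result in results]


# Usage with a converted model (the path is a placeholder):
#
#   import ctranslate2
#   translator = ctranslate2.Translator("ende_ctranslate2/")
#   hypotheses = translate_n_best(translator, [["▁Hello", "▁world", "!"]], n_best=3)
```

The returned token lists still need to be detokenized, for example with the same pyonmttok tokenizer used in the TensorFlow client.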