Change output signatures

How can we export a model with only tokens in the output signature?

The method below produces this error when running saved_model_cli:
tensorflow.python.framework.errors_impl.InvalidArgumentError: 'func' argument to TF_GraphCopyFunction cannot be null

import os
import tensorflow as tf

import opennmt

checkpoint_dir = "./Downloads/averaged-ende-ckpt500k-v2"
vocabulary = os.path.join(checkpoint_dir, "wmtende.vocab")

model = opennmt.models.TransformerBase()
model.initialize({
    "source_vocabulary": vocabulary,
    "target_vocabulary": vocabulary
})

checkpoint = tf.train.Checkpoint(model=model)
checkpoint.restore(tf.train.latest_checkpoint(checkpoint_dir))

input_signature = {
    name: tf.TensorSpec.from_spec(spec, name=name)
    for name, spec in model.features_inputter.input_signature().items()}

@tf.function(input_signature=(input_signature,))
def tf_predict(features):
    features = model.features_inputter.make_features(features=features.copy())
    _, predictions = model(features)
    return {"tokens": predictions["tokens"]}

export_dir = "<path>"
tf.saved_model.save(model, export_dir, signatures=tf_predict)

How are you running saved_model_cli?

I went to the resave folder and ran saved_model_cli show --dir . --all

What are you looking to do? If you just want to verify the input/output nodes, you should replace --all with --tag_set serve --signature_def serving_default.

This works. How can I create a serialized example TFRecord for the model to predict with, since the inputs expect tokens and length?
The following won't work; it fails with the error: Expected argument names ['length', 'tokens'] but got values for ['length']. Missing: ['tokens'].

# _bytes_feature and _int64_feature are the usual tf.train.Example helpers
# (wrapping values in tf.train.BytesList / tf.train.Int64List).
bytes_query = [term.encode('utf-8') for term in query]
feature = {
    'tokens': _bytes_feature(bytes_query),
    'length': _int64_feature(len(query))
}
example = tf.train.Example(features=tf.train.Features(feature=feature))
example_proto = example.SerializeToString()

I saw that the source code for RecordInputter looks like this:

feature_lists = tf.train.FeatureLists(feature_list={"values": feature_list})
example = tf.train.SequenceExample(feature_lists=feature_lists)
example_proto = example.SerializeToString()

How can I create a serialized example for prediction?

serve_function = model.serve_function().get_concrete_function()
serve_function(example_proto)

There should be no need to serialize the input. You can directly pass tensors as arguments:

serve_function(
    tokens=tf.constant(..., dtype=tf.string),
    length=tf.constant(..., dtype=tf.int32)
)
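
For instance, a minimal hedged sketch with a batch of one already-tokenized sentence (the token values are illustrative placeholders, not real SentencePiece output):

outputs = serve_function(
    tokens=tf.constant([["Hello", "world", "!"]], dtype=tf.string),  # shape [batch, time]
    length=tf.constant([3], dtype=tf.int32),                         # true length of each entry
)
print(outputs["tokens"])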

See this example that loads a SavedModel to run translations:

Yes, I also saw this file. But we would need to prepare a TFRecord file that another client can use for prediction. Can we achieve that with a serialized TFRecord?

The function expects Tensors.

So if you need to work with serialized protos for some reason, you need to parse them into tensors before calling the function:

https://www.tensorflow.org/api_docs/python/tf/io/parse_tensor
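
For example, a minimal sketch (an assumption on my side, not the library's own code) that parses the serialized tf.train.Example built earlier back into the tensors the serving function expects; the feature spec mirrors the 'tokens'/'length' features above:

import tensorflow as tf

feature_spec = {
    "tokens": tf.io.VarLenFeature(tf.string),      # variable-length list of byte strings
    "length": tf.io.FixedLenFeature([], tf.int64),
}
parsed = tf.io.parse_single_example(example_proto, feature_spec)
tokens = tf.expand_dims(tf.sparse.to_dense(parsed["tokens"]), 0)  # [1, time]
length = tf.cast(tf.expand_dims(parsed["length"], 0), tf.int32)   # [1]
outputs = serve_function(tokens=tokens, length=length)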

Got it, thanks a lot. Another question: can we pass only tokens as the signature input?

def make_features(self, element=None, features=None, training=None):
    """Tokenizes raw text."""
    if features is None:
        features = {}
    if "tokens" in features:
        return features

What can we add here to generate the length based on the tokens? Assume the batch size is always one.

tf.size(features["tokens"]) should be enough.
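
For example, a minimal sketch (an assumption, not the library's API) of an inputter override that derives the length from the tokens, assuming batch size 1 as discussed:

import tensorflow as tf
import opennmt

class TokensOnlyEmbedder(opennmt.inputters.WordEmbedder):
    """Fills in the "length" feature from "tokens" when only tokens are sent."""

    def make_features(self, element=None, features=None, training=None):
        if features and "tokens" in features and "length" not in features:
            # With batch size 1, the number of tokens is the sequence length.
            features["length"] = tf.size(features["tokens"])
        return super().make_features(element=element, features=features, training=training)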

Thanks a lot. When I check averaged-ende-ckpt500k-v2, it doesn't have any .model file. But the saved model has wmtende.model under assets.extra, which I guess is generated from export_assets. I am curious where we can get this file.

The following code snippet won't generate this assets.extra. In addition, where can we find the source code for the BPE or SentencePiece tokenize functions? Are they in another open source repo? Thanks.

import os
import tensorflow as tf
# import tensorflow_addons as tfa
# tfa.register_all()  # Register custom ops.

import opennmt

checkpoint_dir = "./averaged-ende-ckpt500k-v2"
vocabulary = os.path.join(checkpoint_dir, "wmtende.vocab")

model = opennmt.models.TransformerBase()
# tokenizer = {"type": "SpaceTokenizer"}
model.trainable = False
params = {
    "beam_width": 3,
    "maximum_decoding_length": 15
}
model.initialize({
    "source_vocabulary": vocabulary,
    "target_vocabulary": vocabulary,
    # "source_tokenization": tokenizer,
    # "target_tokenization": tokenizer
}, params=params)

checkpoint = tf.train.Checkpoint(model=model)
checkpoint.restore(tf.train.latest_checkpoint(checkpoint_dir))

input_signature = {
    name: tf.TensorSpec.from_spec(spec, name=name)
    for name, spec in model.features_inputter.input_signature().items()}

@tf.function(input_signature=(input_signature,))
def _run(features):
    features = model.features_inputter.make_features(features=features.copy())
    _, predictions = model(features)
    return {"tokens": predictions["tokens"], "log_probs": predictions["log_probs"]}

export_dir = "./averaged-ende-ckpt500k-v2-resave"
tf.saved_model.save(model, export_dir, signatures=_run)
imported = tf.saved_model.load(export_dir)
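
As a small, hedged follow-up to the snippet above (the token values are illustrative placeholders), the re-exported signature can then be called through the loaded object:

translate_fn = imported.signatures["serving_default"]
outputs = translate_fn(
    tokens=tf.constant([["Hello", "world", "!"]], dtype=tf.string),
    length=tf.constant([3], dtype=tf.int32),
)
print(outputs["tokens"], outputs["log_probs"])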

This should help: https://opennmt.net/OpenNMT-tf/tokenization.html

Thanks. Does this sound like a workable solution if the serving side only expects one text input with a SpaceTokenizer in the graph? (A client-side sketch of these steps follows the list below.)

  1. The server side applies tokenizer.tokenize() [SentencePiece] to the query and joins the pieces with spaces to form a string.
  2. It sends {'text': string} to the model.
  3. It receives {'text': string, 'log_probs': log_probs} from the model.
  4. It splits the text by spaces and applies tokenizer.detokenize() to get the final result.
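
A minimal client-side sketch of those four steps (send_request is a hypothetical placeholder for whatever transport reaches the exported model, e.g. TensorFlow Serving, and wmtende.model is loaded with the regular SentencePiece Python package):

import sentencepiece as spm

sp = spm.SentencePieceProcessor()
sp.load("wmtende.model")

def preprocess(query):
    # 1. Tokenize with SentencePiece and join the pieces with spaces.
    return " ".join(sp.encode_as_pieces(query))

def postprocess(text):
    # 4. Split on spaces and detokenize to recover the final result.
    return sp.decode_pieces(text.split(" "))

# 2./3. Send {"text": ...} and read back {"text": ..., "log_probs": ...}.
request = {"text": preprocess("Hello world!")}
response = send_request(request)  # hypothetical transport call
result = postprocess(response["text"])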

Sure, this works. But the space concatenation and retokenization do not seem useful. Why not send the tokenized input directly?

Since it expects only one input feature; that's why I asked about the length feature above. Can I ask why it is designed to have a length input? Isn't it just len(tokens)?

The tokens input is a dense 2D matrix. The length input is necessary to know the actual length of each entry in the batch.
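
A small illustration (not from the thread; the subword pieces are placeholders) of a padded 2D tokens batch and the matching lengths:

import tensorflow as tf

tokens = tf.constant([
    ["▁Hel", "lo", "▁world", "!"],
    ["▁Hi", "", "", ""],           # padded up to the batch's longest entry
])
length = tf.constant([4, 1], dtype=tf.int32)  # true length of each row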

Got it. So if the batch size is one, it doesn’t matter anymore. Is that correct?

Correct.

You could also look into the new tf.RaggedTensor structure which can represent tensors with variable lengths.

Thanks a lot. This is very helpful.