OpenNMT Forum

Tensorflow rest api + sentencepiece tf module


(Panos Kanavos) #1


The sample en_de Python client on GitHub works fine, but it has too many dependencies to serve as a basis for developing a client for calls from remote machines. What I’d like to do is implement a REST client with the SentencePiece TF module. The sentencepiece repo has a link to sample code which doesn’t work for me, so I would appreciate any pointers and hints.
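For context, a minimal sketch of what such a lightweight client could look like, doing tokenization client-side with the `sentencepiece` Python package and posting to a TensorFlow Serving REST endpoint. The endpoint URL, model file name, and payload field names (`tokens`/`length`) are illustrative assumptions, not a documented OpenNMT-tf interface:

```python
# Minimal REST client sketch: tokenize with SentencePiece on the client,
# then POST to a TensorFlow Serving REST endpoint.
# The URL, model path, and field names below are assumptions for illustration.

def build_request(tokens):
    """Wrap one tokenized sentence into a TF Serving predict-style payload."""
    return {"inputs": {"tokens": [tokens], "length": [len(tokens)]}}

def translate(text, spm_model="ende.model",
              url="http://localhost:8501/v1/models/ende:predict"):
    # Third-party deps: pip install sentencepiece requests
    import sentencepiece as spm
    import requests

    sp = spm.SentencePieceProcessor()
    sp.Load(spm_model)
    tokens = sp.EncodeAsPieces(text)  # client-side subword tokenization
    resp = requests.post(url, json=build_request(tokens))
    resp.raise_for_status()
    return resp.json()

# Usage (requires a running TF Serving instance and a SentencePiece model):
#   print(translate("Hello world!"))
```

The only non-standard dependencies are `sentencepiece` and `requests`, which keeps the client much lighter than the full en_de example.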

(Guillaume Klein) #2


Thanks for the question. From the client’s perspective, what is the difference between using the SentencePiece TF module and the Python module? Do you mean you want to include the SentencePiece module in the graph?

(Panos Kanavos) #3

I don’t know and haven’t worked with the internals of TF yet, so bear with me :slight_smile: From my understanding of sentencepiece’s documentation, using the TF module allows the user to send raw sentences for inference, and tokenization could then be performed “server-side” (?). In other words, is there a way to make a simple call with no tokens, just the raw sentence, and have the tokenization performed on the server, or does this require additional custom layers? I would like to work on the current Trados plugin and add an option to use OpenNMT-tf through TF Serving instead of OpenNMT-lua’s REST server.
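To make the two calling conventions concrete, here is a sketch of what the two request bodies might look like. The field names are illustrative assumptions, not OpenNMT-tf’s actual export signature:

```python
# Two hypothetical TF Serving request shapes (field names are assumptions).

# 1) Client-side tokenization: the caller runs SentencePiece itself and
#    sends the subword pieces plus their length.
tokenized_request = {
    "inputs": {"tokens": [["▁Hello", "▁world", "!"]], "length": [3]}
}

# 2) Server-side tokenization: only possible if the SentencePiece op is
#    baked into the exported graph; the caller sends the raw sentence.
raw_text_request = {
    "inputs": {"text": ["Hello world!"]}
}
```

The second shape is what would let a thin plugin avoid bundling SentencePiece at all.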

(Guillaume Klein) #4

This is true if the model is exported using this module (so OpenNMT-tf code has to be modified) and TensorFlow Serving binaries are compiled against the SentencePiece operator.

I know the serving story is still incomplete for OpenNMT-tf (i.e. requires some work) and I have in mind to make the OpenNMT tokenizer a custom TensorFlow op that can be used in serving.

For now, can you check if using the nmt-wizard-docker approach works for you?

(Panos Kanavos) #5

All clear now! So, although I could use nmt-wizard-docker for now, I should probably wait for you to add this feature to the standalone OpenNMT-tf project before I can work on the plugin.
Thanks for your help and all your great work!

(Terence Lewis) #6

I’ll be watching this space :-). I’m a bit behind you but am looking at the same issues!

(Terence Lewis) #7

Hi, I followed the instructions to the letter and everything gets set up nicely. However, when I send off `curl -X POST http://localhost:5000/translate -d '{"src":[{"text": "Hello world!"}]}'` I get the response from the model server shown below, whatever the source input. Any ideas what’s happening, please?

(Guillaume Klein) #8

Is it with the same pretrained model?

(Terence Lewis) #9

Yes, it is with the pretrained model. I followed the instructions exactly.

(Terence Lewis) #10

Going through the server logs, I suspect this is a problem with Docker somehow not finding the GPU. I see the message “failed call to cuInit” followed by “no NVIDIA GPU device is present: /dev/nvidia0 does not exist”. But of course it does exist, and TensorFlow knows about it, as I’ve just trained some five TF models with it. I’m guessing that the gRPC ModelServer is falling back to the CPU to deliver that phantom output?

(Terence Lewis) #11

Hi @Guillaume,
It now works perfectly with the pretrained model. It was merely a path issue and I’m sorry for wasting your time.