OpenNMT-tf : Error serving model trained with tensorflow-gpu in tensorflow without gpu

tensorflow

(hanky) #1

Hi @guillaumekln. OpenNMT-tf is really good. Training, inference, and serving all work without problems when the model is produced with CPU-only TensorFlow, and the results are good.
But when I train my model with tensorflow-gpu and serve the exported model with tensorflow_serving on TensorFlow without GPU support, I get these errors:

In client:

Traceback (most recent call last):
  File "/home/hanky/opennmtenv/lib/python3.6/site-packages/grpc/beta/_client_adaptations.py", line 95, in result
    return self._future.result(timeout=timeout)
  File "/home/hanky/opennmtenv/lib/python3.6/site-packages/grpc/_channel.py", line 276, in result
    raise self
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
    status = StatusCode.INVALID_ARGUMENT
    details = "Incomplete graph, missing 1 inputs for seq2seq/decoder_1/decoder_1/GatherTree"
    debug_error_string = "{"created":"@1530697397.381133554","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1083,"grpc_message":"Incomplete graph, missing 1 inputs for seq2seq/decoder_1/decoder_1/GatherTree","grpc_status":3}"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "nmt_client.py", line 99, in <module>
    main()
  File "nmt_client.py", line 88, in main
    result = parse_translation_result(future.result())
  File "/home/hanky/opennmtenv/lib/python3.6/site-packages/grpc/beta/_client_adaptations.py", line 97, in result
    raise _abortion_error(rpc_error_call)
grpc.framework.interfaces.face.face.AbortionError: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="Incomplete graph, missing 1 inputs for seq2seq/decoder_1/decoder_1/GatherTree")

And in tensorflow_serving:

2018-07-04 15:42:12.827124: I external/org_tensorflow/tensorflow/core/grappler/optimizers/dependency_optimizer.cc:439] Deleted 55 out of 1206 nodes.
2018-07-04 15:42:12.830509: E external/org_tensorflow/tensorflow/core/grappler/optimizers/dependency_optimizer.cc:586] Iteration = 1, topological sort failed with message: Non-existent input for node seq2seq/decoder_1/decoder_1/GatherTree
2018-07-04 15:42:12.831712: I external/org_tensorflow/tensorflow/core/grappler/optimizers/dependency_optimizer.cc:439] Deleted 0 out of 1151 nodes.
2018-07-04 15:42:12.833334: I external/org_tensorflow/tensorflow/core/grappler/optimizers/meta_optimizer.cc:218] dependency_optimizer: Graph size after: 1151 nodes (-55), 1657 edges (-46)
2018-07-04 15:42:12.835938: I external/org_tensorflow/tensorflow/core/grappler/optimizers/meta_optimizer.cc:218] layout: Graph size after: 1151 nodes (0), 1657 edges (0)
2018-07-04 15:42:12.843786: I external/org_tensorflow/tensorflow/core/grappler/optimizers/memory_optimizer.cc:951] Failed to infer memory usage: Node 'seq2seq/decoder_1/decoder_1/GatherTree': Unknown input node ''



Not found: Op type not registered 'convert_gradient_to_tensor_cc661786' in binary running on hanky-desktop. Make sure the Op and Kernel are registered in the binary running in this process.

I really appreciate your help,
hanky


(Guillaume Klein) #2

Hello,

Most likely your TensorFlow Serving version is older than your TensorFlow version, but it should be greater than or equal to it.


(hanky) #3

Hi,
My tensorflow-model-server is 1.8.0 and the client TensorFlow is 1.4.0. The training machine with the GPU runs tensorflow-gpu==1.4.0.
Is there anything else I should try?


(Guillaume Klein) #4

Can you try downgrading tensorflow-model-server to 1.4.0?


(hanky) #5

Hi guillaumekln,
It works.
It is so kind and helpful of you.
Thank you very much.
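
For anyone hitting the same error: the fix in this thread was to run the same version (1.4.0) for both training and tensorflow-model-server. Below is a minimal sketch of a check you could run before deploying an exported model. The helper names (major_minor, versions_match) are hypothetical, and treating "same major.minor" as the test is an assumption based only on what worked here, not an official compatibility rule.

```python
def major_minor(version):
    """Return (major, minor) as integers from a version string like '1.4.0'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def versions_match(training_version, serving_version):
    """True when both versions share the same major.minor release.

    Assumption: matching major.minor is used as the test because that is
    what resolved the error in this thread.
    """
    return major_minor(training_version) == major_minor(serving_version)


# Versions reported in this thread. On the training machine the first value
# can be read from tf.__version__; the serving version has to be looked up
# from the installed tensorflow-model-server package.
print(versions_match("1.4.0", "1.8.0"))  # failing setup
print(versions_match("1.4.0", "1.4.0"))  # setup after the downgrade
```

The first call describes the failing setup from post #3 and the second the setup after the downgrade suggested in post #4.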