Issue with TensorFlow V2.0 & EnDe_client.py

tel34 · January 14, 2020, 8:15pm

I had adapted EnDe_client.py (renamed as tf2_client.py) and wrapped it with a Flask server which serves my other language pairs on www.nmtgateway.com. This provided a practical way of serving predictions from TF V2. This was working OK, but after installing the latest version of TensorFlow I am getting a Fatal Python error:

Source: My friend lives in Istanbul (from command line version).
Fatal Python error: Segmentation fault (no prediction delivered)

Thread 0x00007f2d83cb6700 (most recent call first):
File "/home/miguel/tf2_env/lib/python3.5/site-packages/tensorflow_core/python/eager/execute.py", line 61 in quick_execute
File "/home/miguel/tf2_env/lib/python3.5/site-packages/tensorflow_core/python/eager/function.py", line 545 in call
File "/home/miguel/tf2_env/lib/python3.5/site-packages/tensorflow_core/python/eager/function.py", line 1692 in _call_flat
File "/home/miguel/tf2_env/lib/python3.5/site-packages/tensorflow_core/python/saved_model/load.py", line 99 in _call_flat
File "/home/miguel/tf2_env/lib/python3.5/site-packages/tensorflow_core/python/eager/function.py", line 1591 in _call_impl
File "/home/miguel/tf2_env/lib/python3.5/site-packages/tensorflow_core/python/eager/function.py", line 1551 in __call__
File "./tf2_client.py", line 21 in translate
File "./tf2_client.py", line 61 in main
File "./tf2_client.py", line 67 in <module>
Segmentation fault
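
For anyone trying to capture a similar trace: the "Fatal Python error" header followed by a per-thread listing looks like the output of Python's standard faulthandler module. Below is a minimal sketch of switching it on explicitly and sending the dump to a log file. The helper name enable_crash_traces and the log path are hypothetical, not part of tf2_client.py.

```python
import faulthandler
import sys


def enable_crash_traces(log_path=None):
    """Dump Python tracebacks for all threads on fatal signals such as SIGSEGV.

    Hypothetical helper: writes to stderr by default, or to log_path if given.
    Returns the open log file (keep a reference to it) or None.
    """
    if log_path is None:
        faulthandler.enable(file=sys.stderr, all_threads=True)
        return None
    # faulthandler needs a real file descriptor that stays open.
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


if __name__ == "__main__":
    enable_crash_traces()  # call once, before loading the model
```

The same effect is available without code changes by running the script with `python -X faulthandler` or by setting the PYTHONFAULTHANDLER environment variable.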

The code around line 61 of execute.py (in quick_execute) mentioned above is:
device_name = ctx.device_name
# pylint: disable=protected-access
try:
  ctx.ensure_initialized()
  tensors = pywrap_tensorflow.TFE_Py_Execute(ctx._handle, device_name,
                                             op_name, inputs, attrs,
                                             num_outputs)
except core._NotOkStatusException as e:
  if name is not None:
    message = e.message + " name: " + name

The relevant code snippet in tf2_client.py (EnDe_client.py) is:
def translate(self, texts):
  """Translates a batch of texts."""
  inputs = self._preprocess(texts)
  outputs = self._translate_fn(**inputs)  # <-----------
  return self._postprocess(outputs)

I am puzzled, as this was all working fine until I installed the latest version of TensorFlow after noticing that it was not always detecting my GPU.
The GPU is detected by TensorFlow 1.4 (OpenNMT-tf V 1.x), which I have running in a separate environment, and nvidia-smi shows no issues.

guillaumekln · January 14, 2020, 8:48pm

The latest TensorFlow version (2.1) requires CUDA 10.1. Is everything set up correctly?

If yes, you might need to post a code snippet to reproduce the issue.
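
A quick way to check the setup from Python is to ask TensorFlow which GPUs it can see. This is only a sketch: the function name visible_gpus is hypothetical, and it returns None when TensorFlow is not importable in the current environment.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # TF 2.1+ exposes tf.config.list_physical_devices; TF 2.0 only has the
    # tf.config.experimental variant, so fall back to it when needed.
    list_fn = getattr(tf.config, "list_physical_devices", None)
    if list_fn is None:
        list_fn = tf.config.experimental.list_physical_devices
    return [device.name for device in list_fn("GPU")]


if __name__ == "__main__":
    print("GPUs visible to TensorFlow:", visible_gpus())
```

An empty list while nvidia-smi shows the card usually points at the CUDA libraries not being found rather than at the driver.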

tel34 · January 14, 2020, 10:00pm

Ah, CUDA 10.1 is in /usr/local but NOT in $PATH. Will add it in the morning and see what happens. Thanks for the quick response.
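
A rough sketch of checking this from Python, using only the standard library. The install prefix /usr/local/cuda-10.1 is an assumption (a common default location), and the function names are hypothetical. On Linux, shared libraries are normally found through LD_LIBRARY_PATH or the ldconfig cache rather than PATH, so the sketch looks at both variables and also tries to load the CUDA 10.1 runtime library directly.

```python
import ctypes
import os


def dirs_on_path(var, env=None):
    """Return the normalized directories listed in an environment variable."""
    env = os.environ if env is None else env
    return [os.path.normpath(d) for d in env.get(var, "").split(os.pathsep) if d]


def cuda_on_path(cuda_home="/usr/local/cuda-10.1", env=None):
    """Report whether CUDA's bin and lib64 directories are in PATH / LD_LIBRARY_PATH.

    cuda_home is an assumed install prefix; adjust it to the actual location.
    """
    bin_dir = os.path.normpath(os.path.join(cuda_home, "bin"))
    lib_dir = os.path.normpath(os.path.join(cuda_home, "lib64"))
    return {
        "PATH": bin_dir in dirs_on_path("PATH", env),
        "LD_LIBRARY_PATH": lib_dir in dirs_on_path("LD_LIBRARY_PATH", env),
    }


def can_load(lib_name="libcudart.so.10.1"):
    """Return True if the dynamic loader can find and open the given library."""
    try:
        ctypes.CDLL(lib_name)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    print(cuda_on_path())
    print("libcudart.so.10.1 loadable:", can_load())
```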

tel34 · January 15, 2020, 2:42pm

Hi Guillaume, putting CUDA 10.1 in the path got me on the right path! However, I removed TensorFlow 2.1 and went back to TensorFlow 2.0, as that seems to have fewer issues. The ende_client.py adapted into my tf_client.py module now works OK without a segmentation fault and will go back into the Flask server. Thanks again for your quick response.