This is my first time training with OpenNMT-tf. I'm using the Transformer configuration, and training runs fine, but as soon as evaluation starts it crashes on a line that contains placeholder text (which appears many times in the training data): ᚘ22ᚆ
Traceback:
2018-03-29 15:32:19.109885: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1052] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11432 MB memory) -> physical GPU (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0, compute capability: 5.2)
INFO:tensorflow:Restoring parameters from /data/exp04/models/model.ckpt-9278
INFO:tensorflow:Running local_init_op.
2018-03-29 15:32:19.911414: I tensorflow/core/kernels/lookup_util.cc:362] Table trying to initialize from file /data/exp04/models/sentpiece/dell_en-ja_spm_50k.vocab is already initialized.
INFO:tensorflow:Done running local_init_op.
Traceback (most recent call last):
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/OpenNMT-tf/bin/main.py", line 135, in <module>
    main()
  File "/root/OpenNMT-tf/bin/main.py", line 116, in main
    runner.train_and_evaluate()
  File "/root/OpenNMT-tf/opennmt/runner.py", line 138, in train_and_evaluate
    tf.estimator.train_and_evaluate(self._estimator, train_spec, eval_spec)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/training.py", line 439, in train_and_evaluate
    executor.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/training.py", line 518, in run
    self.run_local()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/training.py", line 657, in run_local
    eval_result = evaluator.evaluate_and_export()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/training.py", line 847, in evaluate_and_export
    hooks=self._eval_spec.hooks)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 418, in evaluate
    name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 965, in _evaluate_model
    config=self._session_config)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/evaluation.py", line 212, in _evaluate_once
    session.run(eval_ops, feed_dict)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 546, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 1022, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 1113, in run
    raise six.reraise(*original_exc_info)
  File "/usr/local/lib/python3.5/dist-packages/six.py", line 693, in reraise
    raise value
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 1098, in run
    return self._sess.run(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 1178, in run
    run_metadata=run_metadata))
  File "/root/OpenNMT-tf/opennmt/utils/hooks.py", line 134, in after_run
    self._model.print_prediction(prediction, stream=output_file)
  File "/root/OpenNMT-tf/opennmt/models/sequence_to_sequence.py", line 242, in print_prediction
    print_bytes(tf.compat.as_bytes(sentence), stream=stream)
  File "/root/OpenNMT-tf/opennmt/utils/misc.py", line 26, in print_bytes
    text = str_as_bytes.decode(encoding) if encoding != "ascii" else str_as_bytes
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 8: ordinal not in range(128)
Why do we see the UnicodeDecodeError during eval but not during training?
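For reference, here is a minimal sketch of what seems to be happening in the last frame, outside of OpenNMT-tf. It assumes the predicted sentence is UTF-8 encoded bytes and that the encoding picked up for the output stream resolves to the ASCII codec. The `decode_with` helper and the `"Message "` prefix are hypothetical; the prefix is eight ASCII characters so that the placeholder starts at byte 8, matching "position 8" in the error above.

```python
PLACEHOLDER = "ᚘ22ᚆ"  # the placeholder text from the failing line


def decode_with(encoding, text="Message " + PLACEHOLDER):
    """Encode text as UTF-8, then try to decode the bytes with `encoding`.

    Returns the decoded string, or the UnicodeDecodeError if decoding fails.
    (Hypothetical helper for illustration, not part of OpenNMT-tf.)
    """
    data = text.encode("utf-8")
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as err:
        return err


# Decoding with the ASCII codec fails on the first non-ASCII byte.
ascii_result = decode_with("ascii")
# Decoding with UTF-8 round-trips the placeholder without error.
utf8_result = decode_with("utf-8")

print(repr(ascii_result))
print(utf8_result)
```

If this is the right picture, the error is not about the placeholder being unknown to the model. It would be raised only when a prediction is written out as text, which the traceback shows happening in the evaluation hook (`hooks.py`, `after_run`).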