I am getting the following error while training a Transformer model by following the steps provided at https://github.com/OpenNMT/OpenNMT-tf
The error is raised right after a checkpoint is created, and training then stops.
The complete traceback is as follows:
Traceback (most recent call last):
  File "/home/transformer_model/tfenv/bin/onmt-main", line 11, in <module>
    sys.exit(main())
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/opennmt/bin/main.py", line 161, in main
    runner.train_and_evaluate(checkpoint_path=args.checkpoint_path)
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/opennmt/runner.py", line 227, in train_and_evaluate
    tf.estimator.train_and_evaluate(self._estimator, train_spec, eval_spec)
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/estimator/training.py", line 471, in train_and_evaluate
    return executor.run()
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/estimator/training.py", line 610, in run
    return self.run_local()
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/estimator/training.py", line 711, in run_local
    saving_listeners=saving_listeners)
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 354, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1207, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1241, in _train_model_default
    saving_listeners)
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1471, in _train_with_estimator_spec
    _, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 671, in run
    run_metadata=run_metadata)
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1156, in run
    run_metadata=run_metadata)
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1255, in run
    raise six.reraise(*original_exc_info)
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/six.py", line 693, in reraise
    raise value
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1240, in run
    return self._sess.run(*args, **kwargs)
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1320, in run
    run_metadata=run_metadata))
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/training/basic_session_run_hooks.py", line 582, in after_run
    if self._save(run_context.session, global_step):
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/training/basic_session_run_hooks.py", line 607, in _save
    if l.after_save(session, step):
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/estimator/training.py", line 517, in after_save
    self._evaluate(global_step_value)  # updates self.eval_result
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/estimator/training.py", line 537, in _evaluate
    self._evaluator.evaluate_and_export())
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/estimator/training.py", line 924, in evaluate_and_export
    is_the_final_export)
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/estimator/training.py", line 957, in _export_eval_result
    is_the_final_export=is_the_final_export))
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/estimator/exporter.py", line 298, in export
    full_event_file_pattern)
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/estimator/exporter.py", line 365, in _get_best_eval_result
    best_eval_result, event_eval_result):
  File "/home/transformer_model/tfenv/lib/python3.6/site-packages/tensorflow/python/estimator/exporter.py", line 150, in _loss_smaller
    'best_eval_result cannot be empty or no loss is found in it.')
ValueError: best_eval_result cannot be empty or no loss is found in it.
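For reference, the last frame points at a comparison between a stored "best" evaluation result and the current one. The snippet below is only a minimal standalone sketch of that kind of check, inferred from the error message; it is not the TensorFlow source. The function name `loss_smaller` and the `"loss"` key are assumptions made for illustration.

```python
# Standalone sketch of the check suggested by the last traceback frame.
# NOT the TensorFlow implementation; LOSS_KEY = "loss" is an assumption.
LOSS_KEY = "loss"


def loss_smaller(best_eval_result, current_eval_result):
    """Return True if the current result has a smaller loss than the best one."""
    # An empty dict (or None), or a dict without a loss entry, is rejected.
    if not best_eval_result or LOSS_KEY not in best_eval_result:
        raise ValueError(
            "best_eval_result cannot be empty or no loss is found in it.")
    if not current_eval_result or LOSS_KEY not in current_eval_result:
        raise ValueError(
            "current_eval_result cannot be empty or no loss is found in it.")
    return best_eval_result[LOSS_KEY] > current_eval_result[LOSS_KEY]
```

Under these assumptions, calling `loss_smaller({}, {"loss": 1.0})` raises the same message as above, which would be consistent with the stored best result being empty or lacking a loss entry when the export step runs.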
Any insight into why this error occurs would be very helpful.