Op_kernel.cc:1192 Out of range: Attempted to repeat an empty dataset infinitely

Hi,

I’m trying out the latest TensorFlow version (until now I was running an older one, from a month ago) and I keep hitting the same issue with a dataset larger than toy-ende:

2018-02-04 13:11:24.924583: W tensorflow/core/framework/op_kernel.cc:1192] Out of range: Attempted to repeat an empty dataset infinitely.

When I switch back to the older version (from a month ago) I don’t get this error. Is the dataset I feed perhaps beyond some input limit? It’s a 2 GB file with 1M samples, each with a maximum length of 512.
Running toy-ende works as expected; I only face the issue with the other dataset.

Thank you!

Hello,

We have successfully run trainings on datasets with up to 22M examples, so dataset size should not be an issue.

What values did you assign to:

  • maximum_features_length
  • maximum_labels_length

in your run configuration?

It seems like all your training examples were filtered out. Also, what TensorFlow version are you using?
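One way to test the "everything was filtered out" hypothesis is to count how many sentence pairs would survive the length limits. The following is only a rough sketch, not the library's actual filtering code: it assumes whitespace tokenization and assumes that a pair is dropped when either side is empty or longer than its limit. The function name `count_kept_examples` is made up for illustration.

```python
from typing import Iterable, Optional, Tuple


def count_kept_examples(
    source_lines: Iterable[str],
    target_lines: Iterable[str],
    maximum_features_length: Optional[int] = None,
    maximum_labels_length: Optional[int] = None,
) -> Tuple[int, int]:
    """Return (kept, total) for a parallel corpus under the given length limits.

    Assumption: lengths are counted in whitespace-separated tokens, and a pair
    is dropped if either side is empty or exceeds its maximum length.
    """
    kept = 0
    total = 0
    for source, target in zip(source_lines, target_lines):
        total += 1
        features_length = len(source.split())
        labels_length = len(target.split())
        # Empty examples cannot be trained on, so they are dropped.
        if features_length == 0 or labels_length == 0:
            continue
        # A limit of None means "no limit" on that side.
        if maximum_features_length is not None and features_length > maximum_features_length:
            continue
        if maximum_labels_length is not None and labels_length > maximum_labels_length:
            continue
        kept += 1
    return kept, total


if __name__ == "__main__":
    # Tiny in-memory corpus where every pair violates one of the rules.
    sources = ["a b c", "a", ""]
    targets = ["x", "y z w", "q"]
    kept, total = count_kept_examples(sources, targets, 2, 2)
    print(f"kept {kept} of {total} examples")
```

If the kept count is 0 on the real training files, the dataset handed to the training loop is empty, which would match the "repeat an empty dataset" warning.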

Thanks for the quick reply!

In my dataset the maximum input length is 512 and the maximum output length is 10.
I tried several config params for length:

maximum_features_length = 512
maximum_labels_length = 10

and
maximum_features_length = 60
maximum_labels_length = 10

I get the same result either way: the warning appears with the current version of OpenNMT, and there is no issue with the one-month-old version.
The TF version is 1.4.1 and I didn’t change it.

Can you share the complete configuration if that’s possible?

EDIT: I get this message when the training files are actually empty.
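A quick way to rule out the empty-file case is to check each training file before launching the run. This is a minimal sketch; the helper name `summarize_training_file` is hypothetical, and the paths you pass in would be whatever your own configuration points to.

```python
import os


def summarize_training_file(path):
    """Report whether a file exists, its size in bytes, and its non-blank line count."""
    if not os.path.isfile(path):
        return {"exists": False, "bytes": 0, "non_empty_lines": 0}
    non_empty_lines = 0
    # errors="replace" keeps the check from crashing on stray bytes.
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.strip():
                non_empty_lines += 1
    return {
        "exists": True,
        "bytes": os.path.getsize(path),
        "non_empty_lines": non_empty_lines,
    }
```

A missing file, a size of 0 bytes, or a non-blank line count of 0 for either the features or the labels file would explain the warning. It is also worth confirming that both files report the same number of lines.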